Re: dismax query parser crash on double dash

2008-06-03 Thread Bram de Jong
On Mon, Jun 2, 2008 at 11:15 PM, Sean Timm [EMAIL PROTECTED] wrote:
 It seems that the DisMaxRequestHandler tries hard to handle any query that
 the user can throw at it.

That's exactly why I was reporting it... :-)


 - Bram


Re: dismax query parser crash on double dash

2008-06-03 Thread Grant Ingersoll

+1.  Fault tolerance good.  ParseExceptions bad.

Can you open a JIRA issue for it?  If you feel you see the problem, a  
patch would be great, too.


-Grant

On Jun 2, 2008, at 5:15 PM, Sean Timm wrote:

It seems that the DisMaxRequestHandler tries hard to handle any  
query that the user can throw at it.


From http://wiki.apache.org/solr/DisMaxRequestHandler:
Quotes can be used to group phrases, and +/- can be used to denote  
mandatory and optional clauses ... but all other Lucene query parser  
special characters are escaped to simplify the user experience.  The  
handler takes responsibility for building a good query from the  
user's input [...] any query containing an odd number of quote  
characters is evaluated as if there were no quote characters at all.


Would it be outside the scope of the DisMaxRequestHandler to also  
handle improper use of +/-?  There are a couple of other cases where  
a user query could fail to parse.  Basically they all boil down to a  
+ or - operator not being followed by a term.  A few examples of  
queries that fail:


chocolate cookie -
chocolate -+cookie
chocolate --cookie
chocolate - - cookie
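
One possible direction: sanitize the query on the way in so a +/- that
isn't followed by a term never reaches the parser.  A rough sketch of
what I mean (hypothetical helper, not Solr code):

  // Drops any run of +/- that is not directly followed by a term,
  // so the parser never sees a dangling operator.
  public class QuerySanitizer {
      public static String stripDanglingOperators(String q) {
          return q.replaceAll("[+-]+(?=[\\s+-]|$)", " ")
                  .trim().replaceAll("\\s+", " ");
      }

      public static void main(String[] args) {
          String[] bad = {"chocolate cookie -", "chocolate -+cookie",
                          "chocolate --cookie", "chocolate - - cookie"};
          for (String q : bad) {
              System.out.println(q + "  ->  " + stripDanglingOperators(q));
          }
      }
  }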

-Sean

Grant Ingersoll wrote:

See http://wiki.apache.org/solr/DisMaxRequestHandler

Namely, - is the prohibit operator, thus -- really is
meaningless.  You either need to escape them or remove them.


-Grant

On Jun 2, 2008, at 7:14 AM, Bram de Jong wrote:


hello all,


just a small note to say that the dismax query parser crashes on:

q = apple -- pear

I'm running through a stored batch of my users' searches and it went
down on the double dash :)


- Bram

--
http://freesound.iua.upf.edu
http://www.smartelectronix.com
http://www.musicdsp.org


--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ


--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: dismax query parser crash on double dash

2008-06-03 Thread Sean Timm
I can take a stab at this.  I need to see why SOLR-502 isn't working for 
Otis first though.


-Sean

Bram de Jong wrote:

 [...]


Re: dismax query parser crash on double dash

2008-06-03 Thread Bram de Jong
On Tue, Jun 3, 2008 at 1:26 PM, Grant Ingersoll [EMAIL PROTECTED] wrote:
 +1.  Fault tolerance good.  ParseExceptions bad.

 Can you open a JIRA issue for it?  If you feel you see the problem, a patch
 would be great, too.

https://issues.apache.org/jira/browse/SOLR-589

I hope the bug report is detailed enough.
As I have no experience whatsoever with Java, me writing a patch would
be a Bad Idea (TM)


 - Bram


Re: dismax query parser crash on double dash

2008-06-03 Thread Bram de Jong
On Tue, Jun 3, 2008 at 3:51 PM, Sean Timm [EMAIL PROTECTED] wrote:
 I can take a stab at this.  I need to see why SOLR-502 isn't working for
 Otis first though.

I slightly enhanced my script so it would only run the strange
searches my users have done in the past... (i.e. things with more than
just numbers and letters), and I found two more:

" and ""

i.e. one double quote and two double quotes

I'll add it to the ticket.

 - bram


RE: Issuing queries during analysis?

2008-06-03 Thread Dallan Quass
 Grant Ingersoll wrote:
 
 How often does your collection change or get updated?
 
 You could also have a slight alternative, which is to create 
 a real small and simple Lucene index that contains your 
 translations and then do it pre-indexing.  The code for such 
 a searcher is quite simple, albeit it isn't Solr.
 
 Otherwise, you'd have to hack the SolrResourceLoader to 
 recognize your Analyzer as being SolrCoreAware, but, geez, I 
 don't know what the full ramifications of that would be, so 
 caveat emptor.


 Mike Klaas wrote:

 Perhaps you could separate the problem, putting this info in 
 separate index or solr core.

This sounds like the best approach.  I've written a special searcher that
handles standardization requests for multiple places in one HTTP call, and it
was pretty straightforward.  That's what I love about Solr: it's *so* easy
to write plugins for.

Thank-you for your suggestions!

--dallan



Ideas on how to implement sponsored results

2008-06-03 Thread climbingrose
Hi all,

I'm trying to implement sponsored results in Solr search results similar
to that of Google. We index products from various sites and would like to
allow certain sites to promote their products. My approach is to query a
slave instance to get sponsored results for user queries in addition to the
normal search results. This part is easy. However, since the number of
products indexed for each site can be very different (100, 1,000, 10,000 or
60,000 products), we need a way to fairly distribute the sponsored results
among sites.

My initial thought is utilising the field collapsing patch to collapse the
search results on the siteId field. You can imagine that this will create a
series of buckets of results, each bucket representing results from a
site. After that, 2 or 3 buckets will randomly be selected, from which I will
randomly select one or two results. However, since I want these
sponsored results to be relevant to user queries, I'd only want to use
the first 30 results in each bucket.
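
To make the selection step concrete, here is a rough sketch of the
sampling I have in mind (plain Java, names made up; the buckets are
assumed to already hold at most the top 30 collapsed results per site):

  // One random result from each of N randomly chosen site buckets,
  // so small sites get the same exposure as large ones.
  import java.util.*;

  public class SponsoredPicker {
      public static <T> List<T> pick(Map<String, List<T>> bucketsBySite,
                                     int numSites, Random rnd) {
          List<String> sites = new ArrayList<String>(bucketsBySite.keySet());
          Collections.shuffle(sites, rnd);
          List<T> picked = new ArrayList<T>();
          for (String site : sites.subList(0, Math.min(numSites, sites.size()))) {
              List<T> bucket = bucketsBySite.get(site);
              if (!bucket.isEmpty()) {
                  picked.add(bucket.get(rnd.nextInt(bucket.size())));
              }
          }
          return picked;
      }
  }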

Obviously, it's desirable that if the user refreshes the page, new sponsored
results will be displayed. On the other hand, I also want to have the
advantages of the Solr cache.

What would be the best way to implement this functionality? Thanks.

Cheers,
Cuong


Re: Ideas on how to implement sponsored results

2008-06-03 Thread Alexander Ramos Jardim
Cuong,

I have implemented sponsored words for a client. I don't know if my approach
can help you, but I will describe it and let you decide.

I have an index containing product entries, to which I added a field called
sponsored words. What I do is boost this field, so when these words are
matched in the query those products appear first in my results.
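
For illustration, a SolrJ query with that boost might look roughly like
this (the field name "sponsoredWords" and the boost of 5 are just example
values, and this assumes a dismax handler is registered):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class SponsoredWordsQuery {
      public static void main(String[] args) throws Exception {
          CommonsHttpSolrServer server =
              new CommonsHttpSolrServer("http://localhost:8983/solr");
          SolrQuery q = new SolrQuery("laptop");
          q.setQueryType("dismax");  // route to the dismax handler
          // boost the sponsored-words field so its matches rank first;
          // field names and boost factors are example values only
          q.set("qf", "name^1.0 description^0.5 sponsoredWords^5.0");
          QueryResponse rsp = server.query(q);
          System.out.println(rsp.getResults().getNumFound() + " hits");
      }
  }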

2008/6/3 climbingrose [EMAIL PROTECTED]:

 [...]




-- 
Alexander Ramos Jardim


sp.dictionary.threshold parm of spell checker seems unresponsive

2008-06-03 Thread Ronald K. Braun
I'm playing around with the spell checker on 1.3 nightly build and
don't see any effect on changes to the sp.dictionary.threshold in
terms of dictionary size.  A value of 0.0 seems to create a dictionary
of the same size and content as a value of 0.9.  (I'd expect a very
small dictionary in the latter case.)  I think sp.dictionary.threshold
is a float parameter, but maybe I'm misunderstanding?

And just to be sure, I assume I can alter this parameter prior to
issuing the rebuild command to build the dictionary -- I don't need to
reindex the termSourceField between changes?
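
(By the rebuild command I mean a request along these lines, assuming the
example port and the handler registered below:

  http://localhost:8983/solr/select?qt=spellchecker&q=anyTerm&cmd=rebuild
)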

My solrconfig.xml has this definition for the handler:

<requestHandler name="spellchecker"
                class="solr.SpellCheckerRequestHandler" startup="lazy">
  <lst name="defaults">
    <int name="sp.query.suggestionCount">30</int>
    <float name="sp.query.accuracy">0.5</float>
  </lst>
  <str name="sp.dictionary.indexDir">spell</str>
  <str name="termSourceField">dictionary</str>
  <float name="sp.dictionary.threshold">0.9</float>
</requestHandler>

And schema.xml in case that is somehow relevant:

<fieldType name="spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="dictionary" type="spell" indexed="true" stored="false"
       multiValued="true" omitNorms="true" />

Any advice?  I'd definitely like to tighten up the dictionary but it
appears to always include terms regardless of their frequency in the
source content.

Thanks,

Ron


Problems using multicore

2008-06-03 Thread Alexander Ramos Jardim
Hello,

I am getting problems running Solr-1.3-trunk with multicores.

My multicore.xml file is:
<multicore adminPath="/admin/multicore" persistent="true" sharedLib="lib">
  <core name="idxArticle" instanceDir="idxArticle"
        dataDir="idxArticle/data" />
  <core name="idxItem" instanceDir="idxItem" dataDir="idxItem/data"
        default="true" />
</multicore>

I have solr.home pointing to the directory containing it.

All the involved directories exist. There's a conf directory containing the
schema.xml and solrconfig.xml of each core in their respective core
directories.

When I try to run solr I get:
An error occurred during activation of changes, please see the log for
details.  [HTTP:101216]Servlet: SolrServer
failed to preload on startup in Web application: apache-solr-1.3-dev.war.
org.apache.solr.common.SolrException: error creating core at
org.apache.solr.core.SolrCore.getSolrCore(SolrCore.java:306) at
org.apache.solr.servlet.SolrServlet.init(SolrServlet.java:46) at
javax.servlet.GenericServlet.init(GenericServlet.java:241) at
weblogic.servlet.internal.StubSecurityHelper$ServletInitAction.run(StubSecurityHelper.java:282)
at
weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(Unknown Source) at
weblogic.servlet.internal.StubSecurityHelper.createServlet(StubSecurityHelper.java:63)
at
weblogic.servlet.internal.StubLifecycleHelper.createOneInstance(StubLifecycleHelper.java:58)
at
weblogic.servlet.internal.StubLifecycleHelper.<init>(StubLifecycleHelper.java:48)
at
weblogic.servlet.internal.ServletStubImpl.prepareServlet(ServletStubImpl.java:507)
at
weblogic.servlet.internal.WebAppServletContext.preloadServlet(WebAppServletContext.java:1853)
at
weblogic.servlet.internal.WebAppServletContext.loadServletsOnStartup(WebAppServletContext.java:1830)
at
weblogic.servlet.internal.WebAppServletContext.preloadResources(WebAppServletContext.java:1750)
at
weblogic.servlet.internal.WebAppServletContext.start(WebAppServletContext.java:2909)
at
weblogic.servlet.internal.WebAppModule.startContexts(WebAppModule.java:973)
at weblogic.servlet.internal.WebAppModule.start(WebAppModule.java:361) at
weblogic.application.internal.flow.ModuleStateDriver$3.next(ModuleStateDriver.java:204)
at
weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
at
weblogic.application.internal.flow.ModuleStateDriver.start(ModuleStateDriver.java:60)
at
weblogic.application.internal.flow.ScopedModuleDriver.start(ScopedModuleDriver.java:200)
at
weblogic.application.internal.flow.ModuleListenerInvoker.start(ModuleListenerInvoker.java:117)
at
weblogic.application.internal.flow.ModuleStateDriver$3.next(ModuleStateDriver.java:204)
at
weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
at
weblogic.application.internal.flow.ModuleStateDriver.start(ModuleStateDriver.java:60)
at
weblogic.application.internal.flow.StartModulesFlow.activate(StartModulesFlow.java:27)
at
weblogic.application.internal.BaseDeployment$2.next(BaseDeployment.java:635)
at
weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
at
weblogic.application.internal.BaseDeployment.activate(BaseDeployment.java:212)
at
weblogic.application.internal.DeploymentStateChecker.activate(DeploymentStateChecker.java:154)
at
weblogic.deploy.internal.targetserver.AppContainerInvoker.activate(AppContainerInvoker.java:80)
at
weblogic.deploy.internal.targetserver.operations.AbstractOperation.activate(AbstractOperation.java:566)
at
weblogic.deploy.internal.targetserver.operations.ActivateOperation.activateDeployment(ActivateOperation.java:136)
at
weblogic.deploy.internal.targetserver.operations.ActivateOperation.doCommit(ActivateOperation.java:104)
at
weblogic.deploy.internal.targetserver.operations.AbstractOperation.commit(AbstractOperation.java:320)
at
weblogic.deploy.internal.targetserver.DeploymentManager.handleDeploymentCommit(DeploymentManager.java:816)
at
weblogic.deploy.internal.targetserver.DeploymentManager.activateDeploymentList(DeploymentManager.java:1223)
at
weblogic.deploy.internal.targetserver.DeploymentManager.handleCommit(DeploymentManager.java:434)
at
weblogic.deploy.internal.targetserver.DeploymentServiceDispatcher.commit(DeploymentServiceDispatcher.java:161)
at
weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.doCommitCallback(DeploymentReceiverCallbackDeliverer.java:181)
at
weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.access$100(DeploymentReceiverCallbackDeliverer.java:12)
at
weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer$2.run(DeploymentReceiverCallbackDeliverer.java:67)
at
weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:464)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:200) at
weblogic.work.ExecuteThread.run(ExecuteThread.java:172) Caused by:

Re: sp.dictionary.threshold parm of spell checker seems unresponsive

2008-06-03 Thread Otis Gospodnetic
Ron,

It might be better for you to look at the SOLR-572 issue in Solr's JIRA and
use the patch provided there with the Solr trunk.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: Ronald K. Braun [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, June 3, 2008 1:29:01 PM
 Subject: sp.dictionary.threshold parm of spell checker seems unresponsive
 
 [...]



Re: dismax query parser crash on double dash

2008-06-03 Thread Otis Gospodnetic
Bram,
You will slowly discover various characters and tokens that don't work with 
DisMax.  They don't work because they are special - they are a part of the 
query grammar and have special meanings.  Have you tried escaping those 
characters in your application before sending the query to Solr?  Escaping is
done with backslashes.  I bet that's covered in the Lucene FAQ on Lucene's wiki.
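
Something roughly like this on the client side would do it (a sketch; the
escape set below is the Lucene query syntax specials, and you may want to
trim it for DisMax since it already escapes most of them itself):

  // Backslash-escape Lucene query syntax special characters
  // before the string is sent to Solr.
  public static String escapeQueryChars(String s) {
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < s.length(); i++) {
          char c = s.charAt(i);
          if ("\\+-!():^[]\"{}~*?|&".indexOf(c) >= 0) {
              sb.append('\\');
          }
          sb.append(c);
      }
      return sb.toString();
  }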


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
 From: Bram de Jong [EMAIL PROTECTED]
 To: solr-user@lucene.apache.org
 Sent: Tuesday, June 3, 2008 11:15:06 AM
 Subject: Re: dismax query parser crash on double dash
 
 [...]



Solrj + Multicore

2008-06-03 Thread Alexander Ramos Jardim
Is there a way to access a specific core via Solrj?
Sorry, but I couldn't find anything on the wiki or Google.

-- 
Alexander Ramos Jardim


Re: Solrj + Multicore

2008-06-03 Thread Erik Hatcher


On Jun 3, 2008, at 3:52 PM, Alexander Ramos Jardim wrote:

Is there a way to access a specific core via Solrj?


Yes, depending on which SolrServer implementation:

  SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/corename");

-or-

  SolrServer server = new EmbeddedSolrServer(solrCore);
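
A fuller sketch of the HTTP case, assuming the standard example port and a
core named "corename" (placeholder):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class MulticoreQuery {
      public static void main(String[] args) throws Exception {
          // each core is addressed by its own base URL: <solr-root>/<corename>
          SolrServer server =
              new CommonsHttpSolrServer("http://localhost:8983/solr/corename");
          QueryResponse rsp = server.query(new SolrQuery("*:*"));
          System.out.println("hits: " + rsp.getResults().getNumFound());
      }
  }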

Erik




Re: Solrj + Multicore

2008-06-03 Thread Alexander Ramos Jardim
Well,

This way I connect to my server:
new CommonsHttpSolrServer("http://localhost:8983/solr/?core=idxItem")

This way I don't connect:
new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem")

As you can obviously see, I can't use the first way because it produces
wrong requests like
http://localhost:8983/solr/?core=idxItem/update?wt=xml&version=2.2

and I end up getting exceptions like these:

org.apache.solr.common.SolrException: Can_not_find_core_idxItemupdatewtxml

Can_not_find_core_idxItemupdatewtxml

request: http://localhost:8983/solr/?core=idxItem/update?wt=xml&version=2.2
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:308)
at
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:152)
at
org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:220)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:51)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:55)
...

I would like to mention that I am using the solr-1.3 trunk.


2008/6/3 Erik Hatcher [EMAIL PROTECTED]:


 [...]





-- 
Alexander Ramos Jardim


Re: solr slave configuration help

2008-06-03 Thread Gaku Mak

Hi Yonik and others,

We ended up adding 2 additional GB of physical RAM (a total of 4GB now)
and set the Java heap to 3GB, so the OS should have 1GB to play with.  The slave
servers are now a lot more responsive, even during replication and with
autowarming turned on (not too aggressive though).

It also seems that using the serial GC makes the server more responsive
during replication (registering the new searcher), unless I'm missing
something in the GC configuration.
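
For the record, the relevant startup flags look roughly like this for us
(the exact form depends on your servlet container's start script; this is
just an illustrative example for the Jetty example distribution):

  java -Xms3g -Xmx3g -XX:+UseSerialGC -jar start.jar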

Thanks!

-Gaku


Yonik Seeley wrote:
 
 On Sun, Jun 1, 2008 at 5:20 AM, Gaku Mak [EMAIL PROTECTED] wrote:
 [...]
 I also have some test script to query against the slave server; however,
 whenever during snapinstall, OOM would occur and the server is not very
 responsive (even with autowarm disabled).  After a while (like couple
 minutes), the server can respond again.  Is this expected?
 
 Not really expected, no.
 Is the server unresponsive to a single search request (i.e. it takes a
 long time to complete)?
 Are you load testing, or just trying single requests?
 
 I have set the heap size to 1.5GB out of the 2GB physical ram.  Any help
 is
 appreciated.  Thanks!
 
 Try a smaller heap.
 The OS needs memory to cache the Lucene index structures too (Lucene
 does very little caching and depends on the OS to do it for good
 performance).
 
 
 -Yonik
 
 




RE: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-03 Thread Julio Castillo
Hi Noble,
I had forgotten to also list comboId as a uniqueKey in the schema.xml file.
But that didn't make a difference.
It still complained about "Document [null] missing required field: id"
for each row of the outer entity it ran into.

If you look at the debug output of the entity:pets (see below in the original
message), the query looks like this:
SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'

This is where the problem lies: the owner_id in the pets table is
currently a number and thus will not match the modified combo id generated
for the owners' id column.

So, somehow, I need to be able to either remove the 'owners-' prefix before
comparing, or prepend the same prefix to the pets.owner_id value prior to
comparing.
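
For example, something along these lines in the sub-entity query might do
the prefixing on the database side, assuming a MySQL-style CONCAT
(untested):

  <entity name="pets" pk="id"
          query="SELECT id,name,birth_date,type_id FROM pets
                 WHERE CONCAT('owners-', owner_id) = '${owners.id}'">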

Thanks

** julio

-Original Message-
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED]
Sent: Monday, June 02, 2008 9:20 PM
To: solr-user@lucene.apache.org
Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?

hi Julio,
delete my previous response. In your schema, 'id' is the uniqueKey.
Make 'comboId' the uniqueKey, because that is the target field name coming
out of the entity 'owners'.

--Noble

On Tue, Jun 3, 2008 at 9:46 AM, Noble Paul നോബിള്‍ नोब्ळ्
[EMAIL PROTECTED] wrote:
 The field 'id' is repeated for pet; rename it to something else,
 say:
   <entity name="pets" pk="id"
           query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'"
           parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}">
     <field column="id" name="petid"/>
   </entity>

 --Noble

 On Tue, Jun 3, 2008 at 3:28 AM, Julio Castillo [EMAIL PROTECTED]
wrote:
 Shalin,
 I experimented with it, and the null pointer exception has been taken 
 care of. Thank you.

 I have a different problem now. I believe it is a 
 syntax/specification problem.

 When importing data, I got the following exceptions:
 SEVERE: Exception while adding:
 SolrInputDocumnt[{comboId=comboId(1.0)={owners-9},
 userName=userName(1.0)={[David, Schroeder]}}]

 org.apache.solr.common.SolrException: Document [null] missing 
 required
 field: id
at

org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:289)
at
 org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImp
 ortHand
 ler.java:263)
...

 The problem arises the moment I try to include nested entities (e.g.
 pets). The problem does not occur if I don't use the transformer, but
 I have to use the transformer because other unrelated entities also have
 id's.
 My data config file looks as follows.

 <dataConfig>
   <document name="doc-1">
     <entity name="owners" pk="id"
             query="select id,first_name,last_name FROM owners"
             transformer="TemplateTransformer">
       <field column="id" name="comboId" template="owners-${owners.id}"/>
       <field column="first_name" name="userName"/>
       <field column="last_name" name="userName"/>

       <entity name="pets" pk="id"
               query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'"
               parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}">
         <field column="id" name="id"/>
         <field column="name" name="name"/>
         <field column="birth_date" name="birthDate"/>
       </entity>
     </entity>
   </document>
 </dataConfig>

 The debug output of the data import looks as follows:

 
 <lst name="verbose-output">
   <lst name="entity:owners">
     <lst name="document#1">
       <str name="query">select id,first_name,last_name FROM owners</str>
       <str name="time-taken">0:0:0.15</str>
       <str>--- row #1 ---</str>
       <int name="id">1</int>
       <str name="first_name">George</str>
       <str name="last_name">Franklin</str>
       <str>----</str>
       <lst name="transformer:TemplateTransformer">
         <str>----</str>
         <str name="id">owners-1</str>
         <str name="first_name">George</str>
         <str name="last_name">Franklin</str>
         <str>----</str>
         <lst name="entity:pets">
           <str name="query">SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'</str>
           <str name="time-taken">0:0:0.0</str>
         </lst>
       </lst>
     </lst>
   </lst>
   <lst name="document#1">...</lst>
 </lst>

 Thanks again

 ** julio


 -Original Message-
 From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED]
 Sent: Saturday, May 31, 2008 10:26 AM
 To: solr-user@lucene.apache.org
 Subject: Re: How to describe 2 entities in dataConfig for the
DataImporter?

 Hi Julio,

 I've fixed the bug. Can you please replace the existing
 TemplateTransformer.java in the SOLR-469.patch with the attached
 TemplateTransformer.java file? We'll add the changes to our next patch.
 Sorry for all the trouble.

 On Sat, May 31, 2008 at 10:31 PM, Noble Paul നോബിള്‍ नोब्ळ्
 [EMAIL PROTECTED] wrote:
 Julio,
 Looks like it is a bug.
 We can give you a new 

Re: Ideas on how to implement sponsored results

2008-06-03 Thread climbingrose
Hi Alexander,

Thanks for your suggestion. I think my problem is a bit different from
yours. We don't have any sponsored words, but we have to retrieve sponsored
results directly from the index. This is because a site can have 60,000
products, which makes it hard to insert/update keywords. I can live with
that by issuing a separate query to fetch sponsored results. My problem is to
equally distribute sponsored results between sites so that each site will
have an opportunity to show their sponsored results no matter how many
products they have. For example, if site A has 60,000 products and site B has
only 2,000, then sponsored products from site B will have a very small chance
of being displayed.


On Wed, Jun 4, 2008 at 2:56 AM, Alexander Ramos Jardim 
[EMAIL PROTECTED] wrote:

 [...]




-- 
Regards,

Cuong Hoang


Re: Solrj + Multicore

2008-06-03 Thread Ryan McKinley



This way I don't connect:
new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem")



this is how you need to connect... otherwise nothing will work.

Perhaps we should throw an exception if you initialize a URL that
contains '?'
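
Something as simple as this check in the constructor would do (a sketch
only, not committed code):

  // reject base URLs that already carry a query string; they can't
  // be composed with /update, /select, etc. later
  static void validateBaseUrl(String baseURL) {
      if (baseURL.indexOf('?') >= 0) {
          throw new IllegalArgumentException(
              "Invalid base url for solrj: " + baseURL);
      }
  }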


ryan



Re: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
hi julio,
You must create an extra field for 'comboId' because you really need
the 'id' for your sub-entities. Your data-config must look as follows.
The pet also has a field called 'id'; that is not a good idea. Call it
'petid' or something (both in the dataconfig and schema.xml). Please make
sure that the field names are unique.


<entity name="owners" pk="id"
        query="select id,first_name,last_name FROM owners"
        transformer="TemplateTransformer">
    <field column="comboId" template="owners-${owners.id}"/>
    <field column="id" />
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>

    <entity name="pets" pk="id"
            query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'"
            parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}">
        <field column="id" name="id"/>
        <field column="name" name="name"/>
        <field column="birth_date" name="birthDate"/>
    </entity>
</entity>


On Wed, Jun 4, 2008 at 5:50 AM, Julio Castillo [EMAIL PROTECTED] wrote:
 [...]

Re: How to describe 2 entities in dataConfig for the DataImporter?

2008-06-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
The id in pet should be aliased to 'petid'; because id is coming
from both entities, there is a conflict:
<entity name="owners" pk="id"
        query="select id,first_name,last_name FROM owners"
        transformer="TemplateTransformer">
    <field column="comboId" template="owners-${owners.id}"/>
    <field column="id" />
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>

    <entity name="pets"
            query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'"
            parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}">
        <field column="id" name="petid"/>
        <field column="name" name="name"/>
        <field column="birth_date" name="birthDate"/>
    </entity>
</entity>


On Wed, Jun 4, 2008 at 10:37 AM, Noble Paul നോബിള്‍ नोब्ळ्
[EMAIL PROTECTED] wrote:
 [...]