Re: dismax query parser crash on double dash
On Mon, Jun 2, 2008 at 11:15 PM, Sean Timm [EMAIL PROTECTED] wrote:
> It seems that the DisMaxRequestHandler tries hard to handle any query that the user can throw at it.

That's exactly why I was reporting it... :-)

- Bram
Re: dismax query parser crash on double dash
+1. Fault tolerance good. ParseExceptions bad. Can you open a JIRA issue for it? If you feel you see the problem, a patch would be great, too.

-Grant

On Jun 2, 2008, at 5:15 PM, Sean Timm wrote:
> It seems that the DisMaxRequestHandler tries hard to handle any query that the user can throw at it. From http://wiki.apache.org/solr/DisMaxRequestHandler:
>
> "Quotes can be used to group phrases, and +/- can be used to denote mandatory and optional clauses ... but all other Lucene query parser special characters are escaped to simplify the user experience. The handler takes responsibility for building a good query from the user's input [...] any query containing an odd number of quote characters is evaluated as if there were no quote characters at all."
>
> Would it be outside the scope of the DisMaxRequestHandler to also handle improper use of +/-? There are a couple of other cases where a user query could fail to parse. Basically they all boil down to a + or - operator not being followed by a term. A few examples of queries that fail:
>
>   chocolate cookie -
>   chocolate -+cookie
>   chocolate --cookie
>   chocolate - - cookie
>
> -Sean
>
> Grant Ingersoll wrote:
>> See http://wiki.apache.org/solr/DisMaxRequestHandler
>> Namely, - is the prohibited operator, thus -- really is meaningless. You either need to escape them or remove them.
>> -Grant
>>
>> On Jun 2, 2008, at 7:14 AM, Bram de Jong wrote:
>>> hello all,
>>> just a small note to say that the dismax query parser crashes on: q = apple -- pear
>>> I'm running through a stored batch of my users' searches and it went down on the double dash :)
>>> - Bram
>>> --
>>> http://freesound.iua.upf.edu
>>> http://www.smartelectronix.com
>>> http://www.musicdsp.org

--
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
Re: dismax query parser crash on double dash
I can take a stab at this. I need to see why SOLR-502 isn't working for Otis first though.

-Sean

Bram de Jong wrote:
> On Tue, Jun 3, 2008 at 1:26 PM, Grant Ingersoll [EMAIL PROTECTED] wrote:
>> +1. Fault tolerance good. ParseExceptions bad. Can you open a JIRA issue for it? If you feel you see the problem, a patch would be great, too.
>
> https://issues.apache.org/jira/browse/SOLR-589
>
> I hope the bug report is detailed enough. As I have no experience whatsoever with Java, me writing a patch would be a Bad Idea (TM)
>
> - Bram
Re: dismax query parser crash on double dash
On Tue, Jun 3, 2008 at 1:26 PM, Grant Ingersoll [EMAIL PROTECTED] wrote:
> +1. Fault tolerance good. ParseExceptions bad. Can you open a JIRA issue for it? If you feel you see the problem, a patch would be great, too.

https://issues.apache.org/jira/browse/SOLR-589

I hope the bug report is detailed enough. As I have no experience whatsoever with Java, me writing a patch would be a Bad Idea (TM)

- Bram
Re: dismax query parser crash on double dash
On Tue, Jun 3, 2008 at 3:51 PM, Sean Timm [EMAIL PROTECTED] wrote:
> I can take a stab at this. I need to see why SOLR-502 isn't working for Otis first though.

I slightly enhanced my script so it would only do the strange searches my users have done in the past (i.e. things with more than just numbers and letters), and I found two more that fail: " and "" (one double quote and two double quotes). I'll add them to the ticket.

- bram
RE: Issuing queries during analysis?
Grant Ingersoll wrote:
> How often does your collection change or get updated? You could also have a slight alternative, which is to create a real small and simple Lucene index that contains your translations and then do it pre-indexing. The code for such a searcher is quite simple, albeit it isn't Solr. Otherwise, you'd have to hack the SolrResourceLoader to recognize your Analyzer as being SolrCoreAware, but, geez, I don't know what the full ramifications of that would be, so caveat emptor.

Mike Klaas wrote:
> Perhaps you could separate the problem, putting this info in a separate index or Solr core.

This sounds like the best approach. I've written a special searcher that handles standardization requests for multiple places in one HTTP call, and it was pretty straightforward. That's what I love about Solr: it's *so* easy to write plugins for. Thank you for your suggestions!

--dallan
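[Editor's note: the small translation-lookup index Grant describes could look roughly like the sketch below. This is an illustration, not code from the thread: the "source" and "translation" field names are invented, and the calls follow the Lucene 2.x API that was current at the time.]

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;

    public class TranslationLookup {
        private final IndexSearcher searcher;

        public TranslationLookup(String indexDir) throws Exception {
            // Open the small side index holding one document per translation pair.
            searcher = new IndexSearcher(FSDirectory.getDirectory(indexDir));
        }

        // Returns the stored translation for a term, or null if none is indexed.
        public String translate(String term) throws Exception {
            TopDocs hits = searcher.search(new TermQuery(new Term("source", term)), null, 1);
            if (hits.totalHits == 0) {
                return null;
            }
            Document doc = searcher.doc(hits.scoreDocs[0].doc);
            return doc.get("translation");
        }
    }

Used pre-indexing, as Grant suggests, each incoming value would be passed through translate() before being added to the Solr document.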
Ideas on how to implement sponsored results
Hi all,

I'm trying to implement sponsored results in Solr search results, similar to Google's. We index products from various sites and would like to allow certain sites to promote their products. My approach is to query a slave instance to get sponsored results for user queries in addition to the normal search results. This part is easy. However, since the number of products indexed for each site can be very different (100, 1000, 1 or 6 products), we need a way to fairly distribute the sponsored results among sites. My initial thought is utilising the field collapsing patch to collapse the search results on the siteId field. You can imagine that this will create a series of buckets of results, each bucket representing results from one site. After that, 2 or 3 buckets will be randomly selected, from which I will randomly pick one or two results. However, since I want these sponsored results to be relevant to user queries, I'd only want to use the first 30 results in each bucket. Obviously, it's desirable that if the user refreshes the page, new sponsored results are displayed. On the other hand, I also want to keep the advantages of the Solr cache. What would be the best way to implement this functionality? Thanks.

Cheers,
Cuong
Re: Ideas on how to implement sponsored results
Cuong,

I have implemented sponsored words for a client. I don't know if my approach can help you, but I will describe it and let you decide. I have an index containing product entries, in which I created a field called "sponsored words". I boost this field, so when these words are matched by the query, those products appear first in my results.

2008/6/3 climbingrose [EMAIL PROTECTED]:
> Hi all, I'm trying to implement sponsored results in Solr search results similar to that of Google. We index products from various sites and would like to allow certain sites to promote their products. [...]

--
Alexander Ramos Jardim
sp.dictionary.threshold param of spell checker seems unresponsive
I'm playing around with the spell checker on a 1.3 nightly build and don't see any effect from changes to sp.dictionary.threshold in terms of dictionary size. A value of 0.0 seems to create a dictionary of the same size and content as a value of 0.9. (I'd expect a very small dictionary in the latter case.) I think sp.dictionary.threshold is a float parameter, but maybe I'm misunderstanding? And just to be sure, I assume I can alter this parameter prior to issuing the rebuild command to build the dictionary -- I don't need to reindex termSourceField between changes?

My solrconfig.xml has this definition for the handler:

  <requestHandler name="spellchecker" class="solr.SpellCheckerRequestHandler" startup="lazy">
    <lst name="defaults">
      <int name="sp.query.suggestionCount">30</int>
      <float name="sp.query.accuracy">0.5</float>
    </lst>
    <str name="sp.dictionary.indexDir">spell</str>
    <str name="termSourceField">dictionary</str>
    <float name="sp.dictionary.threshold">0.9</float>
  </requestHandler>

And schema.xml in case that is somehow relevant:

  <fieldType name="spell" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StandardFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StandardFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

  <field name="dictionary" type="spell" indexed="true" stored="false" multiValued="true" omitNorms="true"/>

Any advice? I'd definitely like to tighten up the dictionary, but it appears to always include terms regardless of their frequency in the source content.

Thanks,
Ron
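[Editor's note: the rebuild command Ron mentions is issued as a request parameter against the handler. Assuming his handler name of "spellchecker" and the usual example host/port, the request would look something like the following; the q value is incidental, and cmd=rebuild is what triggers the dictionary build.]

    http://localhost:8983/solr/select?q=anyword&qt=spellchecker&cmd=rebuild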
Problems using multicore
Hello,

I am getting problems running Solr 1.3 trunk with multiple cores. My multicore.xml file is:

  <multicore adminPath="/admin/multicore" persistent="true" sharedLib="lib">
    <core name="idxArticle" instanceDir="idxArticle" dataDir="idxArticle/data" />
    <core name="idxItem" instanceDir="idxItem" dataDir="idxItem/data" default="true" />
  </multicore>

I have solr.home pointing to the directory containing it. All the involved directories exist. There's a conf directory containing the schema.xml and solrconfig.xml of each core in their respective core directories. When I try to run Solr I get:

An error occurred during activation of changes, please see the log for details.

[HTTP:101216] Servlet: SolrServer failed to preload on startup in Web application: apache-solr-1.3-dev.war.
org.apache.solr.common.SolrException: error creating core
    at org.apache.solr.core.SolrCore.getSolrCore(SolrCore.java:306)
    at org.apache.solr.servlet.SolrServlet.init(SolrServlet.java:46)
    at javax.servlet.GenericServlet.init(GenericServlet.java:241)
    at weblogic.servlet.internal.StubSecurityHelper$ServletInitAction.run(StubSecurityHelper.java:282)
    at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
    at weblogic.security.service.SecurityManager.runAs(Unknown Source)
    at weblogic.servlet.internal.StubSecurityHelper.createServlet(StubSecurityHelper.java:63)
    at weblogic.servlet.internal.StubLifecycleHelper.createOneInstance(StubLifecycleHelper.java:58)
    at weblogic.servlet.internal.StubLifecycleHelper.<init>(StubLifecycleHelper.java:48)
    at weblogic.servlet.internal.ServletStubImpl.prepareServlet(ServletStubImpl.java:507)
    at weblogic.servlet.internal.WebAppServletContext.preloadServlet(WebAppServletContext.java:1853)
    at weblogic.servlet.internal.WebAppServletContext.loadServletsOnStartup(WebAppServletContext.java:1830)
    at weblogic.servlet.internal.WebAppServletContext.preloadResources(WebAppServletContext.java:1750)
    at weblogic.servlet.internal.WebAppServletContext.start(WebAppServletContext.java:2909)
    at weblogic.servlet.internal.WebAppModule.startContexts(WebAppModule.java:973)
    at weblogic.servlet.internal.WebAppModule.start(WebAppModule.java:361)
    at weblogic.application.internal.flow.ModuleStateDriver$3.next(ModuleStateDriver.java:204)
    at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
    at weblogic.application.internal.flow.ModuleStateDriver.start(ModuleStateDriver.java:60)
    at weblogic.application.internal.flow.ScopedModuleDriver.start(ScopedModuleDriver.java:200)
    at weblogic.application.internal.flow.ModuleListenerInvoker.start(ModuleListenerInvoker.java:117)
    at weblogic.application.internal.flow.ModuleStateDriver$3.next(ModuleStateDriver.java:204)
    at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
    at weblogic.application.internal.flow.ModuleStateDriver.start(ModuleStateDriver.java:60)
    at weblogic.application.internal.flow.StartModulesFlow.activate(StartModulesFlow.java:27)
    at weblogic.application.internal.BaseDeployment$2.next(BaseDeployment.java:635)
    at weblogic.application.utils.StateMachineDriver.nextState(StateMachineDriver.java:26)
    at weblogic.application.internal.BaseDeployment.activate(BaseDeployment.java:212)
    at weblogic.application.internal.DeploymentStateChecker.activate(DeploymentStateChecker.java:154)
    at weblogic.deploy.internal.targetserver.AppContainerInvoker.activate(AppContainerInvoker.java:80)
    at weblogic.deploy.internal.targetserver.operations.AbstractOperation.activate(AbstractOperation.java:566)
    at weblogic.deploy.internal.targetserver.operations.ActivateOperation.activateDeployment(ActivateOperation.java:136)
    at weblogic.deploy.internal.targetserver.operations.ActivateOperation.doCommit(ActivateOperation.java:104)
    at weblogic.deploy.internal.targetserver.operations.AbstractOperation.commit(AbstractOperation.java:320)
    at weblogic.deploy.internal.targetserver.DeploymentManager.handleDeploymentCommit(DeploymentManager.java:816)
    at weblogic.deploy.internal.targetserver.DeploymentManager.activateDeploymentList(DeploymentManager.java:1223)
    at weblogic.deploy.internal.targetserver.DeploymentManager.handleCommit(DeploymentManager.java:434)
    at weblogic.deploy.internal.targetserver.DeploymentServiceDispatcher.commit(DeploymentServiceDispatcher.java:161)
    at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.doCommitCallback(DeploymentReceiverCallbackDeliverer.java:181)
    at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer.access$100(DeploymentReceiverCallbackDeliverer.java:12)
    at weblogic.deploy.service.internal.targetserver.DeploymentReceiverCallbackDeliverer$2.run(DeploymentReceiverCallbackDeliverer.java:67)
    at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:464)
    at weblogic.work.ExecuteThread.execute(ExecuteThread.java:200)
    at weblogic.work.ExecuteThread.run(ExecuteThread.java:172)
Caused by:
Re: sp.dictionary.threshold param of spell checker seems unresponsive
Ron,

It might be better for you to look at the SOLR-572 issue in Solr's JIRA and use the patch provided there with the Solr trunk.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Ronald K. Braun [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Tuesday, June 3, 2008 1:29:01 PM
Subject: sp.dictionary.threshold param of spell checker seems unresponsive

> I'm playing around with the spell checker on a 1.3 nightly build and don't see any effect from changes to sp.dictionary.threshold in terms of dictionary size. A value of 0.0 seems to create a dictionary of the same size and content as a value of 0.9. (I'd expect a very small dictionary in the latter case.) I think sp.dictionary.threshold is a float parameter, but maybe I'm misunderstanding? [...]
Re: dismax query parser crash on double dash
Bram,

You will slowly discover various characters and tokens that don't work with DisMax. They don't work because they are special: they are part of the query grammar and have special meanings. Have you tried escaping those characters in your application before sending the query to Solr? Escaping is done with backslashes. I bet that's in the Lucene FAQ on Lucene's wiki.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Bram de Jong [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Tuesday, June 3, 2008 11:15:06 AM
Subject: Re: dismax query parser crash on double dash

> On Tue, Jun 3, 2008 at 3:51 PM, Sean Timm wrote:
>> I can take a stab at this. I need to see why SOLR-502 isn't working for Otis first though.
>
> I slightly enhanced my script so it would only do the strange searches my users have done in the past (i.e. things with more than just numbers and letters), and I found two more that fail: " and "" (one double quote and two double quotes). I'll add them to the ticket.
>
> - bram
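[Editor's note: the application-side escaping Otis suggests could be as simple as the helper below. This is an illustrative sketch, not code from the thread; Lucene's QueryParser.escape(String) does much the same, and whether to escape every special character or only the +/- operators that trip up dismax is an application decision.]

    public class QueryEscaper {
        // Characters the Lucene query parser treats as special.
        private static final String SPECIAL_CHARS = "\\+-&|!(){}[]^\"~*?:";

        public static String escape(String userQuery) {
            StringBuilder sb = new StringBuilder(userQuery.length());
            for (int i = 0; i < userQuery.length(); i++) {
                char c = userQuery.charAt(i);
                if (SPECIAL_CHARS.indexOf(c) >= 0) {
                    sb.append('\\'); // prefix each special character with a backslash
                }
                sb.append(c);
            }
            return sb.toString();
        }
    }

For example, escape("apple -- pear") yields "apple \-\- pear", which dismax treats as literal text instead of dangling operators.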
Solrj + Multicore
Is there a way to access a specific core via Solrj? Sorry, but I couldn't find anything on the wiki or Google.

--
Alexander Ramos Jardim
Re: Solrj + Multicore
On Jun 3, 2008, at 3:52 PM, Alexander Ramos Jardim wrote:
> Is there a way to access a specific core via Solrj

Yes, depending on which SolrServer implementation:

  SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/corename");

-or-

  SolrServer server = new EmbeddedSolrServer(solrCore);

Erik
Re: Solrj + Multicore
Well,

This way I connect to my server:

  new CommonsHttpSolrServer("http://localhost:8983/solr/?core=idxItem")

This way I don't connect:

  new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem")

As you can obviously see, I can't use the first way because it produces wrong requests like http://localhost:8983/solr/?core=idxItem/update?wt=xml&version=2.2 and I end up getting exceptions like these:

org.apache.solr.common.SolrException: Can_not_find_core_idxItemupdatewtxml
Can_not_find_core_idxItemupdatewtxml
request: http://localhost:8983/solr/?core=idxItem/update?wt=xml&version=2.2
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:308)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:152)
    at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:220)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:51)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:55)
    ...

I should mention that I am using the Solr 1.3 trunk.

2008/6/3 Erik Hatcher [EMAIL PROTECTED]:
> On Jun 3, 2008, at 3:52 PM, Alexander Ramos Jardim wrote:
>> Is there a way to access a specific core via Solrj
>
> Yes, depending on which SolrServer implementation:
>
>   SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/corename");
> -or-
>   SolrServer server = new EmbeddedSolrServer(solrCore);
>
> Erik

--
Alexander Ramos Jardim
Re: solr slave configuration help
Hi Yonik and others,

We ended up adding 2 additional GB of physical RAM (total of 4GB now) and set the Java heap to 3GB, so the OS should have 1GB to play with. The slave servers are now a lot more responsive, even during replication and with autowarming turned on (not too aggressive though). It also seems like using the serial GC makes the server more responsive during replication (registering the new searcher), unless I'm missing something in the GC configuration. Thanks!

-Gaku

Yonik Seeley wrote:
> On Sun, Jun 1, 2008 at 5:20 AM, Gaku Mak [EMAIL PROTECTED] wrote:
>> [...] I also have some test script to query against the slave server; however, whenever during snapinstall, OOM would occur and the server is not very responsive (even with autowarm disabled). After a while (like a couple of minutes), the server can respond again. Is this expected?
>
> Not really expected, no. Is the server unresponsive to a single search request (i.e. it takes a long time to complete)? Are you load testing, or just trying single requests?
>
>> I have set the heap size to 1.5GB out of the 2GB physical ram. Any help is appreciated. Thanks!
>
> Try a smaller heap. The OS needs memory to cache the Lucene index structures too (Lucene does very little caching and depends on the OS to do it for good performance).
>
> -Yonik

--
View this message in context: http://www.nabble.com/solr-slave-configuration-help-tp17583642p17636257.html
Sent from the Solr - User mailing list archive at Nabble.com.
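[Editor's note: the setup Gaku describes maps to JVM options along these lines. A sketch only: the exact launch command depends on the servlet container, and the Jetty start.jar shown is just the common example setup.]

    java -Xms3g -Xmx3g -XX:+UseSerialGC -jar start.jar

-XX:+UseSerialGC selects the serial collector he mentions; capping the heap at 3GB of the 4GB machine leaves roughly 1GB for the OS to cache the Lucene index files, as Yonik advises.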
RE: How to describe 2 entities in dataConfig for the DataImporter?
Hi Noble,

I had forgotten to also list comboId as a uniqueKey in the schema.xml file. But that didn't make a difference. It still complained about "Document [null] missing required field: id" for each row of the outer entity it ran into. If you look at the debug output of entity:pets (see below in the original message), the query looks like this:

  SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'

This is where the problem lies: the owner_id in the pets table is currently a number and thus will not match the modified combo id generated for the owners' id column. So, somehow, I need to be able to either remove the 'owners-' prefix before comparing, or append the same prefix to the pets.owner_id value prior to comparing.

Thanks

** julio

-----Original Message-----
From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]]
Sent: Monday, June 02, 2008 9:20 PM
To: solr-user@lucene.apache.org
Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?

hi Julio,
delete my previous response. In your schema, 'id' is the uniqueKey. Make 'comboId' the unique key, because that is the target field name coming out of the entity 'owners'.
--Noble

On Tue, Jun 3, 2008 at 9:46 AM, Noble Paul നോബിള് नोब्ळ् [EMAIL PROTECTED] wrote:
> The field 'id' is repeated for pets also; rename it to something else, say:
>
>   <entity name="pets" pk="id" query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'" parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}">
>     <field column="id" name="petid"/>
>   </entity>
>
> --Noble
>
> On Tue, Jun 3, 2008 at 3:28 AM, Julio Castillo [EMAIL PROTECTED] wrote:
>> Shalin,
>> I experimented with it, and the null pointer exception has been taken care of. Thank you. I have a different problem now. I believe it is a syntax/specification problem. When importing data, I got the following exceptions:
>>
>> SEVERE: Exception while adding: SolrInputDocument[{comboId=comboId(1.0)={owners-9}, userName=userName(1.0)={[David, Schroeder]}}]
>> org.apache.solr.common.SolrException: Document [null] missing required field: id
>>     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:289)
>>     at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:263)
>>     ...
>>
>> The problem arises the moment I try to include nested entities (e.g. pets) -- the problem does not occur if I don't use the transformer, but I have to use the transformer because other unrelated entities also have ids. My data config file looks as follows:
>>
>> <dataConfig>
>>   <document name="doc-1">
>>     <entity name="owners" pk="id" query="select id,first_name,last_name FROM owners" transformer="TemplateTransformer">
>>       <field column="id" name="comboId" template="owners-${owners.id}"/>
>>       <field column="first_name" name="userName"/>
>>       <field column="last_name" name="userName"/>
>>       <entity name="pets" pk="id" query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'" parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}">
>>         <field column="id" name="id"/>
>>         <field column="name" name="name"/>
>>         <field column="birth_date" name="birthDate"/>
>>       </entity>
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>> The debug output of the data import looks as follows:
>>
>> <lst name="verbose-output">
>>   <lst name="entity:owners">
>>     <lst name="document#1">
>>       <str name="query">select id,first_name,last_name FROM owners</str>
>>       <str name="time-taken">0:0:0.15</str>
>>       <str>--- row #1 ---</str>
>>       <int name="id">1</int>
>>       <str name="first_name">George</str>
>>       <str name="last_name">Franklin</str>
>>       <lst name="transformer:TemplateTransformer">
>>         <str name="id">owners-1</str>
>>         <str name="first_name">George</str>
>>         <str name="last_name">Franklin</str>
>>       </lst>
>>       <lst name="entity:pets">
>>         <str name="query">SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'</str>
>>         <str name="time-taken">0:0:0.0</str>
>>       </lst>
>>     </lst>
>>   </lst>
>> </lst>
>>
>> Thanks again
>> ** julio
>>
>> -----Original Message-----
>> From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED]]
>> Sent: Saturday, May 31, 2008 10:26 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: How to describe 2 entities in dataConfig for the DataImporter?
>>
>> Hi Julio,
>> I've fixed the bug, can you please replace the existing TemplateTransformer.java in the SOLR-469.patch and use the attached TemplateTransformer.java file. We'll add the changes to our next patch. Sorry for all the trouble.
>>
>> On Sat, May 31, 2008 at 10:31 PM, Noble Paul നോബിള് नोब्ळ् [EMAIL PROTECTED] wrote:
>>> julio,
>>> Looks like it is a bug. We can give u a new
Re: Ideas on how to implement sponsored results
Hi Alexander,

Thanks for your suggestion. I think my problem is a bit different from yours. We don't have any sponsored words; we have to retrieve sponsored results directly from the index, because a site can have 60,000 products, which makes it hard to insert/update keywords. I can live with that by issuing a separate query to fetch sponsored results. My problem is to equally distribute sponsored results between sites, so that each site has an opportunity to show its sponsored results no matter how many products it has. For example, if site A has 60,000 products and site B has only 2000, then sponsored products from site B will have a very small chance of being displayed.

On Wed, Jun 4, 2008 at 2:56 AM, Alexander Ramos Jardim [EMAIL PROTECTED] wrote:
> Cuong, I have implemented sponsored words for a client. I don't know if my approach can help you, but I will describe it and let you decide. I have an index containing product entries, in which I created a field called "sponsored words". I boost this field, so when these words are matched by the query, those products appear first in my results. [...]

--
Regards,
Cuong Hoang
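[Editor's note: the bucket-based selection Cuong describes could be implemented along these lines once the per-site buckets are in hand. A sketch under stated assumptions: the bucket map, numSites, and perSite names are invented for illustration, and each bucket is assumed to already hold only that site's top (~30) relevant results.]

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    public class SponsoredPicker {
        private final Random random = new Random();

        // buckets: siteId -> that site's most relevant sponsored results.
        // Sites are chosen uniformly at random, so every site gets the same
        // chance regardless of how many products it has indexed.
        public List<String> pick(Map<String, List<String>> buckets, int numSites, int perSite) {
            List<String> siteIds = new ArrayList<String>(buckets.keySet());
            Collections.shuffle(siteIds, random); // uniform over sites, not over products
            List<String> picked = new ArrayList<String>();
            for (String siteId : siteIds.subList(0, Math.min(numSites, siteIds.size()))) {
                List<String> bucket = new ArrayList<String>(buckets.get(siteId));
                for (int i = 0; i < perSite && !bucket.isEmpty(); i++) {
                    picked.add(bucket.remove(random.nextInt(bucket.size())));
                }
            }
            return picked;
        }
    }

Randomizing outside Solr this way also keeps the underlying Solr query stable across page refreshes, so the query cache still helps even though users see different sponsored results each time.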
Re: Solrj + Multicore
> This way I don't connect:
> new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem")

This is how you need to connect... otherwise nothing will work. Perhaps we should throw an exception if you initialize a URL that contains '?'.

ryan
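[Editor's note: putting Erik's and Ryan's answers together, end-to-end use of a named core from Solrj looks something like the sketch below. The core name idxItem comes from Alexander's setup; the 'id' field is an assumption about his schema.]

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class MulticoreIndexer {
        public static void main(String[] args) throws Exception {
            // The core name is a path segment, never a ?core= query parameter.
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/idxItem");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "item-1"); // assumes 'id' is the core's uniqueKey
            server.add(doc);
            server.commit();
        }
    }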
Re: How to describe 2 entities in dataConfig for the DataImporter?
hi julio,

You must create an extra field for 'comboId' because you really need the 'id' for your sub-entities. Your data-config must look as follows. The pets entity also has a field called 'id'; that is not a good idea. Call it 'petid' or something (both in the dataconfig and schema.xml). Please make sure that the field names are unique.

  <entity name="owners" pk="id" query="select id,first_name,last_name FROM owners" transformer="TemplateTransformer">
    <field column="comboId" template="owners-${owners.id}"/>
    <field column="id" />
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>
    <entity name="pets" pk="id" query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'" parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="birth_date" name="birthDate"/>
    </entity>
  </entity>

On Wed, Jun 4, 2008 at 5:50 AM, Julio Castillo [EMAIL PROTECTED] wrote:
> Hi Noble,
> I had forgotten to also list comboId as a uniqueKey in the schema.xml file. But that didn't make a difference. It still complained about "Document [null] missing required field: id" for each row of the outer entity it ran into. If you look at the debug output of entity:pets, the query looks like this:
>
>   SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='owners-1'
>
> This is where the problem lies: the owner_id in the pets table is currently a number and thus will not match the modified combo id generated for the owners' id column. So, somehow, I need to be able to either remove the 'owners-' prefix before comparing, or append the same prefix to the pets.owner_id value prior to comparing.
>
> Thanks
> ** julio
> [...]
Re: How to describe 2 entities in dataConfig for the DataImporter?
The id in pets should be aliased to 'petid'; because id comes from both entities, there is a conflict.

  <entity name="owners" pk="id" query="select id,first_name,last_name FROM owners" transformer="TemplateTransformer">
    <field column="comboId" template="owners-${owners.id}"/>
    <field column="id" />
    <field column="first_name" name="userName"/>
    <field column="last_name" name="userName"/>
    <entity name="pets" query="SELECT id,name,birth_date,type_id FROM pets WHERE owner_id='${owners.id}'" parentDeltaQuery="SELECT id FROM owners WHERE id=${pets.owner_id}">
      <field column="id" name="petid"/>
      <field column="name" name="name"/>
      <field column="birth_date" name="birthDate"/>
    </entity>
  </entity>

On Wed, Jun 4, 2008 at 10:37 AM, Noble Paul നോബിള് नोब्ळ् [EMAIL PROTECTED] wrote:
> hi julio,
> You must create an extra field for 'comboId' because you really need the 'id' for your sub-entities. Your data-config must look as follows. The pets entity also has a field called 'id'; that is not a good idea. Call it 'petid' or something (both in the dataconfig and schema.xml). Please make sure that the field names are unique.
> [...]
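[Editor's note: for completeness, the schema.xml side of Noble's suggestion would need fields matching the dataconfig's target names, roughly as sketched below. The field names (comboId, petid, userName, name, birthDate) come from the thread; the types and attributes are assumptions about a typical setup, with comboId as the uniqueKey per Noble's earlier advice.]

    <field name="comboId" type="string" indexed="true" stored="true"/>
    <field name="petid" type="string" indexed="true" stored="true"/>
    <field name="userName" type="text" indexed="true" stored="true" multiValued="true"/>
    <field name="name" type="text" indexed="true" stored="true"/>
    <field name="birthDate" type="date" indexed="true" stored="true"/>

    <uniqueKey>comboId</uniqueKey>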