Re: Storing queries in Solr
Hi Jorge,
As far as I know, there is no built-in component for this in Solr (maybe in the latest 4.1, which I haven't explored in depth yet). However, I have done it myself in the past using two different approaches. The first is similar to Upayavira's suggestion and uses an independent index where queries and clicks were stored, in order to suggest popular queries and/or documents. My second implementation used a dedicated field in the original documents' index to store the terms of queries that led to a click on each particular document (i.e. re-indexing the document with a new field), and used this field for boosted terms and/or document suggestion. However, this latter solution is unlikely to scale well, especially if your document index is very dynamic (my particular case relied on an almost static document repository).
Finally, remember that exploiting queries and clicks may raise privacy issues. Since you're storing their queries, warn your users appropriately.
br, gdupont
On 8 October 2012 02:24, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu wrote:
> Hi! I was wondering if there is any built-in mechanism that allows me to store the queries made to a Solr server inside the index itself. I know that the suggester module exists, but as far as I know it only works for terms existing in the index, and not with queries. I remember reading about using some external program to parse the Solr log and push the queries or any other interesting data into the index; is this the only way of accomplishing this? Greetings!
-- Gérard Dupont Information Processing Control and Cognition (IPCC) CASSIDIAN - an EADS company Document Learning team - LITIS Laboratory
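As a rough illustration of the two approaches described in this reply, the sketch below keeps the query log as plain Python records and derives popular-query suggestions and per-document click terms from it. The record fields (`query`, `clicked_doc`) and all helper names are hypothetical; in a real setup each record would be a document in the side index rather than an in-memory list. User identifiers are deliberately left out, in line with the privacy remark.

```python
from collections import Counter

def normalize(query):
    """Lower-case and collapse whitespace so equivalent queries count together."""
    return " ".join(query.lower().split())

def popular_queries(log, prefix="", limit=5):
    """Return the most frequent logged queries that start with `prefix`."""
    counts = Counter(normalize(entry["query"]) for entry in log)
    wanted = normalize(prefix)
    ranked = [(q, n) for q, n in counts.most_common() if q.startswith(wanted)]
    return ranked[:limit]

def click_terms(log, doc_id):
    """Collect the query terms that led to a click on `doc_id` (second approach)."""
    terms = set()
    for entry in log:
        if entry.get("clicked_doc") == doc_id:
            terms.update(normalize(entry["query"]).split())
    return sorted(terms)

# Hypothetical log records: only the query text and the clicked document are kept.
LOG = [
    {"query": "Solr suggester", "clicked_doc": "doc-1"},
    {"query": "solr  suggester", "clicked_doc": None},
    {"query": "solr facets", "clicked_doc": "doc-2"},
]
```

The output of `click_terms` is what would be written into the dedicated field when a document is re-indexed, which is also why that variant gets expensive on a frequently changing index.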
Re: Advanced search in solr
Hi Ramo,
The answer is yes. You just need to add a dedicated field, category, in which you store the category of each item, and then issue a request like [text:whatYouWant AND category:smartphone]. This returns all items that contain whatYouWant and belong to the category you picked.
cheers, gdupont
On 1 February 2012 13:48, Ramo Karahasan ramo.karaha...@googlemail.com wrote:
> Hi Igor, I didn't read through the article, but currently I'm not using faceted search. I just want to ask, for example, for all products from the category X named Samsung. I'll read this article this evening. Best regards, Ramo
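A minimal sketch of how such a request could be assembled on the client side. The field names `text` and `category` come from the reply; the base URL is an assumption for a local test instance, and the helper name is made up for the example.

```python
from urllib.parse import urlencode

def category_search_url(base_url, text, category):
    """Build a select URL that matches `text` within a single category."""
    query = "text:{} AND category:{}".format(text, category)
    return base_url + "/select?" + urlencode({"q": query})

# Assumed local Solr URL, for illustration only.
URL = category_search_url("http://localhost:8983/solr", "whatYouWant", "smartphone")
```

Values that contain spaces or query-syntax characters would need quoting or escaping before being placed in the query string; the sketch does not handle that.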
Re: core creation and instanceDir parameter
On 31 August 2011 20:27, Jaeger, Jay - DOT jay.jae...@dot.wi.gov wrote:
> Well, if it is for creating a *new* core, Solr doesn't know it is pointing to your shared conf directory until after you create it, does it? JRJ
Indeed, but the conf directory is not a problem for me. The thing is, I would like to avoid sending the instance path.
-- Gérard Dupont Information Processing Control and Cognition (IPCC) CASSIDIAN - an EADS company Document Learning team - LITIS Laboratory
Re: core creation and instanceDir parameter
Up! Does no one have any clue about this question? Is it more of a dev-related question?
2011/8/26 Gérard Dupont ger.dup...@gmail.com
> Hi all, Playing with multicore and dynamic creation of new cores, I found out that the instanceDir parameter is mandatory: it is needed to locate solrconfig.xml and schema.xml. Since all my cores share the same configuration (found relative to the $SOLR_HOME defined on the server side) and all data is saved in the same folder (one sub-folder per core), I was wondering why we still need to send this parameter. In my configuration, I would like to avoid the client, which asks for core creation, needing to be aware of the instance location on the server. BTW I'm on Solr 3.3.0. Thanks for any advice.
-- Gérard Dupont Information Processing Control and Cognition (IPCC) CASSIDIAN - an EADS company Document Learning team - LITIS Laboratory
Re: Solr and client app on same Jetty?
Hi,
On 26 August 2011 16:23, Arcadius Ahouansou arcad...@menelic.com wrote:
> Hello. I have Solr running on Jetty and I also have a web client application running on another Jetty instance on the same box. The question is: wouldn't it be better to run the client and Solr on the very same Jetty instance?
I don't have a clear performance benchmark on this, but I did not notice much difference during tests with Jetty.
> I came across http://wiki.apache.org/solr/Solrj#EmbeddedSolrServer as well. The only drawback I can think of is, in case we would like to scale and have 1 web app against 2 or 3 Solr instances, a code change will be needed. - Is there any other drawback in doing so?
We used the embedded server for a long time and recently moved to a standalone server, since it should allow more flexibility and independence. Not many code changes were needed.
> - more importantly, any performance or scalability issue?
The standalone server seems more efficient, and eventually you can scale it independently of your client. But it really depends on your needs. For the small applications (1M documents and a few dozen users) we built over the last few years, the embedded server was fine.
> Thanks. Arcadius.
cheers,
-- Gérard Dupont Information Processing Control and Cognition (IPCC) CASSIDIAN - an EADS company Document Learning team - LITIS Laboratory
core creation and instanceDir parameter
Hi all,
Playing with multicore and dynamic creation of new cores, I found out that the instanceDir parameter is mandatory: it is needed to locate solrconfig.xml and schema.xml. Since all my cores share the same configuration (found relative to the $SOLR_HOME defined on the server side) and all data is saved in the same folder (one sub-folder per core), I was wondering why we still need to send this parameter. In my configuration, I would like to avoid the client, which asks for core creation, needing to be aware of the instance location on the server. BTW I'm on Solr 3.3.0.
Thanks for any advice.
-- Gérard Dupont Information Processing Control and Cognition (IPCC) CASSIDIAN - an EADS company Document Learning team - LITIS Laboratory
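For reference, a sketch of the request in question. The CoreAdmin `CREATE` action with its `name` and `instanceDir` parameters is the call being discussed; the base URL, the `cores` root folder and both helper names are assumptions made for the example. One way to keep clients unaware of the server layout, under those assumptions, is a thin server-side wrapper that derives `instanceDir` from the core name by convention.

```python
from urllib.parse import urlencode

def instance_dir_for(name, cores_root="cores"):
    """Hypothetical convention: one sub-folder per core under a shared root."""
    return cores_root + "/" + name

def create_core_url(solr_url, name, instance_dir):
    """Build a CoreAdmin CREATE request URL for a new core."""
    params = {"action": "CREATE", "name": name, "instanceDir": instance_dir}
    return solr_url + "/admin/cores?" + urlencode(params)
```

With such a wrapper the client only supplies the core name, for example `create_core_url(solr_url, "core1", instance_dir_for("core1"))`, and the path convention stays on the server.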
date field
Hi all,
I'm currently having a little difficulty indexing and searching on a date field. The indexing is done the right way (I guess) and I can find valid dates in the field, like 2009-05-01T12:45:32Z. However, when searching, users don't always give an exact date. For instance, they give 2008-05-01 to get all documents related to that day. I can do a trick using a wildcard, but is there another way to do it? Moreover, if they give the full date string (or if I hack the query parser) I can have the full syntax, but then the ':' annoys me because the Lucene parser does not allow it without quotes. Any ideas?
-- Gérard Dupont Information Processing Control and Cognition (IPCC) - EADS DS http://weblab.forge.ow2.org Document Learning team - LITIS Laboratory
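One possible alternative to the wildcard trick is to expand the day the user typed into a range query before it reaches the parser. The sketch below assumes a date field simply named `date` (hypothetical) and whole-second timestamps; it also shows backslash-escaping of ':' for the case where a full timestamp has to be passed as a single term.

```python
from datetime import datetime, timedelta

SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def day_range_query(field, day):
    """Expand YYYY-MM-DD into an inclusive range covering that whole day."""
    start = datetime.strptime(day, "%Y-%m-%d")
    end = start + timedelta(days=1) - timedelta(seconds=1)
    return "{}:[{} TO {}]".format(
        field, start.strftime(SOLR_DATE_FORMAT), end.strftime(SOLR_DATE_FORMAT)
    )

def escape_colons(value):
    """Backslash-escape ':' so a full timestamp can be used as one term."""
    return value.replace(":", "\\:")
```

Because the upper bound is 23:59:59, documents stamped with a fractional second inside the last second of the day would be missed; that is a limitation of this sketch, not of range queries as such.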
Re: date field
Thanks for the answer. However, we don't have a strong performance issue (for now), and in that case, how do you handle queries where the time part is missing?
On Tue, Sep 8, 2009 at 17:44, Silent Surfer silentsurfe...@yahoo.com wrote:
> Hi, If you have not gone live yet, I would suggest using a long instead of a date field. According to our testing, searches based on date fields are very slow compared to searches based on long fields. You can use System.currentTimeMillis() to get the time. When showing it to the user, apply a date formatter. When taking input from the user, let him enter whatever date he wants, then convert it to a long and do your searches based on it. Experts can pitch in with any other ideas. Thanks, sS
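To make the follow-up question concrete: with timestamps stored as epoch milliseconds, a date without a time part can be turned into a numeric range covering the whole day. This is only a sketch of that conversion; the field name `timestamp` is hypothetical and days are taken to be UTC days.

```python
import calendar
from datetime import datetime

MILLIS_PER_DAY = 24 * 60 * 60 * 1000

def day_to_millis_range(day):
    """Return (first, last) epoch milliseconds of a UTC day given as YYYY-MM-DD."""
    start = datetime.strptime(day, "%Y-%m-%d")
    first = calendar.timegm(start.timetuple()) * 1000
    return first, first + MILLIS_PER_DAY - 1

def long_range_query(field, day):
    """Build an inclusive range query over a long field for one whole day."""
    first, last = day_to_millis_range(day)
    return "{}:[{} TO {}]".format(field, first, last)
```

For 2008-05-01 this gives the range 1209600000000 to 1209686399999, so the missing time part is handled on the client before the query is sent.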
Re: A very complex search problem.
Hi,
The big OR query should be the easiest way, and it may work up to ~1000 users (i.e. the default limit is 1024 boolean clauses, so up to N users in the OR, where N = 1024 - (number of boolean clauses in the rest of your query)). You can increase this limit in the configuration, but I guess too many clauses becomes painful. I know that colleagues of mine worked on Lucene with boolean queries of up to ~500 clauses, under tight response-time constraints and with indexes of many GB, and it was working fine. I guess Solr will behave the same way.
On Wed, Sep 2, 2009 at 11:47, rajan chandi chandi.ra...@gmail.com wrote:
> Hi All, We are dealing with a very complex problem of person-specific search. We're building a social network where people will post stuff, and other users should be able to see content only from their contacts. E.g. there are 10,000 users in the system and only 150 users in my network; I should be searching across only those 150 users' content. Is there an easy way to approach this problem? We've come up with different approaches:
> - Storing the relationship in each document.
> - A huge ORed query with all the IDs of the people that need to be searched.
> - Creating a query and filtering the results based on the list of contacts.
> None of these approaches sounds plausible. We have already gone through the recently released book Solr 1.4 Enterprise Search. The book also doesn't seem to have any pointers. Any good approach/pointers will help. Thanks and regards, Rajan Chandi
-- Gérard Dupont Information Processing Control and Cognition (IPCC) - EADS DS http://weblab-project.org Document Learning team - LITIS Laboratory
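A small sketch of the big-OR approach with the clause budget from this reply checked up front. The field name `author_id` and the helper are hypothetical; the limit of 1024 is the default mentioned above (the `maxBooleanClauses` setting in solrconfig.xml).

```python
DEFAULT_MAX_CLAUSES = 1024  # default boolean clause limit mentioned in the reply

def contacts_filter(field, contact_ids, other_clauses=0, limit=DEFAULT_MAX_CLAUSES):
    """Build `field:(id1 OR id2 ...)`, refusing lists that exceed the clause limit."""
    ids = list(contact_ids)
    if not ids:
        raise ValueError("at least one contact id is required")
    if len(ids) + other_clauses > limit:
        raise ValueError(
            "{} clauses exceed the limit of {}".format(len(ids) + other_clauses, limit)
        )
    return "{}:({})".format(field, " OR ".join(str(i) for i in ids))
```

With 150 contacts per user, as in the question, the generated clause stays far below the default limit; `other_clauses` reserves room for the clauses of the user's own query.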
Re: Does the default operator affect phrase searching?
Hi Dan,
A phrase search (i.e. using quotes) in Lucene does an exact match of your expression, so if you type ["david pdf"] (the brackets are only there to delimit the query in my mail) the system searches for documents that contain the term 'david' immediately followed by the term 'pdf' (well, in the classic case; I suppose you don't have a specific query parser). So since your corpus does not contain any document with "david pdf", the results are empty. In any case, the defaultOperator has nothing to do with this. It only comes into play if you do a query like ["david pdf" toto], which will then be interpreted as ["david pdf" OR toto] (given that OR is the default operator). I don't know which legacy system you used, but it may have a completely different query syntax in which quotes are not interpreted in the same way.
HTH gd
On Wed, Sep 2, 2009 at 22:49, Dan A. Dickey dan.dic...@savvis.net wrote:
> I'm having a problem with doing a phrase search of "david pdf". When I search for just david, I get 7 hits. When I search for pdf I get 73 hits. On a legacy system, searching for "david pdf" I get 78 hits. And on Solr (1.4 - one of the nightly builds), when searching for "david pdf" I get 0 hits. I have the defaultOperator for my schema set to AND - could this be causing the problem? When I set {!lucene q.op=OR} in the query, I still get zero hits. Suggestions? Is there any way to debug *why* something didn't hit? Or dump out what is contained in the index for one of the records? Thanks. -Dan
> -- Dan A. Dickey | Senior Software Engineer Savvis 10900 Hampshire Ave. S., Bloomington, MN 55438 Office: 952.852.4803 | Fax: 952.852.4951 E-mail: dan.dic...@savvis.net
-- Gérard Dupont Information Processing Control and Cognition (IPCC) - EADS DS http://weblab-project.org Document Learning team - LITIS Laboratory
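The difference between the phrase and the two-term query, and one way to look into why something did not hit, can be sketched as request parameters. `q.op` is the operator override already used in the question and `debugQuery` is Solr's query debugging switch; the helper names are made up for the example.

```python
from urllib.parse import urlencode

def phrase(text):
    """Quote `text` as a phrase, escaping embedded double quotes."""
    return '"' + text.replace('"', '\\"') + '"'

def select_params(q, default_op=None, debug=False):
    """Encode query parameters, optionally overriding the default operator."""
    params = {"q": q}
    if default_op:
        params["q.op"] = default_op
    if debug:
        params["debugQuery"] = "true"
    return urlencode(params)
```

`select_params("david pdf", "AND")` asks for both terms anywhere in the document, while `select_params(phrase("david pdf"), debug=True)` asks for the adjacent phrase and requests the parsed query in the response, which is usually the quickest check of how the input was interpreted.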
Re: Does the default operator affect phrase searching?
> Yes, it does - thanks! Back to translating legacy search queries into Solr search queries. :) -Dan
Just curious: what legacy system is it?
Re: query in solr lucene
Hi Sushan,
I'm not a Solr expert, just a beginner, but it appears to me that your keywords may be combined with 'OR' by default, which would explain this behavior. Try changing the configuration to an 'AND' combination.
cheers
On Tue, Jul 28, 2009 at 16:49, Sushan Rungta s...@clickindia.com wrote:
> I am extremely sorry for responding late, as I have been ill for the past few days. My problem is explained below with an example. I have three documents with the following content:
> 1. Hello how are you
> 2. Hello how are you sushan
> 3. Hello how are you sushan. I am fine.
> When I search for the query Hello how are you sushan, I should only get document 2 in my result. I hope this gives you all a better insight into my problem. regards, Sushan Rungta
-- Gérard Dupont Information Processing Control and Cognition (IPCC) - EADS DS http://weblab-project.org Document Learning team - LITIS Laboratory
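To see what switching from OR to AND would change for the three example documents, here is a toy model in plain Python. It is not Solr's analysis or scoring, only set logic over lower-cased words, and every name in it is made up for the illustration.

```python
DOCS = {
    1: "Hello how are you",
    2: "Hello how are you sushan",
    3: "Hello how are you sushan. I am fine.",
}

def tokens(text):
    """Very rough analysis: lower-case words with surrounding punctuation stripped."""
    return [word.strip(".,!?").lower() for word in text.split()]

def match(query, op):
    """Return ids of documents containing any (OR) or all (AND) query words."""
    wanted = set(tokens(query))
    hits = []
    for doc_id, text in sorted(DOCS.items()):
        present = set(tokens(text))
        if op == "AND" and wanted <= present:
            hits.append(doc_id)
        elif op == "OR" and wanted & present:
            hits.append(doc_id)
    return hits

def exact(query):
    """Return ids of documents whose whole content equals the query words."""
    return [d for d, text in sorted(DOCS.items()) if tokens(text) == tokens(query)]
```

In this model OR matches all three documents and AND narrows the result to documents 2 and 3, since document 3 also contains every query word. Getting document 2 alone would need something closer to an exact match on the whole field value, which is what `exact` imitates.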
SolrJ embedded server : error while adding document
Hi Solr guys,
I'm starting to play with Solr after a few years with classic Lucene. I'm trying to index a single document using the embedded server, but I get a strange error which looks like an XML parsing problem (see the trace below). To add details, this is a simple JUnit test which creates a single document and then passes it to the server in an ArrayList<SolrInputDocument>. The document only has 2 fields, id and text, as described in the configuration.
Jul 20, 2009 5:50:50 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: missing content stream
    at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:114)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:147)
    at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
    at org.weblab_project.services.solr.SolrComponent.flushIndexBuffer(SolrComponent.java:132)
    at org.weblab_project.services.solr.SolrComponentTest.testAddOneDocument(SolrComponentTest.java:66)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:154)
    at junit.framework.TestCase.runBare(TestCase.java:127)
    at junit.framework.TestResult$1.protect(TestResult.java:106)
    at junit.framework.TestResult.runProtected(TestResult.java:124)
    at junit.framework.TestResult.run(TestResult.java:109)
    at junit.framework.TestCase.run(TestCase.java:118)
    at junit.framework.TestSuite.runTest(TestSuite.java:208)
    at junit.framework.TestSuite.run(TestSuite.java:203)
    at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)
Jul 20, 2009 5:50:50 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=/update params={} status=500 QTime=6
Cannot flush the index buffer : Server error while adding documents
-- Gérard Dupont Information Processing Control and Cognition (IPCC) - EADS DS http://weblab-project.org Document Learning team - LITIS Laboratory
Re: SolrJ embedded server : error while adding document
My mistake: the problem was in the buffer I added myself. But it raises a question: does Solr (using the embedded server) have its own buffering mechanism for indexing or not? I guess not, but I might be wrong.
-- Gérard Dupont Information Processing Control and Cognition (IPCC) - EADS DS http://weblab-project.org Document Learning team - LITIS Laboratory
Re: SolrJ embedded server : error while adding document
On Mon, Jul 20, 2009 at 18:35, Ryan McKinley ryan...@gmail.com wrote:
> you send a bunch of requests with add( doc/collection ) and they are not visible until you send commit()
That's what I meant, thanks.
-- Gérard Dupont Information Processing Control and Cognition (IPCC) - EADS DS http://weblab-project.org Document Learning team - LITIS Laboratory
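The add-then-commit sequence described here can be sketched with the XML update messages that the update handler accepts. In SolrJ the corresponding calls are `add(...)` and `commit()` on the server object; the Python helpers below are hypothetical and only build the message bodies, without sending anything.

```python
import xml.etree.ElementTree as ET

def add_message(docs):
    """Serialize documents (dicts of field name -> value) as an <add> message."""
    add = ET.Element("add")
    for doc in docs:
        node = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(node, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")

def commit_message():
    """Serialize the commit message that makes earlier adds searchable."""
    return ET.tostring(ET.Element("commit"), encoding="unicode")

def indexing_session(batches):
    """Return the messages for several add batches followed by one commit."""
    messages = [add_message(batch) for batch in batches]
    messages.append(commit_message())
    return messages
```

Sending several add batches and a single commit at the end is the pattern the reply describes: nothing added becomes visible to searches until that final commit.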