Re: Storing queries in Solr

2012-10-08 Thread Gérard Dupont
Hi Jorge,

As far as I know, there is no built-in component for this in Solr (maybe in
the latest 4.1, which I have not explored in depth yet). However, I have done
it myself in the past using different approaches.

The first one is similar to Upayavira's suggestion and uses an independent
index where queries and clicks were stored in order to provide popular-query
suggestions and/or document suggestions. My second implementation used a
dedicated field in the original document index to hold the terms of the
queries that led to a click on each particular document (i.e. re-indexing the
document with a new field), and used this field for boosted terms and/or
document suggestions. However, this latter solution is unlikely to scale
well, especially if your document index is very dynamic (my particular case
relied on an almost static document repository).
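
As a rough illustration of the first approach, here is a minimal Python sketch
that aggregates logged (query, clicked document) events into one record per
query, ready to be sent to a separate index. The field names (id, query,
count, clicked_ids) and the sample events are assumptions made for this
example; they do not come from any Solr component.

```python
from collections import defaultdict


def build_query_log_docs(events):
    """Aggregate (query, clicked_doc_id) events into one record per query.

    The field names used here are hypothetical; adapt them to your schema.
    """
    counts = defaultdict(int)
    clicks = defaultdict(set)
    for query, clicked_doc_id in events:
        # Light normalisation: lower-case and collapse whitespace.
        key = " ".join(query.lower().split())
        counts[key] += 1
        if clicked_doc_id is not None:
            clicks[key].add(clicked_doc_id)
    # Most popular queries first; ties broken alphabetically.
    ordered = sorted(counts, key=lambda k: (-counts[k], k))
    return [
        {
            "id": key,
            "query": key,
            "count": counts[key],
            "clicked_ids": sorted(clicks[key]),
        }
        for key in ordered
    ]


# Made-up sample events: (query string, id of the clicked document or None).
events = [
    ("Solr suggester", "doc-12"),
    ("solr  suggester", None),
    ("query log", "doc-7"),
]
docs = build_query_log_docs(events)
```

Each resulting record can then be indexed like any other document, and the
count field used to rank popular-query suggestions.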

Finally, remember that exploiting queries and clicks may raise privacy
issues. Since you are storing their queries, warn your users appropriately.

br,

gdupont

On 8 October 2012 02:24, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu
 wrote:

 Hi!

 I was wondering if there is any built-in mechanism that allows me to store
 the queries made to a Solr server inside the index itself. I know that the
 suggester module exists, but as far as I know it only works for terms
 existing in the index, and not with queries. I remember reading about using
 some external program to parse the Solr log and push the queries or any
 other interesting data into the index. Is this the only way of accomplishing
 this?

 Greetings!



-- 
Gérard Dupont
Information Processing Control and Cognition (IPCC)
CASSIDIAN - an EADS company

Document & Learning team - LITIS Laboratory


Re: Advanced search in solr

2012-02-01 Thread Gérard Dupont
Hi Ramo,

The answer is yes. You just need to add a dedicated field, category, in which
you store the category of each item, and then issue a request like
[text:whatYouWant AND category:smartphone], thus getting all items that
contain whatYouWant and are in the category you picked.
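
A minimal Python sketch of such a request, assuming a Solr instance at
http://localhost:8983/solr with the standard /select handler and fields named
text and category (the host and the sample values are placeholders):

```python
from urllib.parse import urlencode


def category_search_url(base_url, text, category):
    """Build a select URL restricting a text search to one category.

    Only suitable for simple single-word values: terms containing spaces
    or query-syntax characters would need quoting or escaping first.
    """
    query = "text:{} AND category:{}".format(text, category)
    return base_url + "/select?" + urlencode({"q": query})


# Placeholder host and sample values.
url = category_search_url("http://localhost:8983/solr", "samsung", "smartphone")
print(url)
```

The same restriction could also be sent as a separate filter query (the fq
parameter) instead of being ANDed into q.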

cheers,

gdupont

On 1 February 2012 13:48, Ramo Karahasan ramo.karaha...@googlemail.com wrote:

 Hi Igor,

 I didn't read through the article, but currently I'm not using faceted
 search.

 I just want to query, for example, all products from category X with the
 name Samsung.

 I'll read this article this evening.

 Best regards,
 Ramo



Re: core creation and instanceDir parameter

2011-09-01 Thread Gérard Dupont
On 31 August 2011 20:27, Jaeger, Jay - DOT jay.jae...@dot.wi.gov wrote:

 Well, if it is for creating a *new* core, Solr doesn't know it is pointing
 to your shared conf directory until after you create it, does it?

 JRJ


Indeed, but the conf directory is not a problem for me. The thing is, I
would like to avoid sending the instance path.



Re: core creation and instanceDir parameter

2011-08-31 Thread Gérard Dupont
Up!

Does no one have any clue about this question? Is it more of a dev-related
question?

2011/8/26 Gérard Dupont ger.dup...@gmail.com

 Hi all,

 Playing with multicore and dynamic creation of new cores, I found out that
 there is one mandatory parameter, instanceDir, which is needed to find
 the location of solrconfig.xml and schema.xml. Since all my cores share
 the same configuration (found relative to the $SOLR_HOME defined on the
 server side) and all data is saved in the same folder (one sub-folder per
 core), I was wondering why we still need to send this parameter. In my
 configuration, I would like to avoid the client, which asks for core
 creation, needing to be aware of the instance location on the server.

 BTW I'm on solr 3.3.0

 Thanks for any advice.







Re: Solr and client app on same Jetty?

2011-08-26 Thread Gérard Dupont
Hi,

On 26 August 2011 16:23, Arcadius Ahouansou arcad...@menelic.com wrote:

 Hello.

 I have Solr running on Jetty and I also have a web client application
 running on another jetty instance on the same box.

 The question is: wouldn't it be better to run the client and solr on the
 very same jetty instance?


I don't have a clear performance benchmark on this, but I did not notice much
difference during tests with Jetty.


 I came across http://wiki.apache.org/solr/Solrj#EmbeddedSolrServer as
 well.

 The only drawback I can think of is that, in case we would like to scale and
 have 1 web app against 2 or 3 Solr servers, a code change will be needed.
 - Is there any other drawback in doing so?


We used the embedded server for a long time and moved to a standalone server
recently, since it should allow more flexibility and independence. Not many
code changes were needed.


 - more importantly, any performance or scalability issue?


A standalone server seems more efficient, and eventually you can make it scale
independently of your client. But it really depends on your needs. For the
small applications (1M documents and a few dozen users) we built over the last
few years, the embedded server was fine.



 Thanks.

 Arcadius.


cheers,



core creation and instanceDir parameter

2011-08-26 Thread Gérard Dupont
Hi all,

Playing with multicore and dynamic creation of new cores, I found out that
there is one mandatory parameter, instanceDir, which is needed to find
the location of solrconfig.xml and schema.xml. Since all my cores share
the same configuration (found relative to the $SOLR_HOME defined on the
server side) and all data is saved in the same folder (one sub-folder per
core), I was wondering why we still need to send this parameter. In my
configuration, I would like to avoid the client, which asks for core
creation, needing to be aware of the instance location on the server.
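
To make the request concrete, here is a minimal Python sketch of the kind of
CoreAdmin CREATE call in question. The action, name, instanceDir and dataDir
parameters are those of the CoreAdmin handler; the host and the two paths are
placeholders invented for this example. It also shows one possible workaround,
stated as an assumption rather than a Solr feature: keeping the paths in a
small server-side helper so that the client only supplies the core name.

```python
from urllib.parse import urlencode

# Hypothetical server-side settings; both paths are placeholders.
SHARED_INSTANCE_DIR = "/opt/solr/shared"
DATA_ROOT = "/opt/solr/data"


def create_core_url(solr_url, core_name):
    """Build a CoreAdmin CREATE URL; the caller only supplies the core name."""
    params = {
        "action": "CREATE",
        "name": core_name,
        # Shared configuration directory, known only on the server side.
        "instanceDir": SHARED_INSTANCE_DIR,
        # One data sub-folder per core.
        "dataDir": DATA_ROOT + "/" + core_name,
    }
    return solr_url + "/admin/cores?" + urlencode(params)


url = create_core_url("http://localhost:8983/solr", "core1")
print(url)
```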

BTW I'm on solr 3.3.0

Thanks for any advice.



date field

2009-09-08 Thread Gérard Dupont
Hi all,

I'm currently having a little difficulty indexing and searching on a date
field. The indexing is done the right way (I guess) and I can find valid dates
in the field, like 2009-05-01T12:45:32Z. However, when searching, users don't
always give an exact date. For instance, they give 2008-05-01 to get all
documents related to that day. I can do a trick using a wildcard, but is
there another way to do it? Moreover, if they give the full date string (or
if I hack the query parser) I can have the full syntax, but then the ':'
characters get in my way because the Lucene parser does not allow them without
quotes. Any ideas?
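
One possible way to handle both cases, sketched in Python under the
assumption that the field is called date: turn a day-only input into an
inclusive range query covering the whole day, and backslash-escape the colons
when an exact timestamp is given. The function names are made up for this
example.

```python
from datetime import datetime, timedelta

SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def day_range_query(field, day):
    """Turn a day such as '2008-05-01' into an inclusive range over that day.

    Works at one-second resolution: a value with a fractional second after
    23:59:59 would fall outside the range.
    """
    start = datetime.strptime(day, "%Y-%m-%d")
    end = start + timedelta(days=1) - timedelta(seconds=1)
    return "{}:[{} TO {}]".format(
        field, start.strftime(SOLR_DATE_FORMAT), end.strftime(SOLR_DATE_FORMAT)
    )


def exact_date_query(field, timestamp):
    """Escape the colons of a full timestamp for the Lucene query parser."""
    return "{}:{}".format(field, timestamp.replace(":", "\\:"))


print(day_range_query("date", "2008-05-01"))
# date:[2008-05-01T00:00:00Z TO 2008-05-01T23:59:59Z]
print(exact_date_query("date", "2009-05-01T12:45:32Z"))
# date:2009-05-01T12\:45\:32Z
```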

-- 
Gérard Dupont
Information Processing Control and Cognition (IPCC) - EADS DS
http://weblab.forge.ow2.org

Document & Learning team - LITIS Laboratory


Re: date field

2009-09-08 Thread Gérard Dupont
Thanks for the answer.

However, we don't have a strong performance issue (for now), and in that case,
how do you handle queries where the time part is missing?

On Tue, Sep 8, 2009 at 17:44, Silent Surfer silentsurfe...@yahoo.com wrote:

 Hi,

 If you have not gone live yet, I would suggest using a long field
 instead of a date field. According to our testing, searches based on date
 fields are very slow compared to searches based on long fields.

 You can use System.currentTimeMillis() to get the time.
 When showing it to the user, apply a date formatter.

 When taking input from the user, let him enter whatever date he wants
 and then convert it to a long and do your searches based on it.

 Experts can pitch in with any other ideas.

 Thanks,
 sS



Re: A very complex search problem.

2009-09-02 Thread Gérard Dupont
Hi,

The big OR query should be the easiest way, and it may work up to ~1000 users
(i.e. by default you can specify 1024 boolean clauses, so up to N users in the
OR, where N = 1024 - (number of boolean clauses in the rest of your query)).
You can increase this limit in the configuration, but I guess raising it too
much is painful. I know that colleagues of mine worked on Lucene with queries
of up to ~500 boolean clauses, with strict response time constraints and
indexes of many GB, and it was working fine. I guess Solr will work the same
way.
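
A minimal Python sketch of building such an OR clause while checking it
against the clause limit. The field name author_id and the function name are
assumptions for this example, and 1024 is the default limit mentioned above.

```python
MAX_BOOLEAN_CLAUSES = 1024  # default limit; can be raised in the configuration


def contacts_clause(field, contact_ids, other_clauses=0):
    """Build 'field:(id1 OR id2 ...)' restricting results to the given contacts.

    other_clauses is the number of boolean clauses already used by the
    rest of the query.
    """
    ids = [str(contact_id) for contact_id in contact_ids]
    if not ids:
        raise ValueError("at least one contact id is required")
    if len(ids) + other_clauses > MAX_BOOLEAN_CLAUSES:
        raise ValueError("too many boolean clauses for the configured limit")
    return "{}:({})".format(field, " OR ".join(ids))


print(contacts_clause("author_id", [3, 17, 42]))
# author_id:(3 OR 17 OR 42)
```

With 150 contacts, as in the example from the question, the clause stays well
below the default limit.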

On Wed, Sep 2, 2009 at 11:47, rajan chandi chandi.ra...@gmail.com wrote:

 Hi All,

 We are dealing with a very complex problem of person specific search.

 We're building a social network where people will post stuff and other
 users should be able to see the content only from their contacts.

 e.g. There are 10,000 users in the system and there are only 150 users in
 my network.
 I should be able to search across only those 150 users' content.

 Is there an easy way to approach this problem?

 We've come-up with different approaches:-


   - Storing the relationship in each document.
   - A huge ORed query with all the IDs of the people that need to be
   searched.
   - Creating a query and filtering the results based on the list of
   contacts.

 None of these approaches sounds plausible.

 We have already gone through the recently released book on Solr 1.4
 Enterprise Search. The book also doesn't seem to have any pointers.

 Any good approach/pointers will help.

 Thanks and regards
 Rajan Chandi




-- 
Gérard Dupont
Information Processing Control and Cognition (IPCC) - EADS DS
http://weblab-project.org

Document & Learning team - LITIS Laboratory


Re: Does the default operator affect phrase searching?

2009-09-02 Thread Gérard Dupont
Hi Dan,

Phrase search (i.e. using quotes) in Lucene does an exact match of your
expression, so if you type ["david pdf"] (the brackets are only there to
delimit the query in my mail) the system searches for a document that contains
the term 'david' and the term 'pdf' next to each other, in that order (well,
in the classic case; I suppose you don't have a specific query parser). Since
your corpus does not contain any document with "david pdf", the results are
empty. In any case, the defaultOperator has nothing to do with this. It only
comes into play if you do a query like ["david pdf" toto], which will then be
interpreted as ["david pdf" OR toto] (given that OR is the default operator).
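
The difference can be illustrated with a toy matcher in Python. This is not
Lucene code, only a simplified model with whitespace tokenization and
made-up documents, showing why a phrase can return nothing while the
individual terms do match.

```python
def tokens(text):
    """Very rough stand-in for an analyzer: lower-case, split on whitespace."""
    return text.lower().split()


def matches_phrase(doc, phrase):
    """True if the phrase terms appear next to each other, in order."""
    words, target = tokens(doc), tokens(phrase)
    return any(
        words[i:i + len(target)] == target
        for i in range(len(words) - len(target) + 1)
    )


def matches_terms(doc, query, operator):
    """True if all (AND) or any (OR) of the query terms appear in the doc."""
    words = set(tokens(doc))
    hits = [term in words for term in tokens(query)]
    return all(hits) if operator == "AND" else any(hits)


# Made-up corpus: no document contains 'david' directly followed by 'pdf'.
docs = ["david report", "manual pdf", "david notes pdf"]

phrase_hits = sum(matches_phrase(d, "david pdf") for d in docs)
and_hits = sum(matches_terms(d, "david pdf", "AND") for d in docs)
or_hits = sum(matches_terms(d, "david pdf", "OR") for d in docs)
print(phrase_hits, and_hits, or_hits)  # 0 1 3
```

The default operator only changes the AND and OR counts; the phrase count
stays the same whichever operator is configured.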

I don't know which legacy system you were using, but it may have a
completely different query syntax, so quotes are not interpreted in the same
way.

HTH

gd

On Wed, Sep 2, 2009 at 22:49, Dan A. Dickey dan.dic...@savvis.net wrote:

 I'm having a problem with doing a phrase search of "david pdf".
 When I search for just david, I get 7 hits.  When I search for pdf
 I get 73 hits.  On a legacy system, searching for "david pdf" I get
 78 hits.  And on Solr (1.4 - one of the nightly builds) - when searching
 for "david pdf" I get 0 hits.  I have the defaultOperator for my schema
 set to AND - could this be causing the problem?
 When I set {!lucene q.op=OR} in the query, I still get zero hits.

 Suggestions?  Is there any way to debug *why* something didn't hit?
 Or dump out what is contained in the index for one of the records?
 Thanks.
-Dan







Re: Does the default operator affect phrase searching?

2009-09-02 Thread Gérard Dupont

 Yes, it does - thanks!
 Back to translating legacy search queries into Solr search queries.  :)
 -Dan


Just curious: what legacy system is it?


Re: query in solr lucene

2009-07-28 Thread Gérard Dupont
Hi Sushan,

I'm not a Solr expert, just a beginner, but it appears to me that you may
have the default 'OR' combination of keywords, which would explain this
behavior. Try changing the configuration to an 'AND' combination.

cheers

On Tue, Jul 28, 2009 at 16:49, Sushan Rungta s...@clickindia.com wrote:

 I am extremely sorry for responding late, as I have been ill for the past few
 days.

 My problem is explained below with an example:

 I have three documents with the following content:

 1. Hello how are you
 2. Hello how are you sushan
 3. Hello how are you sushan. I am fine.

 When I search for the query "Hello how are you sushan", I should only get
 document 2 in my result.

 I hope this will give you all a better insight into my problem.

 regards,

 Sushan Rungta






SolrJ embedded server : error while adding document

2009-07-20 Thread Gérard Dupont
Hi SolR guys,

I'm starting to play with Solr after a few years with classic Lucene. I'm
trying to index a single document using the embedded server, but I get a
strange error which looks like an XML parsing problem (see the trace below).
To add details, this is a simple JUnit test which creates a single document
and then passes it to the server in an ArrayList<SolrInputDocument>. The
document only has 2 fields, id and text, as described in the configuration.

Jul 20, 2009 5:50:50 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: missing content stream
    at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:114)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:147)
    at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
    at org.weblab_project.services.solr.SolrComponent.flushIndexBuffer(SolrComponent.java:132)
    at org.weblab_project.services.solr.SolrComponentTest.testAddOneDocument(SolrComponentTest.java:66)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:154)
    at junit.framework.TestCase.runBare(TestCase.java:127)
    at junit.framework.TestResult$1.protect(TestResult.java:106)
    at junit.framework.TestResult.runProtected(TestResult.java:124)
    at junit.framework.TestResult.run(TestResult.java:109)
    at junit.framework.TestCase.run(TestCase.java:118)
    at junit.framework.TestSuite.runTest(TestSuite.java:208)
    at junit.framework.TestSuite.run(TestSuite.java:203)
    at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)

Jul 20, 2009 5:50:50 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=/update params={} status=500 QTime=6
Cannot flush the index buffer : Server error while adding documents



Re: SolrJ embedded server : error while adding document

2009-07-20 Thread Gérard Dupont
My mistake, the problem was with the buffer I added. But it raises a
question: does Solr (using the embedded server) have its own buffering
mechanism for indexing or not? I guess not, but I might be wrong.


Re: SolrJ embedded server : error while adding document

2009-07-20 Thread Gérard Dupont
On Mon, Jul 20, 2009 at 18:35, Ryan McKinley ryan...@gmail.com wrote:

 you send a bunch of requests with add( doc/collection ) and they are not
 visible until you send commit()


That's what I meant, thanks.
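
For readers not using SolrJ, the same add-then-commit sequence can be sketched
with Solr's XML update messages. The Python below only builds the two message
bodies that would be posted to the /update handler; the field names id and
text come from the thread, while the function names and sample values are
assumptions for this example.

```python
import xml.etree.ElementTree as ET


def add_message(docs):
    """Build an <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def commit_message():
    """Build the <commit> message that makes pending additions searchable."""
    return ET.tostring(ET.Element("commit"), encoding="unicode")


# Several add messages can be sent before a single commit.
first = add_message([{"id": "1", "text": "hello"}])
second = add_message([{"id": "2", "text": "world"}])
commit = commit_message()
print(first)
print(commit)
```

Documents from both add messages stay invisible to searches until the commit
message has been processed.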
