Re: How to group result when search on multiple fields

2011-01-27 Thread Stefan Matheis
On Thu, Jan 27, 2011 at 1:25 AM, cyang2010 ysxsu...@hotmail.com wrote:


 Is Field Collapsing a new feature for solr 4.0 (not yet released)?


That's at least what the Wiki tells you, yes.


Question About Writing Custom Query Parser Plugin

2011-01-27 Thread Ahson Iqbal
Hi All,

I want to integrate the Lucene Surround query parser with Solr 1.4.1, and for
that I am writing a custom query parser plugin. To accomplish this task I
should write a subclass of org.apache.solr.search.QParserPlugin and implement
its two methods:

public void init(NamedList nl)
public QParser createParser(String string, SolrParams sp, SolrParams sp1,
SolrQueryRequest sqr)

Now createParser should return an object of a subclass of
org.apache.solr.search.QParser, but I need a parser of type
org.apache.lucene.queryParser.surround.parser.QueryParser, which is not a
subclass of org.apache.solr.search.QParser.

My question is: should I write a subclass of org.apache.solr.search.QParser
that internally creates an
org.apache.lucene.queryParser.surround.parser.QueryParser and calls its parse
method? If so, how would the
org.apache.lucene.queryParser.surround.query.SrndQuery (returned by that
QueryParser) be mapped to the org.apache.lucene.search.Query that the parse
method of an org.apache.solr.search.QParser must return?

Thanx
Ahsan
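A sketch of that wrapping approach (not compiled against the Solr 1.4.1 / Lucene 2.9 jars, so treat the exact signatures as assumptions; the class name and the "df" parameter are made up for illustration). The key point is that SrndQuery is not itself a Lucene Query, but it can build one via its makeLuceneQueryField(fieldName, BasicQueryFactory) method:

```java
// Sketch only: a QParserPlugin that delegates to the surround parser.
public class SurroundQParserPlugin extends QParserPlugin {

  public void init(NamedList args) { }

  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    return new QParser(qstr, localParams, params, req) {
      public Query parse() throws ParseException {
        // 1. Parse the raw query string with the surround parser.
        SrndQuery srnd =
            org.apache.lucene.queryParser.surround.parser.QueryParser.parse(qstr);
        // 2. Map SrndQuery -> org.apache.lucene.search.Query against one field.
        //    "df" (default field) is an assumed request parameter here.
        String field = params.get("df");
        return srnd.makeLuceneQueryField(field, new BasicQueryFactory());
      }
    };
  }
}
```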


  

Re: Does solr supports indexing of files other than UTF-8

2011-01-27 Thread Paul Libbrecht
Why is converting documents to utf-8 not feasible?
Nowadays any platform offers such services.

Can you give a detailed failure description (maybe with the URL to a sample 
document you post)?

paul


On 27 Jan 2011 at 07:31, prasad deshpande wrote:
 I am able to successfully index/search non-English data (like Hebrew,
 Japanese) which was encoded in UTF-8.
 However, when I tried to index data encoded in a local encoding like
 Big5 for Japanese, I could not see the desired results.
 The contents after indexing looked garbled for the Big5-encoded document
 when I searched for all indexed documents.
 
 Converting a complete document to UTF-8 is not feasible.
 I am not very clear about how Solr supports these localizations with
 encodings other than UTF-8.
 
 
 I verified below links
 1. http://lucene.apache.org/java/3_0_3/api/all/index.html
 2.  http://wiki.apache.org/solr/LanguageAnalysis
 
 Thanks and Regards,
 Prasad



Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Paul Libbrecht
Simone,

It's good that you did so! I had found this three days ago while googling.
And I am starting to make sense of it. It works well.

Two little comments:

- you are saying that it packages a standalone multicore and a standalone app,
but it actually also packs a webapp.
  At first, I had rejected using that option because of the standalone output.
I think a webapp is more usable. Just a matter of formulation.

- I have found how to configure my schema and config, could add the velocity 
contrib to it, but I haven't yet found out how to add further resources. Both 
src/main/webapp and src/main/resources are ignored.

Help for the latter would be nice.

paul


On 27 Jan 2011 at 07:58, Simone Tripodi wrote:

 Hi all guys,
 this short mail is just to make the Maven/Solr communities aware that we
 published an Apache Maven archetype[1] (that we lazily called
 'solr-packager' :P) that helps Apache Solr developers create
 complete standalone Solr-based applications, embedded in Apache
 Tomcat, with few operations.
 We started developing it internally to reduce and help with the `ops`
 tasks; since it has been useful for us, we hope it can be for you too,
 so we decided to publish it as OSS.
 Questions, feedback, constructive criticism, ideas... are more than
 welcome; if interested, visit the GitHub[2] page.
 Have a nice day, all the best
 Simo
 
 [1] http://sourcesense.github.com/solr-packager/
 [2] https://github.com/sourcesense/solr-packager
 
 http://people.apache.org/~simonetripodi/
 http://www.99soft.org/



Re: Does solr supports indexing of files other than UTF-8

2011-01-27 Thread prasad deshpande
The docs can be huge; suppose there is an 800MB PDF file to index. I would
need to convert it to UTF-8 and then send the file for indexing. And any
number of clients may upload files at the same time, which would affect
performance. Also, our product already supports localization with local
encodings.

Thanks,
Prasad

On Thu, Jan 27, 2011 at 2:04 PM, Paul Libbrecht p...@hoplahup.net wrote:

 Why is converting documents to utf-8 not feasible?
 Nowadays any platform offers such services.

 Can you give a detailed failure description (maybe with the URL to a sample
 document you post)?

 paul






DismaxParser Query

2011-01-27 Thread Isan Fulia
Hi all,
The query for standard request handler is as follows
field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2) OR
field3:(keyword1 OR keyword2) AND field4:(keyword3 OR keyword4) AND
field5:(keyword5)


How can the same query be written for the dismax request handler?

-- 
Thanks & Regards,
Isan Fulia.


Re: Does solr supports indexing of files other than UTF-8

2011-01-27 Thread Paul Libbrecht
At least in Java, UTF-8 transcoding is done on a stream basis. No issue there.

paul
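To illustrate the point, a minimal stream-based transcoder in plain Java (the charset name and buffer size are arbitrary; memory use stays constant regardless of document size, so an 800MB file never has to be held in memory at once):

```java
import java.io.*;

// Transcode an InputStream from a source charset to UTF-8 without
// loading the whole document into memory.
public class Transcode {
    public static byte[] toUtf8(InputStream in, String srcCharset) {
        try {
            Reader reader = new BufferedReader(new InputStreamReader(in, srcCharset));
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            Writer writer = new OutputStreamWriter(bytes, "UTF-8");
            char[] buf = new char[8192];   // fixed-size buffer: O(1) memory
            for (int n; (n = reader.read(buf)) != -1; ) {
                writer.write(buf, 0, n);
            }
            writer.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In practice the ByteArrayOutputStream would be replaced by the HTTP request stream to Solr, so the transcoded bytes are never buffered either.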


On 27 Jan 2011 at 09:51, prasad deshpande wrote:

 The size of docs can be huge, like suppose there are 800MB pdf file to index
 it I need to translate it in UTF-8 and then send this file to index. Now
 suppose there can be any number of clients who can upload file. at that time
 it will affect performance. and already our product support localization
 with local encoding.
 
 Thanks,
 Prasad
 



Tika config in ExtractingRequestHandler

2011-01-27 Thread Erlend Garåsen


The wiki page for the ExtractingRequestHandler says that I can add the 
following configuration:

<str name="tika.config">/my/path/to/tika.config</str>

I have tried to google for an example of such a Tika config file, but 
haven't found anything.


Erlend

--
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050


Post PDF to solr with asp.net

2011-01-27 Thread Andrew McCombe
Hi

We are trying to post some PDF documents to solr for indexing using ASP.net
but cannot find any documentation or a library that will allow posting of
binary data.

Has anyone done this and if so, how?

Regards
Andrew McCombe
iWeb Solutions Ltd.


query range in multivalued date field

2011-01-27 Thread ramzesua

hi all. My range query on a multivalued date field works incorrectly.
In my schema there is a field requestDate with the multiValued attribute:
<fields>
   <field name="id" type="string" indexed="true" stored="true" required="true" />
   <field name="keyword" type="text" indexed="true" stored="true" />
   <field name="count" type="float" indexed="true" stored="true" />
   <field name="isResult" type="int" indexed="true" stored="true" default="0" multiValued="true" />
   <field name="requestDate" type="date" indexed="true" stored="true" multiValued="true" />
</fields>

Some data from the index:

<doc>
  <float name="count">2.0</float>
  <str name="id">sale</str>
  <arr name="isResult"><int>1</int><int>1</int></arr>
  <str name="keyword">sale</str>
  <arr name="requestDate"><date>2011-01-26T08:18:35Z</date><date>2011-01-27T01:31:28Z</date></arr>
</doc>
<doc>
  <float name="count">3.0</float>
  <str name="id">coldpop</str>
  <arr name="isResult"><int>1</int><int>1</int><int>1</int></arr>
  <str name="keyword">cold pop</str>
  <arr name="requestDate"><date>2011-01-27T01:30:01Z</date><date>2011-01-27T01:32:01Z</date><date>2011-01-27T01:32:18Z</date></arr>
</doc>

I try to search some docs where date is in some range, for example,
http://localhost:8983/request/select?q=requestDate:[NOW/HOUR-1HOUR TO
NOW/HOUR]
There are no results. After some analysis, I saw that this range only matches
the first item in the requestDate field and does not filter on the other
items. Where is my mistake? Or can't Solr filter multivalued date fields?
Thanks
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/query-range-in-multivalued-date-field-tp2361292p2361292.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: DismaxParser Query

2011-01-27 Thread lee carroll
use dismax q for the first three fields and a filter query for the 4th and 5th
fields, so:
q=keyword1 keyword2
qf=field1 field2 field3
pf=field1 field2 field3
mm=something sensible for you
defType=dismax
fq=field4:(keyword3 OR keyword4) AND field5:(keyword5)

take a look at the dismax docs for extra params
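As a sketch, the same defaults could also be baked into a handler in solrconfig.xml (the handler name and mm value are made up; note that qf/pf take whitespace-separated field lists, optionally with boosts such as field1^2):

```xml
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">field1 field2 field3</str>
    <str name="pf">field1 field2 field3</str>
    <!-- mm=1: at least one clause must match, i.e. OR-like behaviour -->
    <str name="mm">1</str>
    <str name="fq">field4:(keyword3 OR keyword4) AND field5:(keyword5)</str>
  </lst>
</requestHandler>
```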



On 27 January 2011 08:52, Isan Fulia isan.fu...@germinait.com wrote:

 Hi all,
 The query for standard request handler is as follows
 field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2) OR
 field3:(keyword1 OR keyword2) AND field4:(keyword3 OR keyword4) AND
 field5:(keyword5)


 How the same above query can be written for dismax request handler

 --
 Thanks  Regards,
 Isan Fulia.



Re: DismaxParser Query

2011-01-27 Thread Isan Fulia
but q=keyword1 keyword2 does an AND operation, not OR

On 27 January 2011 16:22, lee carroll lee.a.carr...@googlemail.com wrote:

 use dismax q for first three fields and a filter query for the 4th and 5th
 fields
 so
 q=keyword1 keyword 2
 qf = field1,feild2,field3
 pf = field1,feild2,field3
 mm=something sensible for you
 defType=dismax
 fq= field4:(keyword3 OR keyword4) AND field5:(keyword5)

 take a look at the dismax docs for extra params







-- 
 Thanks & Regards,
Isan Fulia.


DIH and duplicate content

2011-01-27 Thread Rosa (Anuncios)

Hi,

Is there a way to avoid duplicate content in an index at the moment I'm
uploading my XML feed via DIH?

I would like to have only one entry for a given description. I mean, if
the description of one product already exists in the index, don't import
this new product.

Is there a built-in function? Or any hack?

thanks for your help

Rosa



Re: DismaxParser Query

2011-01-27 Thread lee carroll
the default operator can be set in your config to OR, or on the query with
something like q.op=OR



On 27 January 2011 11:26, Isan Fulia isan.fu...@germinait.com wrote:

 but q=keyword1 keyword2  does AND operation  not OR




Re: DismaxParser Query

2011-01-27 Thread Bijeet Singh
The DisMax query parser internally hard-codes its operator to OR.
This is quite unlike the Lucene query parser, for which the default operator
can be configured using the solrQueryParser in schema.xml

Regards,

Bijeet Singh

On Thu, Jan 27, 2011 at 4:56 PM, Isan Fulia isan.fu...@germinait.comwrote:

 but q=keyword1 keyword2  does AND operation  not OR




Re: DismaxParser Query

2011-01-27 Thread lee carroll
sorry, ignore that - we are on dismax here. Look at the mm param in the docs;
you can set this to achieve what you need

On 27 January 2011 11:34, lee carroll lee.a.carr...@googlemail.com wrote:

 the default operation can be set in your config to be or or on the query
 something like q.op=OR








Re: DIH and duplicate content

2011-01-27 Thread Markus Jelsma
http://wiki.apache.org/solr/Deduplication
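From that wiki page, the usual setup is a SignatureUpdateProcessorFactory chain in solrconfig.xml; a sketch keyed on the description field (the chain and field names are examples, and the signature field must also be declared in schema.xml):

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <!-- overwriteDupes=true: a new doc with the same signature replaces the old one -->
    <bool name="overwriteDupes">true</bool>
    <str name="fields">description</str>
    <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```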


On Thursday 27 January 2011 12:32:29 Rosa (Anuncios) wrote:
 Is there a way to avoid duplicate content in a index at the moment i'm 
 uploading my xml feed via DIH?
 
 I would like to have only one entry for a given description. I mean if 
 the desciption of one product already exist in index not import this new 
 product.

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Simone Tripodi
Hi Paul,
thanks a lot for your feedback, much more than appreciated! :)

Going through your comments:

 * Yes it also packs a Solr webepp, it is needed to embed it in
Tomcat. Do you think it could be a useful feature having also webapp
.war as output? if it helps, I'm open to add it as well.

 * src/main/webapp and src/main/resources are ignored because I didn't
use the war plugin, everything is configured in the assembly
descriptor ATM. As a workaround, you can add resources on src/solr/*
subdirectory and it will be included in the webapp; when the war
plugin will be plugged (previous comment), that issue should be
solved.

Can you tell me a little more about the velocity contrib, please? In
the multicore, I'd like solr.xml to be generated at build time by
analyzing the dependencies, but I didn't figure out how to do it.
Many thanks in advance!

http://people.apache.org/~simonetripodi/
http://www.99soft.org/



On Thu, Jan 27, 2011 at 9:49 AM, Paul Libbrecht p...@hoplahup.net wrote:
 Simone,

 It's good that you did so! I had found this three days ago while googling.
 And I am starting to make sense of it. It works well.

 Two little comments:

 - you are saying that it packages a standalone multicore and a standalone 
 app. But it actually also packs a webapp.
  At first, I had rejected using that option because of the standalone output. 
 I think a  webapp is more usable. Just a matter of formulation

 - I have found how to configure my schema and config, could add the velocity 
 contrib to it, but I haven't yet found out how to add further resources. Both 
 src/main/webapp and src/main/resources are ignored.

 Help for the latter would be nice.

 paul






Re: configure httpclient to access solr with user credential on third party host

2011-01-27 Thread Upayavira
Looks like you are connecting to Tomcat's AJP port, not the HTTP one.
Connect to the Tomcat HTTP port and I suspect you'll have greater
success.

Upayavira

On Wed, 26 Jan 2011 22:45 -0800, Darniz rnizamud...@edmunds.com
wrote:
 
 Hello,
 i uploaded solr.war file on my hosting provider and added security
 constraint in web.xml file on my solr war so that only specific user with
 a
 certain role can issue get and post request. When i open browser and type
 www.maydomainname.com/solr i get a dialog box to enter userid and
 password.
 No issues until now.
 
 Now the issue is that i have one more app  on the same tomcat container
 which will index document into solr. In order for this app to issue post
 request it has to configure the http client credentials. I checked with
 my
 hosting service and they told me at tomcat is running on port 8834 since
 apache is sitting in the front, the below is the code snipped i use to
 set
 http credentials.
 
 CommonsHttpSolrServer server =
     new CommonsHttpSolrServer("http://localhost:8834/solr");
 Credentials defaultcreds = new UsernamePasswordCredentials("solr", "solr");
 server.getHttpClient().getState().setCredentials(
     new AuthScope("localhost", 8834, AuthScope.ANY_REALM), defaultcreds);
 
 i am getting the following error, any help will be appreciated.
 ERROR TP-Processor9 org.apache.jk.common.MsgAjp - BAD packet signature
 20559
 ERROR TP-Processor9 org.apache.jk.common.ChannelSocket - Error,
 processing
 connection
 java.lang.IndexOutOfBoundsException
 at java.io.BufferedInputStream.read(BufferedInputStream.java:310)
 at
 org.apache.jk.common.ChannelSocket.read(ChannelSocket.java:621)
 at
 org.apache.jk.common.ChannelSocket.receive(ChannelSocket.java:578)
 at
 org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:686)
 at
 org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891)
 at
 org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
 at java.lang.Thread.run(Thread.java:619)
 
 
 -- 
 View this message in context:
 http://lucene.472066.n3.nabble.com/configure-httpclient-to-access-solr-with-user-credential-on-third-party-host-tp2360364p2360364.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source



Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Paul Libbrecht

On 27 Jan 2011 at 12:42, Simone Tripodi wrote:
 thanks a lot for your feedbacks, much more than appreciated! :)

Good time sync. I need it right now.

 * Yes it also packs a Solr webepp, it is needed to embed it in
 Tomcat. Do you think it could be a useful feature having also webapp
 .war as output? if it helps, I'm open to add it as well.

I feel so.
Or at least say that it's a side production even if it's not an individual goal.

 * src/main/webapp and src/main/resources are ignored because I didn't
 use the war plugin, everything is configured in the assembly
 descriptor ATM. As a workaround, you can add resources on src/solr/*
 subdirectory and it will be included in the webapp;

But only in WEB-INF/classes... that doesn't seem right to be served as a static 
resource (I'm looking at css or js files).

 when the war plugin will be plugged (previous comment), that issue should be
 solved.

Any time estimate?

 Can you tell me a little more about the velocity contrib, please?

I added the dependency.
I copied in src/main/solr/commons the velocity config files.

I note that I had to deactivate the query-elevation which seems to expect a 
solr-home.

 In the multicore, I'd like the solr.xml will be generated during the
 build-time analyzing the dependencies but I didn't figure out how to
 do it. Many thanks in advance!

I should also say. At first I tried the multicore one and it failed on me... 
not too sure why but it did not have sufficient output.

paul

Re: query range in multivalued date field

2011-01-27 Thread Erick Erickson
Range queries work on multivalued fields. I suspect the date math
conversion is fooling you. For instance, NOW/HOUR-1HOUR first rounds down
to the current hour, *then* subtracts one hour.

If you attach debugQuery=on (or check the debug checkbox
in the admin full search page), you'll see the exact results of
the conversion, that may help.

Best
Erick
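Erick's rounding rule can be reproduced with plain java.time for a fixed instant (this paraphrases Solr's date-math semantics, so verify against the debugQuery output):

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Mimic Solr date math for a fixed "now":
//   NOW/HOUR        -> round down to the hour
//   NOW/HOUR-1HOUR  -> round down first, then subtract one hour
public class DateMath {
    public static Instant roundToHour(Instant now) {
        return now.truncatedTo(ChronoUnit.HOURS);          // NOW/HOUR
    }
    public static Instant hourWindowStart(Instant now) {
        return roundToHour(now).minus(1, ChronoUnit.HOURS); // NOW/HOUR-1HOUR
    }
}
```

So at 10:47 the range [NOW/HOUR-1HOUR TO NOW/HOUR] is [09:00 TO 10:00], and documents stamped between 10:00 and 10:47 fall outside it, which can look like multivalued fields not being matched.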

On Thu, Jan 27, 2011 at 5:15 AM, ramzesua michaelnaza...@gmail.com wrote:





Re: DismaxParser Query

2011-01-27 Thread Isan Fulia
It worked by making mm=0 (it acted as an OR operator),
but how to handle this:

field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4))




On 27 January 2011 17:06, lee carroll lee.a.carr...@googlemail.com wrote:

 sorry ignore that - we are on dismax here - look at mm param in the docs
 you can set this to achieve what you need

 
 




-- 
 Thanks & Regards,
Isan Fulia.


Re: How to find Master Slave are in sync

2011-01-27 Thread Shanmugavel SRD

Markus,
  The problem here is that if I call the two URLs below immediately after
replication, I get the same index version from both. In my Python script I
have added code to swap the online core on the master with the offline core
on the master, and the online core on the slave with the offline core on the
slave, if both versions are the same. After calling swap, I get an error in
the slave's log like the one below.
  So I am confused about why this is happening. Can you please help me?

 http://master_host:port/solr/replication?command=indexversion
 http://slave_host:port/solr/replication?command=details


2011-01-27 07:45:26,713 WARN  [org.apache.solr.handler.SnapPuller]
(Thread-59) No content recieved for file: {size=154098810, name=_e3.cfx,
lastmodified=1296132092000}
2011-01-27 07:45:27,396 ERROR [org.apache.solr.handler.ReplicationHandler]
(Thread-59) SnapPull failed

org.apache.solr.common.SolrException: Unable to download _e3.cfx
completely. Downloaded 0!=154098810
at
org.apache.solr.handler.SnapPuller$FileFetcher.cleanup(SnapPuller.java:1026)
at
org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:906)
at
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:541)
at
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:294)

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-find-Master-Slave-are-in-sync-tp2287014p2362679.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Post PDF to solr with asp.net

2011-01-27 Thread Gora Mohanty
On Thu, Jan 27, 2011 at 3:44 PM, Andrew McCombe eupe...@gmail.com wrote:
 Hi

 We are trying to post some PDF documents to solr for indexing using ASP.net
 but cannot find any documentation or a library that will allow posting of
 binary data.
[...]

Do not have much idea of ASP.net, but SolrNet
( http://code.google.com/p/solrnet/ ) seems to be
one such library.

Also, one can use Solr's web interface to POST
documents. Please see
http://wiki.apache.org/solr/UpdateXmlMessages
and a shell script example included as
example/exampledocs/post.sh in the Solr source code

Regards,
Gora
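Not an ASP.net answer, but for reference the wire format any HTTP client must produce for a binary upload is plain multipart/form-data. A small Java sketch of building such a body (the boundary, field and file names are arbitrary; the target would typically be an update handler URL such as /solr/update/extract):

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

// Build a multipart/form-data body containing one binary file part --
// the same bytes a .NET client would have to put on the wire.
public class Multipart {
    public static byte[] body(String boundary, String fieldName,
                              String fileName, byte[] fileBytes) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            Writer w = new OutputStreamWriter(out, StandardCharsets.US_ASCII);
            w.write("--" + boundary + "\r\n");
            w.write("Content-Disposition: form-data; name=\"" + fieldName
                    + "\"; filename=\"" + fileName + "\"\r\n");
            w.write("Content-Type: application/pdf\r\n\r\n");
            w.flush();                 // flush headers before the raw bytes
            out.write(fileBytes);      // binary payload, written unencoded
            w.write("\r\n--" + boundary + "--\r\n");
            w.flush();
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The request itself then just needs a Content-Type header of `multipart/form-data; boundary=...` with the same boundary string.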


Re: DismaxParser Query

2011-01-27 Thread lee carroll
with dismax you get to say things like "match all terms if fewer than 3 terms
are entered, else match term-x".
It produces highly flexible and relevant matches and works very well in lots
of common search use cases; field boosting allows further tuning.

If you have rigid rules like the last one you quote, I don't think dismax is
for you. Although I might be wrong, and someone might be able to help.



On 27 January 2011 13:32, Isan Fulia isan.fu...@germinait.com wrote:

 It worked by making mm=0 (it acted as OR operator)
 but how to handle this

 field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
 field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
 field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4))







Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Dennis Schafroth
Hi, 

Pretty novice at SOLR coding, but looking for hints about how (if not already 
done) to implement a PatternTokenizer that would index this into multivalued 
fields of solr.StrField for faceting. Ex. 

Water -- Irrigation ; Water -- Sewage

should be tokenized into 

Water
Irrigation
Sewage

in multi-valued non-tokenized fields, due to performance. I could do it from the 
outside, but I would see this as an opportunity to learn about SOLR.

It works as I want with the PatternTokenizerFactory when I am using 
solr.TextField, but not when I am using the non-tokenized solr.StrField. But 
according to what I have read, facet performance is better on non-tokenized 
fields. We need better performance on our faceted searches on these multi-value 
fields.  (25 million documents, three multi-valued facets)

I would also need to have a filter that filters out identical values, as the 
feeds have redundant data as shown above.

Can anyone point me in the right direction..

cheers, 
:-Dennis 

Re: How to find Master Slave are in sync

2011-01-27 Thread Erick Erickson
Let's back up a moment and ask why you are doing this from scripts,
because this feels like an XY problem, see:
http://people.apache.org/~hossman/#xyproblem
What are you trying to accomplish by swapping cores on the master
and slave?

Solr 1.4 has configuration-based replication, are you using 1.4? This
version of Solr automatically, upon replication, switches to the updated
index.

You can trigger a replication either by configuring the polling interval on
the
slave or by sending the proper HTTP request to the slave. See:
http://wiki.apache.org/solr/SolrReplication#HTTP_API

So, it seems like taking charge of swapping cores may be more work
than you really need to do.

Of course, if you're on a different version of Solr, this is irrelevant.

Best
Erick

On Thu, Jan 27, 2011 at 8:38 AM, Shanmugavel SRD
srdshanmuga...@gmail.comwrote:


 Markus,
  The problem here is that if I call the below two URLs immediately after
 replication, I get the same index version from both. In my Python
 script I have added code to swap the online core on the master with the
 offline core on the master, and the online core on the slave with the
 offline core on the slave, if both versions are the same. After calling
 swap, I get the error below in the slave's log.
  So I am confused why this is happening. Can you please help me on this?

  http://master_host:port/solr/replication?command=indexversion
  http://slave_host:port/solr/replication?command=details


 2011-01-27 07:45:26,713 WARN  [org.apache.solr.handler.SnapPuller]
 (Thread-59) No content recieved for file: {size=154098810, name=_e3.cfx,
 lastmodified=1296132092000}
 2011-01-27 07:45:27,396 ERROR [org.apache.solr.handler.ReplicationHandler]
 (Thread-59) SnapPull failed

org.apache.solr.common.SolrException: Unable to download _e3.cfx
 completely. Downloaded 0!=154098810
at

 org.apache.solr.handler.SnapPuller$FileFetcher.cleanup(SnapPuller.java:1026)
at

 org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:906)
at
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:541)
at
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:294)

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/How-to-find-Master-Slave-are-in-sync-tp2287014p2362679.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: DismaxParser Query

2011-01-27 Thread Erick Erickson
What version of Solr are you using, and could you consider either 3x or
applying a patch to 1.4.1? Because eDismax (extended dismax) handles the
full Lucene query language and probably works here. See the Solr
JIRA 1553 at https://issues.apache.org/jira/browse/SOLR-1553

Best
Erick

On Thu, Jan 27, 2011 at 8:32 AM, Isan Fulia isan.fu...@germinait.comwrote:

 It worked by making mm=0 (it acted as OR operator)
 but how to handle this

 field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
 field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
 field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4))




 On 27 January 2011 17:06, lee carroll lee.a.carr...@googlemail.com
 wrote:

  sorry ignore that - we are on dismax here - look at mm param in the docs
  you can set this to achieve what you need
 
  On 27 January 2011 11:34, lee carroll lee.a.carr...@googlemail.com
  wrote:
 
   the default operation can be set in your config to be or or on the
  query
   something like q.op=OR
  
  
  
   On 27 January 2011 11:26, Isan Fulia isan.fu...@germinait.com wrote:
  
   but q=keyword1 keyword2  does AND operation  not OR
  
   On 27 January 2011 16:22, lee carroll lee.a.carr...@googlemail.com
   wrote:
  
use dismax q for first three fields and a filter query for the 4th
 and
   5th
fields
so
q=keyword1 keyword 2
qf = field1,feild2,field3
pf = field1,feild2,field3
mm=something sensible for you
defType=dismax
fq= field4:(keyword3 OR keyword4) AND field5:(keyword5)
   
take a look at the dismax docs for extra params
   
   
   
On 27 January 2011 08:52, Isan Fulia isan.fu...@germinait.com
  wrote:
   
 Hi all,
 The query for standard request handler is as follows
 field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2) OR
 field3:(keyword1 OR keyword2) AND field4:(keyword3 OR keyword4)
 AND
 field5:(keyword5)


 How the same above query can be written for dismax request handler

 --
 Thanks  Regards,
 Isan Fulia.

   
  
  
  
   --
   Thanks  Regards,
   Isan Fulia.
  
  
  
 



 --
 Thanks  Regards,
 Isan Fulia.



AW: DismaxParser Query

2011-01-27 Thread Daniel Pötzinger
It may also be an option to mix the query parsers?
Something like this (not tested):

q={!lucene}field1:test OR field2:test2 _query_:{!dismax qf=fields}+my dismax 
-bad

So you have the benefits of lucene and dismax parser

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Thursday, 27 January 2011 15:15
To: solr-user@lucene.apache.org
Subject: Re: DismaxParser Query

What version of Solr are you using, and could you consider either 3x or
applying a patch to 1.4.1? Because eDismax (extended dismax) handles the
full Lucene query language and probably works here. See the Solr
JIRA 1553 at https://issues.apache.org/jira/browse/SOLR-1553

Best
Erick

On Thu, Jan 27, 2011 at 8:32 AM, Isan Fulia isan.fu...@germinait.comwrote:

 It worked by making mm=0 (it acted as OR operator)
 but how to handle this

 field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
 field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
 field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4))




 On 27 January 2011 17:06, lee carroll lee.a.carr...@googlemail.com
 wrote:

  sorry ignore that - we are on dismax here - look at mm param in the docs
  you can set this to achieve what you need
 
  On 27 January 2011 11:34, lee carroll lee.a.carr...@googlemail.com
  wrote:
 
   the default operation can be set in your config to be or or on the
  query
   something like q.op=OR
  
  
  
   On 27 January 2011 11:26, Isan Fulia isan.fu...@germinait.com wrote:
  
   but q=keyword1 keyword2  does AND operation  not OR
  
   On 27 January 2011 16:22, lee carroll lee.a.carr...@googlemail.com
   wrote:
  
use dismax q for first three fields and a filter query for the 4th
 and
   5th
fields
so
q=keyword1 keyword 2
qf = field1,feild2,field3
pf = field1,feild2,field3
mm=something sensible for you
defType=dismax
fq= field4:(keyword3 OR keyword4) AND field5:(keyword5)
   
take a look at the dismax docs for extra params
   
   
   
On 27 January 2011 08:52, Isan Fulia isan.fu...@germinait.com
  wrote:
   
 Hi all,
 The query for standard request handler is as follows
 field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2) OR
 field3:(keyword1 OR keyword2) AND field4:(keyword3 OR keyword4)
 AND
 field5:(keyword5)


 How the same above query can be written for dismax request handler

 --
 Thanks  Regards,
 Isan Fulia.

   
  
  
  
   --
   Thanks  Regards,
   Isan Fulia.
  
  
  
 



 --
 Thanks  Regards,
 Isan Fulia.



Detect Out of Memory Errors

2011-01-27 Thread saureen

Hi,

is there a way by which I could detect out-of-memory errors in Solr, so
that I could implement some functionality such as restarting Tomcat or
alerting myself via email whenever such an error is detected?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Detect-Out-of-Memory-Errors-tp2362872p2362872.html
Sent from the Solr - User mailing list archive at Nabble.com.
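Solr itself does not expose such a hook, but two generic JVM-level options exist: the HotSpot flag -XX:OnOutOfMemoryError=... to run an external command, or a JMX memory-threshold listener that fires before the heap is exhausted. Below is a minimal sketch of the latter using only the standard java.lang.management API; the alert Runnable is a placeholder for email/restart logic, and the 90% threshold is an arbitrary choice:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

public class LowMemoryWatcher {

    /** Fires the given alert once any heap pool crosses 90% of its max size. */
    public static void install(final Runnable alert) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            long max = pool.getUsage() == null ? -1 : pool.getUsage().getMax();
            if (pool.getType() == MemoryType.HEAP
                    && pool.isUsageThresholdSupported() && max > 0) {
                pool.setUsageThreshold((long) (max * 0.9));
            }
        }
        // The MemoryMXBean emits a JMX notification when a threshold is crossed.
        NotificationEmitter emitter =
                (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(new NotificationListener() {
            public void handleNotification(Notification n, Object handback) {
                if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED
                        .equals(n.getType())) {
                    alert.run(); // e.g. send mail or trigger a Tomcat restart
                }
            }
        }, null, null);
    }
}
```

Whether the alert fires before an actual OutOfMemoryError depends on allocation and GC behaviour, so treat this as best-effort.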


Re: Question About Writing Custom Query Parser Plugin

2011-01-27 Thread Ahsan |qbal
Anyone?

On Thu, Jan 27, 2011 at 1:27 PM, Ahson Iqbal mianah...@yahoo.com wrote:

 Hi All

 I want to integrate lucene Surround Query Parser with solr 1.4.1, and for
 that I
 am writing Custom Query Parser Plugin, To accomplish this task I should
 write a
 sub class of org.apache.solr.search.QParserPlugin and implement its two
 methods

 public void init(NamedList nl)
 public QParser createParser(String string, SolrParams sp, SolrParams sp1,
 SolrQueryRequest sqr)

 now here createParser should return an object of a subclass of
 org.apache.solr.search.QParser, but I need a parser of type
 org.apache.lucene.queryParser.surround.parser.QueryParser which is not a
 subclass of org.apache.solr.search.QParser

 Now my question is should I write a sub class
 of org.apache.solr.search.QParser and internally create an object
 of org.apache.lucene.queryParser.surround.parser.QueryParser and call its
 parse method? if so how the mapping
 org.apache.lucene.queryParser.surround.query.SrndQuery (that is
 returned org.apache.lucene.queryParser.surround.parser.QueryParser )
 would be
 done with org.apache.lucene.search.Query (that should be returned from
 parse
 method of a query parser of type org.apache.solr.search.QParser)

 Thanx
 Ahsan





Re: Question About Writing Custom Query Parser Plugin

2011-01-27 Thread Erik Hatcher
Yes, you need to create both a QParserPlugin and a QParser implementation.  
Look at Solr's own source code for the LuceneQParserPlugin/LuceneQParser and 
build it like that.

Baking the surround query parser into Solr out of the box would be a useful 
contribution, so if you care to give it a little bit of polish/unit testing and 
submit a patch, the community would be thankful :)

Erik
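
A rough, untested sketch of what that pairing could look like (error handling kept minimal). Note that SrndQuery.makeLuceneQueryField is what maps a surround SrndQuery to a plain Lucene Query, which answers the mapping question in the original post:

```java
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.surround.parser.QueryParser;
import org.apache.lucene.queryParser.surround.query.BasicQueryFactory;
import org.apache.lucene.queryParser.surround.query.SrndQuery;
import org.apache.lucene.search.Query;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;

public class SurroundQParserPlugin extends QParserPlugin {

    public void init(NamedList args) {}

    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
        return new QParser(qstr, localParams, params, req) {
            public Query parse() throws ParseException {
                try {
                    // The surround parser produces a SrndQuery ...
                    SrndQuery sq = QueryParser.parse(getString());
                    // ... which is mapped to a plain Lucene Query here.
                    String field = getReq().getSchema().getDefaultSearchFieldName();
                    return sq.makeLuceneQueryField(field, new BasicQueryFactory());
                } catch (org.apache.lucene.queryParser.surround.parser.ParseException e) {
                    throw new ParseException(e.getMessage());
                }
            }
        };
    }
}
```

Register it via a queryParser element in solrconfig.xml and select it per-request with defType.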

On Jan 27, 2011, at 03:27 , Ahson Iqbal wrote:

 Hi All
 
 I want to integrate lucene Surround Query Parser with solr 1.4.1, and for 
 that I 
 am writing Custom Query Parser Plugin, To accomplish this task I should write 
 a 
 sub class of org.apache.solr.search.QParserPlugin and implement its two 
 methods 
 
 public void init(NamedList nl)
 public QParser createParser(String string, SolrParams sp, SolrParams sp1, 
 SolrQueryRequest sqr)
 
 now here createParser should return an object of a subclass of 
 org.apache.solr.search.QParser, but I need a parser of type 
 org.apache.lucene.queryParser.surround.parser.QueryParser which is not a 
 subclass of org.apache.solr.search.QParser
 
 Now my question is should I write a sub class 
 of org.apache.solr.search.QParser and internally create an object 
 of org.apache.lucene.queryParser.surround.parser.QueryParser and call its 
 parse method? if so how the mapping 
 org.apache.lucene.queryParser.surround.query.SrndQuery (that is 
 returned org.apache.lucene.queryParser.surround.parser.QueryParser ) would 
 be 
 done with org.apache.lucene.search.Query (that should be returned from 
 parse 
 method of a query parser of type org.apache.solr.search.QParser)
 
 Thanx 
 Ahsan
 
 



Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Paul Libbrecht

On 27 Jan 2011, at 12:42, Simone Tripodi wrote:
 thanks a lot for your feedbacks, much more than appreciated! :)

One more anomaly I find: the license is in the output of the pom.xml.
I think this should not be the case.
*my* license should be there, not the license of the archetype. Or?

paul

Re: Tika config in ExtractingRequestHandler

2011-01-27 Thread Adam Estrada
I believe that as long as Tika is included in a folder that is
referenced by solrconfig.xml you should be good. Solr will
automatically throw mime types to Tika for parsing. Can anyone else
add to this?

Thanks,
Adam

On Thu, Jan 27, 2011 at 5:06 AM, Erlend Garåsen e.f.gara...@usit.uio.no wrote:

 The wiki page for the ExtractingRequestHandler says that I can add the
 following configuration:
 <str name="tika.config">/my/path/to/tika.config</str>

 I have tried to google for an example of such a Tika config file, but
 haven't found anything.

 Erlend

 --
 Erlend Garåsen
 Center for Information Technology Services
 University of Oslo
 P.O. Box 1086 Blindern, N-0317 OSLO, Norway
 Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050



Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Simone Tripodi
Hi Paul,
sorry I'm late but I've been in the middle of a conf call :( On which
IRC server is the #solr channel? I'll reach you ASAP.
Thanks a lot!
Simo

http://people.apache.org/~simonetripodi/
http://www.99soft.org/



On Thu, Jan 27, 2011 at 4:00 PM, Paul Libbrecht p...@hoplahup.net wrote:

 On 27 Jan 2011, at 12:42, Simone Tripodi wrote:
 thanks a lot for your feedbacks, much more than appreciated! :)

 One more anomaly I find: the license is in the output of the pom.xml.
 I think this should not be the case.
 *my* license should be there, not the license of the archetype. Or?

 paul


Re: A Maven archetype that helps packaging Solr as a standalone application embedded in Apache Tomcat

2011-01-27 Thread Stefan Matheis
Simo, it's freenode.net

On Thu, Jan 27, 2011 at 4:16 PM, Simone Tripodi simonetrip...@apache.orgwrote:

 Hi Paul,
 sorry I'm late but I've been in the middle of a conf call :( On which
 IRC server the #solr channel is? I'll reach you ASAP.
 Thanks a lot!
 Simo

 http://people.apache.org/~simonetripodi/
 http://www.99soft.org/



 On Thu, Jan 27, 2011 at 4:00 PM, Paul Libbrecht p...@hoplahup.net wrote:
 
  On 27 Jan 2011, at 12:42, Simone Tripodi wrote:
  thanks a lot for your feedbacks, much more than appreciated! :)
 
  One more anomaly I find: the license is in the output of the pom.xml.
  I think this should not be the case.
  *my* license should be there, not the license of the archetype. Or?
 
  paul



RE: DismaxParser Query

2011-01-27 Thread Jonathan Rochkind
Yes, I think nested queries are the only way to do that, and yes, nested 
queries like Daniel's example work (I've done it myself).  I haven't really 
tried to get into understanding/demonstrating _exactly_ how the relevance ends 
up working on the overall master query in such a situation, but it sort of 
works. 

(Just note that Daniel's example isn't quite right, I think you need double 
quotes for the nested _query_, just check the wiki page/blog post on nested 
queries). 

Does eDismax handle parens for order of operation too?  If so, eDismax is 
probably the best/easiest solution, especially if you're trying to parse an 
incoming query from some OTHER format and translate it to something that can be 
sent to Solr, which is what I often do. 

I haven't messed with eDismax myself yet.  Does anyone know if there's any easy 
(easy!) way to get eDismax in a Solr 1.4?  Any easy way to compile an eDismax 
query parser on its own that works with Solr 1.4, and then just drop it into 
your local lib/ for use with an existing Solr 1.4?

Jonathan
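
For reference, with the double quotes in place, such a mixed query might look like this (an illustrative sketch only; the field and keyword names are placeholders):

```
q=_query_:"{!lucene}field1:(keyword1 OR keyword2)" OR _query_:"{!dismax qf='field2 field3'}keyword1 keyword2"
```

The outer query is handled by the default lucene parser, so top-level OR and parentheses behave as usual, while each _query_ clause is handed to the parser named in its local params.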


From: Daniel Pötzinger [daniel.poetzin...@aoemedia.de]
Sent: Thursday, January 27, 2011 9:26 AM
To: solr-user@lucene.apache.org
Subject: AW: DismaxParser Query

It may also be an option to mix the query parsers?
Something like this (not tested):

q={!lucene}field1:test OR field2:test2 _query_:{!dismax qf=fields}+my dismax 
-bad

So you have the benefits of lucene and dismax parser

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, 27 January 2011 15:15
To: solr-user@lucene.apache.org
Subject: Re: DismaxParser Query

What version of Solr are you using, and could you consider either 3x or
applying a patch to 1.4.1? Because eDismax (extended dismax) handles the
full Lucene query language and probably works here. See the Solr
JIRA 1553 at https://issues.apache.org/jira/browse/SOLR-1553

Best
Erick

On Thu, Jan 27, 2011 at 8:32 AM, Isan Fulia isan.fu...@germinait.comwrote:

 It worked by making mm=0 (it acted as OR operator)
 but how to handle this

 field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
 field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
 field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4))




 On 27 January 2011 17:06, lee carroll lee.a.carr...@googlemail.com
 wrote:

  sorry ignore that - we are on dismax here - look at mm param in the docs
  you can set this to achieve what you need
 
  On 27 January 2011 11:34, lee carroll lee.a.carr...@googlemail.com
  wrote:
 
   the default operation can be set in your config to be or or on the
  query
   something like q.op=OR
  
  
  
   On 27 January 2011 11:26, Isan Fulia isan.fu...@germinait.com wrote:
  
   but q=keyword1 keyword2  does AND operation  not OR
  
   On 27 January 2011 16:22, lee carroll lee.a.carr...@googlemail.com
   wrote:
  
use dismax q for first three fields and a filter query for the 4th
 and
   5th
fields
so
q=keyword1 keyword 2
qf = field1,feild2,field3
pf = field1,feild2,field3
mm=something sensible for you
defType=dismax
fq= field4:(keyword3 OR keyword4) AND field5:(keyword5)
   
take a look at the dismax docs for extra params
   
   
   
On 27 January 2011 08:52, Isan Fulia isan.fu...@germinait.com
  wrote:
   
 Hi all,
 The query for standard request handler is as follows
 field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2) OR
 field3:(keyword1 OR keyword2) AND field4:(keyword3 OR keyword4)
 AND
 field5:(keyword5)


 How the same above query can be written for dismax request handler

 --
 Thanks  Regards,
 Isan Fulia.

   
  
  
  
   --
   Thanks  Regards,
   Isan Fulia.
  
  
  
 



 --
 Thanks  Regards,
 Isan Fulia.



Re: Tika config in ExtractingRequestHandler

2011-01-27 Thread Erlend Garåsen


If this configuration file is the same as the tika-mimetypes.xml file 
inside Nutch' conf file, I have an example.


I was trying to implement language detection for Solr and thought I had 
to invoke some Tika functionality by this configuration file in order to 
do so, but found out that I could rewrite some of the 
ExtractingRequestHandler classes instead.


Erlend

On 27.01.11 16.12, Adam Estrada wrote:

I believe that as along as Tika is included in a folder that is
referenced by solrconfig.xml you should be good. Solr will
automatically throw mime types to Tika for parsing. Can anyone else
add to this?

Thanks,
Adam

On Thu, Jan 27, 2011 at 5:06 AM, Erlend Garåsene.f.gara...@usit.uio.no  wrote:


The wiki page for the ExtractingRequestHandler says that I can add the
following configuration:
<str name="tika.config">/my/path/to/tika.config</str>

I have tried to google for an example of such a Tika config file, but
haven't found anything.

Erlend

--
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050




--
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050


Re: Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Erick Erickson
Tokenization is fine with facets, that caution is about, say, faceting
on the tokenized body of a document where you have potentially
a huge number of unique tokens.

But if there is a controlled number of distinct values, you shouldn't have
to do anything except index to a tokenized field. I'd remove stemming,
WordDelimiterFactory, etc though, in fact I'd probably just go with
WhiteSpaceTokenizer and, maybe, LowerCaseFilter.

But if you have a huge number of unique values, it doesn't matter whether
they are tokenized or strings, it'll still be a problem.

One note: when faceting for the first time on a newly-started Solr instance,
the caches are filled and the *first* query will be slower, so measure
subsequent queries.

Best
Erick

On Thu, Jan 27, 2011 at 9:09 AM, Dennis Schafroth den...@indexdata.comwrote:

 Hi,

 Pretty novice into SOLR coding, but looking for hints about how (if not
 already done) to implement a PatternTokenizer, that would index this into
 multivalie fields of solr.StrField for facetting. Ex.

 Water -- Irrigation ; Water -- Sewage

 should be tokenized into

 Water
 Irrigation
 Sewage

 in multi-valued non-tokenized fields due to performance. I could do it from
 the outside, but I would this as a opportunity to learn about SOLR.

 It works as I want with the PatternTokenizerFactory when I am using
 solr.TextField, but not when I am using the non-tokenized solr.StrField. But
 according to reading, facets performance is better on non-tokenized fields.
 We need better performance on our faceted searches on these multi-value
 fields.  (25 million documents, three multi-valued facets)

 I would also need to have a filter that filter out identical values as the
 feeds have redundant data as shown above.

 Can anyone point point me in the right direction..

 cheers,
 :-Dennis
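
For what it's worth, a minimal schema.xml sketch of the kind of analyzer Erick describes (the type name, field name, and pattern here are illustrative, not from the thread):

```xml
<fieldType name="facetText" class="solr.TextField" omitNorms="true">
  <analyzer>
    <!-- split "Water -- Irrigation ; Water -- Sewage" on ";" and "--" -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\s*(?:;|--)\s*"/>
  </analyzer>
</fieldType>

<field name="subject_facet" type="facetText" indexed="true" stored="true" multiValued="true"/>
```

Note that this does not de-duplicate the repeated "Water" token across positions, so the duplicate-filtering part is still easier to handle on the indexing side, as suggested elsewhere in the thread.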


Re: Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Erik Hatcher
Beyond what Erick said, I'll add that it is often better to do this from the 
outside and send in multiple actual end-user displayable facet values.  When 
you send in a field like Water -- Irrigation ; Water -- Sewage, that is what 
will get stored (if you have it set to stored), but what you might rather want 
is each individual value stored, which can only be done by the indexer sending 
in multiple values, not through just tokenization.

Erik

On Jan 27, 2011, at 09:09 , Dennis Schafroth wrote:

 Hi, 
 
 Pretty novice into SOLR coding, but looking for hints about how (if not 
 already done) to implement a PatternTokenizer, that would index this into 
 multivalie fields of solr.StrField for facetting. Ex. 
 
 Water -- Irrigation ; Water -- Sewage
 
 should be tokenized into 
 
 Water
 Irrigation
 Sewage
 
 in multi-valued non-tokenized fields due to performance. I could do it from 
 the outside, but I would this as a opportunity to learn about SOLR.
 
 It works as I want with the PatternTokenizerFactory when I am using 
 solr.TextField, but not when I am using the non-tokenized solr.StrField. But 
 according to reading, facets performance is better on non-tokenized fields. 
 We need better performance on our faceted searches on these multi-value 
 fields.  (25 million documents, three multi-valued facets)
 
 I would also need to have a filter that filter out identical values as the 
 feeds have redundant data as shown above.
 
 Can anyone point point me in the right direction..
 
 cheers, 
 :-Dennis



Re: Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Dennis Schafroth
Thanks for the hints! 

Sorry about stealing the thread query range in multivalued date field 
Mistakenly responded to it. 

cheers,
:-Dennis 

On 27/01/2011, at 16.48, Erik Hatcher wrote:

 Beyond what Erick said, I'll add that it is often better to do this from the 
 outside and send in multiple actual end-user displayable facet values.  When 
 you send in a field like Water -- Irrigation ; Water -- Sewage, that is 
 what will get stored (if you have it set to stored), but what you might 
 rather want is each individual value stored, which can only be done by the 
 indexer sending in multiple values, not through just tokenization.
 
   Erik
 
 On Jan 27, 2011, at 09:09 , Dennis Schafroth wrote:
 
 Hi, 
 
 Pretty novice into SOLR coding, but looking for hints about how (if not 
 already done) to implement a PatternTokenizer, that would index this into 
 multivalie fields of solr.StrField for facetting. Ex. 
 
 Water -- Irrigation ; Water -- Sewage
 
 should be tokenized into 
 
 Water
 Irrigation
 Sewage
 
 in multi-valued non-tokenized fields due to performance. I could do it from 
 the outside, but I would this as a opportunity to learn about SOLR.
 
 It works as I want with the PatternTokenizerFactory when I am using 
 solr.TextField, but not when I am using the non-tokenized solr.StrField. But 
 according to reading, facets performance is better on non-tokenized fields. 
 We need better performance on our faceted searches on these multi-value 
 fields.  (25 million documents, three multi-valued facets)
 
 I would also need to have a filter that filter out identical values as the 
 feeds have redundant data as shown above.
 
 Can anyone point point me in the right direction..
 
 cheers, 
 :-Dennis
 
 



EmbeddedSolr issues

2011-01-27 Thread Karthik Manimaran

Hi,

I am getting the following messages while using EmbeddedSolr to retrieve 
the term vectors. I also happened to go through 
https://issues.apache.org/jira/browse/SOLR-914 . Should I ignore these 
messages and proceed, or should I make any changes?


[#|2011-01-27T11:56:34.593-0500|INFO|glassfish3.0.1|javax.enterprise.system.std.com.sun.enterprise.v3.services.impl|_ThreadId=33|_ThreadName=21687399  
[Finalizer] Error org.apache.solr.core.CoreContainer - CoreContainer was 
not shutdown prior to finalize(), indicates a bug -- POSSIBLE RESOURCE 
LEAK!!!


[#|2011-01-27T11:56:34.609-0500|INFO|glassfish3.0.1|javax.enterprise.system.std.com.sun.enterprise.v3.services.impl|_ThreadId=33|_ThreadName=21687415  
[Finalizer] Error org.apache.solr.core.SolrCore - Too many close 
[count:-1] on org.apache.solr.core.SolrCore@1638e30. Please report this 
exception to solr-user@lucene.apache.org


[#|2011-01-27T11:56:34.611-0500|INFO|glassfish3.0.1|javax.enterprise.system.std.com.sun.enterprise.v3.services.impl|_ThreadId=33|_ThreadName=21687417  
[Finalizer] Error org.apache.solr.core.SolrCore - REFCOUNT ERROR: 
unreferenced org.apache.solr.core.SolrCore@1638e30 (UserIndexCore) has a 
reference count of -1


[#|2011-01-27T11:56:34.613-0500|INFO|glassfish3.0.1|javax.enterprise.system.std.com.sun.enterprise.v3.services.impl|_ThreadId=33|_ThreadName=21687419  
[Finalizer] Error org.apache.solr.common.util.ConcurrentLRUCache - 
ConcurrentLRUCache was not destroyed prior to finalize(), indicates a 
bug -- POSSIBLE RESOURCE LEAK!!!


[#|2011-01-27T11:56:34.613-0500|INFO|glassfish3.0.1|javax.enterprise.system.std.com.sun.enterprise.v3.services.impl|_ThreadId=33|_ThreadName=21687420  
[Finalizer] Error org.apache.solr.common.util.ConcurrentLRUCache - 
ConcurrentLRUCache was not destroyed prior to finalize(), indicates a 
bug -- POSSIBLE RESOURCE LEAK!!!



This is the code I am using:

public static SolrCore USER_IDX_CORE;

public static Map<String, String> getTermsVector(String userId) throws 
ParserConfigurationException, IOException, SAXException {

    Map<String, String> freqMap = new HashMap<String, String>();
    try {
        SolrConfig USER_IDX_SOLR_CONFIG = new SolrConfig(USER_IDX_CONFIG_FILE);
        IndexSchema USER_IDX_SCHEMA = new IndexSchema(USER_IDX_SOLR_CONFIG, USER_IDX_SCHEMA_FILE, null);
        CoreContainer USER_IDX_CONTAINER = new CoreContainer(new SolrResourceLoader(SolrIndexer.USER_IDX_SOLR_HOME));
        CoreDescriptor USER_IDX_CORE_DESCRIPTOR = new CoreDescriptor(USER_IDX_CONTAINER, USER_IDX_CORE_NAME,
                USER_IDX_SOLR_CONFIG.getResourceLoader().getInstanceDir());
        USER_IDX_CORE_DESCRIPTOR.setConfigName(USER_IDX_SOLR_CONFIG.getResourceName());
        USER_IDX_CORE_DESCRIPTOR.setSchemaName(USER_IDX_SCHEMA.getResourceName());
        USER_IDX_CORE = new SolrCore(null, USER_IDX_DATA_DIR, USER_IDX_SOLR_CONFIG, USER_IDX_SCHEMA, USER_IDX_CORE_DESCRIPTOR);
        USER_IDX_CONTAINER.register(USER_IDX_CORE_NAME, USER_IDX_CORE, false);
        SearchComponent tvComp = USER_IDX_CORE.getSearchComponent("tvComponent");

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.add(CommonParams.Q, FIELD_USER_ID + ":" + userId);
        params.add(CommonParams.QT, "tvrh");
        params.add(TermVectorParams.TF, "true");
        params.add(TermVectorComponent.COMPONENT_NAME, "true");
        SolrRequestHandler handler = USER_IDX_CORE.getRequestHandler("tvrh");

        SolrQueryResponse rsp = new SolrQueryResponse();
        rsp.add("responseHeader", new SimpleOrderedMap());
        handler.handleRequest(new LocalSolrQueryRequest(USER_IDX_CORE, params), rsp);

        NamedList terms = (NamedList) ((NamedList) ((NamedList)
                rsp.getValues().get(TermVectorComponent.TERM_VECTORS)).getVal(0)).get(FIELD_USER_ALL);

        if (terms != null) {
            for (int i = 0; i < terms.size(); i++) {
                NamedList freq = (NamedList) terms.getVal(i);
                freqMap.put(terms.getName(i), freq.getVal(0).toString());
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        try {
            USER_IDX_CORE.close();
            USER_IDX_CONTAINER.shutdown();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    return freqMap;
}

Also, USER_IDX_CONTAINER.shutdown(); throws a NullPointerException, 
indicating the reference no longer exists by the time execution 
reaches it.


If I don't use this snippet

try {
    USER_IDX_CORE.close();
    USER_IDX_CONTAINER.shutdown();
} catch (Exception e) {
    e.printStackTrace();
}

I get a similar POSSIBLE RESOURCE LEAK!!! message that says the SolrCore 
wasn't closed.


I am calling this code via a message queue, and no concurrent 

Is relevance score related to position of the term?

2011-01-27 Thread cyang2010

Let me describe the question using an example:

If I search for Lee on the name field as an exact term match,

the returning results can be:

Lee Jamie
Jamie Lee

Will Solr grant a higher score to Lee Jamie vs. Jamie Lee based on the
position of the term in the name field of each document?

From what I know, the score is related to:
1. term frequency
2. idf (inverse document frequency)
3. length norm
4. query norm

It does not seem to take the position of the matched term into account. Is that right?

Thanks in advance.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-relevance-score-related-to-position-of-the-term-tp2363369p2363369.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is relevance score related to position of the term?

2011-01-27 Thread Em

Hi Cyang,

usually Solr isn't looking at the position of a term. However, there are
solutions out there for considering the term's position when calculating a
doc's score.

Furthermore: If two docs got the same score, I think they are ordered the
way they were found in the index.

Does this answer your questions?

Regards
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-relevance-score-related-to-position-of-the-term-tp2363369p2363385.html
Sent from the Solr - User mailing list archive at Nabble.com.
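
For what it's worth, one position-aware building block at the Lucene level is SpanFirstQuery, which only matches terms occurring near the start of a field. It is not exposed by the stock Solr 1.4 query parsers, so this is only a sketch of the idea (field and term are taken from the example above):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanFirstQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class PositionExample {
    // Matches "lee" only when it ends within the first position of the
    // name field, so "Lee Jamie" matches but "Jamie Lee" does not.
    public static SpanFirstQuery nameStartsWithLee() {
        return new SpanFirstQuery(new SpanTermQuery(new Term("name", "lee")), 1);
    }
}
```

Using such a query for boosting rather than filtering would require custom query parser work of the kind discussed elsewhere on this list.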


Re: SolrCloud Questions for MultiCore Setup

2011-01-27 Thread Em

Hi,

excuse me for pushing this for a second time, but I can't figure it out by
looking at the source code...

Thanks!



 Hi Lance, 
 
 thanks for your explanation. 
 
 As far as I know in distributed search i have to tell Solr what other
 shards it has to query. So, if I want to query a specific core, present in
 all my shards, i could tell Solr this by using the shards-param plus
 specified core on each shard. 
 
 Using SolrCloud's distrib=true feature (it sets all the known shards
 automatically?), a collection should consist only of one type of
 core-schema, correct? 
 How does SolrCloud know that shard_x and shard_y are replicas of
 eachother (I took a look at the  possibility to specify alternative shards
 if one is not available)? If it does not know that they are replicas of
 eachother, I should use the syntax of specifying alternative shards for
 failover due to performance-reasons, because querying 2 identical and
 available cores seems to be wasted capacity, no? 
 
 Thank you!
 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Questions-for-MultiCore-Setup-tp2309443p2363396.html
Sent from the Solr - User mailing list archive at Nabble.com.


disappearing MBeans

2011-01-27 Thread matthew sporleder
I am using JMX to monitor my replication status and am finding that my
MBeans are disappearing.  I turned on debugging for JMX and found that
solr seems to be deleting the mbeans.

Is this a bug?  Some trace info is below:

here's me reading the mbean successfully:
Jan 27, 2011 5:00:02 PM ServerCommunicatorAdmin reqIncoming
FINER: Receive a new request.
Jan 27, 2011 5:00:02 PM DefaultMBeanServerInterceptor getAttribute
FINER: Attribute= indexReplicatedAt, obj=
solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:00:02 PM Repository retrieve
FINER: 
name=solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:00:02 PM ServerCommunicatorAdmin reqIncoming
FINER: Finish a request.


a little while later it removes the mbean from the PM Repository
(whatever that is) and then re-adds it:
FINER: Send create notification of object
solr/myapp-core:id=org.apache.solr.handler.component.SearchHandler,type=atlas
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor sendNotification
FINER: JMX.mbean.registered
solr/myapp-core:type=atlas,id=org.apache.solr.handler.component.SearchHandler
Jan 27, 2011 5:16:14 PM Repository contains
FINER: 
name=solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM Repository retrieve
FINER: 
name=solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM Repository remove
FINER: 
name=solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor unregisterMBean
FINER: Send delete notification of object
solr/myapp-core:id=org.apache.solr.handler.ReplicationHandler,type=/replication
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor sendNotification
FINER: JMX.mbean.unregistered
solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor registerMBean
FINER: ObjectName =
solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM Repository addMBean
FINER: 
name=solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor addObject
FINER: Send create notification of object
solr/myapp-core:id=org.apache.solr.handler.ReplicationHandler,type=/replication
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor sendNotification
FINER: JMX.mbean.registered
solr/myapp-core:type=/replication,id=org.apache.solr.handler.ReplicationHandler


And after a tons of messages but still in the same second it does:
Jan 27, 2011 5:16:14 PM Repository contains
FINER: 
name=solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM Repository retrieve
FINER: 
name=solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM Repository remove
FINER: 
name=solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor unregisterMBean
FINER: Send delete notification of object
solr/myapp-core:id=org.apache.solr.handler.ReplicationHandler,type=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor sendNotification
FINER: JMX.mbean.unregistered
solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor registerMBean
FINER: ObjectName =
solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM Repository addMBean
FINER: 
name=solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor addObject
FINER: Send create notification of object
solr/myapp-core:id=org.apache.solr.handler.ReplicationHandler,type=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:14 PM DefaultMBeanServerInterceptor sendNotification
FINER: JMX.mbean.registered
solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler


And then I don't know what this is about but it removes the bean again:
Jan 27, 2011 5:16:15 PM Repository contains
FINER: 
name=solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:15 PM Repository retrieve
FINER: 
name=solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 5:16:15 PM Repository remove
FINER: 
name=solr/myapp-core:type=org.apache.solr.handler.ReplicationHandler,id=org.apache.solr.handler.ReplicationHandler
Jan 27, 2011 

Re: configure httpclient to access solr with user credential on third party host

2011-01-27 Thread Darniz

Thanks, exactly. I asked my domain hosting provider and he provided me with
another port.

I am wondering: can I specify the credentials without the port?

I mean, when I open the browser and type
www.mydomainmame/solr, I get the Tomcat auth login screen.

In the same way, can I configure the HttpClient so that I don't have to
specify the port?

Thanks
darniz
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/configure-httpclient-to-access-solr-with-user-credential-on-third-party-host-tp2360364p2364190.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: DismaxParser Query

2011-01-27 Thread Erick Erickson
In general, patches are applied to the source tree and it's re-compiled.
See: http://wiki.apache.org/solr/HowToContribute#Working_With_Patches

This is pretty easy, and I do know that some people have applied the
eDismax
patch to the 1.4 code line, but I haven't done it myself.

Best
Erick

On Thu, Jan 27, 2011 at 10:27 AM, Jonathan Rochkind rochk...@jhu.eduwrote:

 Yes, I think nested queries are the only way to do that, and yes, nested
 queries like Daniel's example work (I've done it myself).  I haven't really
 tried to get into understanding/demonstrating _exactly_ how the relevance
 ends up working on the overall master query in such a situation, but it sort
 of works.

 (Just note that Daniel's example isn't quite right, I think you need double
 quotes for the nested _query_, just check the wiki page/blog post on nested
 queries).

 Does eDismax handle parens for order of operation too?  If so, eDismax is
 probably the best/easiest solution, especially if you're trying to parse an
 incoming query from some OTHER format and translate it to something that can
 be sent to Solr, which is what I often do.

 I haven't messed with eDismax myself yet.  Does anyone know if there's any
 easy (easy!) way to get eDismax in a Solr 1.4?  Any easy way to compile an
  eDismax query parser on its own that works with Solr 1.4, and then just
 drop it into your local lib/ for use with an existing Solr 1.4?

 Jonathan

 
 From: Daniel Pötzinger [daniel.poetzin...@aoemedia.de]
 Sent: Thursday, January 27, 2011 9:26 AM
 To: solr-user@lucene.apache.org
 Subject: AW: DismaxParser Query

 It may also be an option to mix the query parsers?
 Something like this (not tested):

 q={!lucene}field1:test OR field2:test2 _query_:{!dismax qf=fields}+my
 dismax -bad

 So you have the benefits of lucene and dismax parser

 -Ursprüngliche Nachricht-
 Von: Erick Erickson [mailto:erickerick...@gmail.com]
 Gesendet: Donnerstag, 27. Januar 2011 15:15
 An: solr-user@lucene.apache.org
 Betreff: Re: DismaxParser Query

 What version of Solr are you using, and could you consider either 3x or
 applying a patch to 1.4.1? Because eDismax (extended dismax) handles the
 full Lucene query language and probably works here. See the Solr
 JIRA 1553 at https://issues.apache.org/jira/browse/SOLR-1553

 Best
 Erick

 On Thu, Jan 27, 2011 at 8:32 AM, Isan Fulia isan.fu...@germinait.com
 wrote:

  It worked by making mm=0 (it acted as OR operator)
  but how to handle this
 
  field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
  field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
  field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4))
 
 
 
 
  On 27 January 2011 17:06, lee carroll lee.a.carr...@googlemail.com
  wrote:
 
   sorry ignore that - we are on dismax here - look at mm param in the
 docs
   you can set this to achieve what you need
  
   On 27 January 2011 11:34, lee carroll lee.a.carr...@googlemail.com
   wrote:
  
the default operator can be set in your config to be OR, or set on the
   query with
something like q.op=OR
   
   
   
On 27 January 2011 11:26, Isan Fulia isan.fu...@germinait.com
 wrote:
   
but q=keyword1 keyword2  does AND operation  not OR
   
On 27 January 2011 16:22, lee carroll lee.a.carr...@googlemail.com
 
wrote:
   
 use dismax q for first three fields and a filter query for the 4th
  and
5th
 fields
 so
 q=keyword1 keyword 2
 qf = field1,feild2,field3
 pf = field1,feild2,field3
 mm=something sensible for you
 defType=dismax
 fq= field4:(keyword3 OR keyword4) AND field5:(keyword5)

 take a look at the dismax docs for extra params



 On 27 January 2011 08:52, Isan Fulia isan.fu...@germinait.com
   wrote:

  Hi all,
  The query for standard request handler is as follows
  field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2)
 OR
  field3:(keyword1 OR keyword2) AND field4:(keyword3 OR keyword4)
  AND
  field5:(keyword5)
 
 
  How the same above query can be written for dismax request
 handler
 
  --
  Thanks  Regards,
  Isan Fulia.
 

   
   
   
--
Thanks  Regards,
Isan Fulia.
   
   
   
  
 
 
 
  --
  Thanks  Regards,
  Isan Fulia.
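
Daniel's mixed-parser idea from the thread above, with the double quotes Jonathan mentions around the nested clause, can be assembled like this (a sketch; the host, core, and field names are hypothetical):

```python
from urllib.parse import urlencode

# Nested query: the lucene parser handles the outer clauses, dismax the inner one.
# The {!dismax ...} clause must be wrapped in double quotes inside _query_:.
q = 'field1:test OR field2:test2 _query_:"{!dismax qf=field3}+my dismax -bad"'

params = {"q": q, "defType": "lucene", "wt": "json"}
url = "http://localhost:8983/solr/select?" + urlencode(params)

assert "_query_" in url            # underscores survive URL encoding
assert "%22%7B%21dismax" in url    # the quoted {!dismax ...} clause, percent-encoded
```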
 



Re: Is relevance score related to position of the term?

2011-01-27 Thread cyang2010

Hi Em,

Thanks for reply.

Basically you are saying there is no built-in solution that takes the
position of the term into account for the relevancy score.  In my scenario, I
will get those two documents with the same score.  The order depends on the
sequence of indexing.

Thanks,



Cyang
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-relevance-score-related-to-position-of-the-term-tp2363369p2364427.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is relevance score related to position of the term?

2011-01-27 Thread cyang2010

Just a little clarification: when I say position of the term, I mean the
position of the term within the field.

For example, 

Jamie Lee  -- Lee is the second position of the name field.

Lee Jamie  -- Lee is the first position of the name field in this case.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-relevance-score-related-to-position-of-the-term-tp2363369p2364431.html
Sent from the Solr - User mailing list archive at Nabble.com.


Searching for negative numbers very slow

2011-01-27 Thread Simon Wistow
If I do 

qt=dismax
fq=uid:1

(or any other positive number) then queries are as quick as normal - in 
the 20ms range. 

However, any of

fq=uid:\-1

or

fq=uid:[* TO -1]

or 
   
fq=uid:[-1 to -1]

or

fq=-uid:[0 TO *]

then queries are incredibly slow - in the 9 *second* range.

Anything I can do to mitigate this? Negative numbers have significant 
meaning in our system so it wouldn't be trivial to shift all uids up by 
the number of negative ids.


Thanks, 

Simon




Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene

2011-01-27 Thread Simon Wistow
On Tue, Jan 25, 2011 at 01:28:16PM +0100, Markus Jelsma said:
 Are you sure you need CMS incremental mode? It's only advised when running on 
 a machine with one or two processors. If you have more you should consider 
 disabling the incremental flags.

I'll test again, but we added those to get better performance; not much,
but there did seem to be an improvement.

The problem does not seem to be with average use, but that occasionally
there's a huge spike in load (there doesn't seem to be a particular killer
query) and Solr just never recovers.

Thanks,

Simon




Re: Import Handler for tokenizing facet string into multi-valued solr.StrField..

2011-01-27 Thread Chris Hostetter

: Subject: Import Handler for tokenizing facet string into multi-valued
: solr.StrField.. 
: In-Reply-To: 1296123345064-2361292.p...@n3.nabble.com
: References: 1296123345064-2361292.p...@n3.nabble.com


-Hoss


Re: DIH clean=false

2011-01-27 Thread Chris Hostetter

: Then for clean=false, my understanding is that it won't blow off existing
: index.   For data that exist in index and db table (by the same uniqueKey)
: it will update the index data regardless if there is actual field update. 
: For existing index data but not existing in table (by comparing uniqueKey),

If clean=false, the documents from your DB are indexed -- if you have a 
uniqueKey field, then docs with the same uniqueKey as an existing doc will 
overwrite the existing doc, but nothing will be deleted (so documents you 
removed from your DB will still live on in your index).

clean=true is just another way of saying "delete all docs from the index 
before doing this import".


-Hoss
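
The two behaviours Hoss describes map onto the clean parameter of the DIH full-import command; a small sketch of the request URLs (the host and handler path are hypothetical):

```python
from urllib.parse import urlencode

base = "http://localhost:8983/solr/dataimport"  # hypothetical DIH handler path

# clean=true (the default for full-import): delete everything first
full = base + "?" + urlencode({"command": "full-import", "clean": "true"})

# clean=false: add/overwrite by uniqueKey, delete nothing
incremental = base + "?" + urlencode({"command": "full-import", "clean": "false"})

assert full.endswith("clean=true")
assert incremental.endswith("clean=false")
```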


Solr for noSQL

2011-01-27 Thread Jianbin Dai
Hi,

 

Is there a data import handler to quickly read in data from a NoSQL database,
specifically MongoDB, which I am thinking of using?

Or, a more general question: how does Solr work with a NoSQL database?

Thanks.

 

Jianbin

 



Re: Searching for negative numbers very slow

2011-01-27 Thread Simon Wistow
On Thu, Jan 27, 2011 at 11:32:26PM +, me said:
 If I do 
 
   qt=dismax
 fq=uid:1
 
 (or any other positive number) then queries are as quick as normal - in 
 the 20ms range. 

For what it's worth uid is a TrieIntField with precisionStep=0,
omitNorms=true, positionIncrementGap=0




Re: configure httpclient to access solr with user credential on third party host

2011-01-27 Thread Jayendra Patil
This should help (Commons HttpClient 3.x):

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.UsernamePasswordCredentials;
import org.apache.commons.httpclient.auth.AuthScope;

HttpClient client = new HttpClient();
client.getParams().setAuthenticationPreemptive(true);
AuthScope scope = new AuthScope(AuthScope.ANY_HOST, AuthScope.ANY_PORT);
client.getState().setCredentials(scope,
        new UsernamePasswordCredentials(user, password));

Regards,
Jayendra

On Thu, Jan 27, 2011 at 4:47 PM, Darniz rnizamud...@edmunds.com wrote:

 Thanks, exactly. I asked my domain hosting provider and he provided me with
 another port.

 I am wondering: can I specify the credentials without the port?

 I mean, when I open the browser and type
 www.mydomainmame/solr, I get the Tomcat auth login screen.

 In the same way, can I configure the HttpClient so that I don't have to
 specify the port?

 Thanks
 darniz
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/configure-httpclient-to-access-solr-with-user-credential-on-third-party-host-tp2360364p2364190.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Tika config in ExtractingRequestHandler

2011-01-27 Thread Lance Norskog
The tika.config file is obsolete. I don't know what replaces it.

On 1/27/11, Erlend Garåsen e.f.gara...@usit.uio.no wrote:

 If this configuration file is the same as the tika-mimetypes.xml file
 inside Nutch's conf directory, I have an example.

 I was trying to implement language detection for Solr and thought I had
 to invoke some Tika functionality by this configuration file in order to
 do so, but found out that I could rewrite some of the
 ExtractingRequestHandler classes instead.

 Erlend

 On 27.01.11 16.12, Adam Estrada wrote:
 I believe that as long as Tika is included in a folder that is
 referenced by solrconfig.xml you should be good. Solr will
 automatically throw mime types to Tika for parsing. Can anyone else
 add to this?

 Thanks,
 Adam

 On Thu, Jan 27, 2011 at 5:06 AM, Erlend Garåsene.f.gara...@usit.uio.no
 wrote:

 The wiki page for the ExtractingRequestHandler says that I can add the
 following configuration:
 str name=tika.config/my/path/to/tika.config/str

 I have tried to google for an example of such a Tika config file, but
 haven't found anything.

 Erlend

 --
 Erlend Garåsen
 Center for Information Technology Services
 University of Oslo
 P.O. Box 1086 Blindern, N-0317 OSLO, Norway
 Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP:
 31050



 --
 Erlend Garåsen
 Center for Information Technology Services
 University of Oslo
 P.O. Box 1086 Blindern, N-0317 OSLO, Norway
 Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050



-- 
Lance Norskog
goks...@gmail.com


Re: Solr for noSQL

2011-01-27 Thread Lance Norskog
There are no special connectors available to read from key-value
stores like memcache/Cassandra/MongoDB. You would have to get a Java
client library for the DB and code your own DataImportHandler
datasource.  I cannot recommend this; you should write your own program
to read the data and upload it to Solr with one of the Solr client libraries.

Lance

On 1/27/11, Jianbin Dai j...@huawei.com wrote:
 Hi,



 Do we have data import handler to fast read in data from noSQL database,
 specifically, MongoDB I am thinking to use?

 Or a more general question, how does Solr work with noSQL database?

 Thanks.



 Jianbin






-- 
Lance Norskog
goks...@gmail.com
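
A minimal sketch of the "own program" approach Lance describes: read from the store, then POST a Solr XML add message to the update handler. The host and schema are hypothetical, and the actual MongoDB read is stubbed out:

```python
import urllib.request
from xml.sax.saxutils import escape

def to_add_xml(docs):
    """Build a Solr XML <add> message from a list of field dicts."""
    def doc_xml(d):
        return "<doc>" + "".join(
            '<field name="%s">%s</field>' % (k, escape(str(v))) for k, v in d.items()
        ) + "</doc>"
    return "<add>" + "".join(doc_xml(d) for d in docs) + "</add>"

docs = [{"id": 1, "name": "example"}]  # stand-in for rows read from MongoDB
req = urllib.request.Request(
    "http://localhost:8983/solr/update?commit=true",  # hypothetical host
    data=to_add_xml(docs).encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8"},
)
# urllib.request.urlopen(req)  # uncomment against a running Solr
```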


Re: SolrCloud Questions for MultiCore Setup

2011-01-27 Thread Lance Norskog
Hello-

I have not used SolrCloud.

On 1/27/11, Em mailformailingli...@yahoo.de wrote:

 Hi,

 excuse me for pushing this for a second time, but I can't figure it out by
 looking at the source code...

 Thanks!



 Hi Lance,

 thanks for your explanation.

 As far as I know in distributed search i have to tell Solr what other
 shards it has to query. So, if I want to query a specific core, present in
 all my shards, i could tell Solr this by using the shards-param plus
 specified core on each shard.

 Using SolrCloud's distrib=true feature (it sets all the known shards
 automatically?), a collection should consist only of one type of
 core-schema, correct?
 How does SolrCloud know that shard_x and shard_y are replicas of
 eachother (I took a look at the  possibility to specify alternative shards
 if one is not available)? If it does not know that they are replicas of
 eachother, I should use the syntax of specifying alternative shards for
 failover due to performance-reasons, because querying 2 identical and
 available cores seems to be wasted capacity, no?

 Thank you!

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/SolrCloud-Questions-for-MultiCore-Setup-tp2309443p2363396.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Lance Norskog
goks...@gmail.com


Re: DismaxParser Query

2011-01-27 Thread Isan Fulia
Hi all,
I am currently using Solr 1.4.1. Do I need to apply a patch for the extended
dismax parser?

On 28 January 2011 03:42, Erick Erickson erickerick...@gmail.com wrote:

 In general, patches are applied to the source tree and it's re-compiled.
 See: http://wiki.apache.org/solr/HowToContribute#Working_With_Patches

 This is pretty easy, and I do know that some people have applied the
 eDismax
 patch to the 1.4 code line, but I haven't done it myself.

 Best
 Erick

 On Thu, Jan 27, 2011 at 10:27 AM, Jonathan Rochkind rochk...@jhu.edu
 wrote:

  Yes, I think nested queries are the only way to do that, and yes, nested
  queries like Daniel's example work (I've done it myself).  I haven't
 really
  tried to get into understanding/demonstrating _exactly_ how the relevance
  ends up working on the overall master query in such a situation, but it
 sort
  of works.
 
  (Just note that Daniel's example isn't quite right, I think you need
 double
  quotes for the nested _query_, just check the wiki page/blog post on
 nested
  queries).
 
  Does eDismax handle parens for order of operation too?  If so, eDismax is
  probably the best/easiest solution, especially if you're trying to parse
 an
  incoming query from some OTHER format and translate it to something that
 can
  be sent to Solr, which is what I often do.
 
  I haven't messed with eDismax myself yet.  Does anyone know if there's
 any
  easy (easy!) way to get eDismax in a Solr 1.4?  Any easy way to compile
 an
  eDismax query parser on its own that works with Solr 1.4, and then just
  drop it into your local lib/ for use with an existing Solr 1.4?
 
  Jonathan
 
  
  From: Daniel Pötzinger [daniel.poetzin...@aoemedia.de]
  Sent: Thursday, January 27, 2011 9:26 AM
  To: solr-user@lucene.apache.org
  Subject: AW: DismaxParser Query
 
  It may also be an option to mix the query parsers?
  Something like this (not tested):
 
  q={!lucene}field1:test OR field2:test2 _query_:{!dismax qf=fields}+my
  dismax -bad
 
  So you have the benefits of lucene and dismax parser
 
  -Ursprüngliche Nachricht-
  Von: Erick Erickson [mailto:erickerick...@gmail.com]
  Gesendet: Donnerstag, 27. Januar 2011 15:15
  An: solr-user@lucene.apache.org
  Betreff: Re: DismaxParser Query
 
  What version of Solr are you using, and could you consider either 3x or
  applying a patch to 1.4.1? Because eDismax (extended dismax) handles the
  full Lucene query language and probably works here. See the Solr
  JIRA 1553 at https://issues.apache.org/jira/browse/SOLR-1553
 
  Best
  Erick
 
  On Thu, Jan 27, 2011 at 8:32 AM, Isan Fulia isan.fu...@germinait.com
  wrote:
 
   It worked by making mm=0 (it acted as OR operator)
   but how to handle this
  
   field1:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
   field2:((keyword1 AND keyword2) OR (keyword3 AND keyword4)) OR
   field3:((keyword1 AND keyword2) OR (keyword3 AND keyword4))
  
  
  
  
   On 27 January 2011 17:06, lee carroll lee.a.carr...@googlemail.com
   wrote:
  
sorry ignore that - we are on dismax here - look at mm param in the
  docs
you can set this to achieve what you need
   
On 27 January 2011 11:34, lee carroll lee.a.carr...@googlemail.com
wrote:
   
 the default operation can be set in your config to be or or on
 the
query
 something like q.op=OR



 On 27 January 2011 11:26, Isan Fulia isan.fu...@germinait.com
  wrote:

 but q=keyword1 keyword2  does AND operation  not OR

 On 27 January 2011 16:22, lee carroll 
 lee.a.carr...@googlemail.com
  
 wrote:

  use dismax q for first three fields and a filter query for the
 4th
   and
 5th
  fields
  so
  q=keyword1 keyword 2
  qf = field1,feild2,field3
  pf = field1,feild2,field3
  mm=something sensible for you
  defType=dismax
  fq= field4:(keyword3 OR keyword4) AND field5:(keyword5)
 
  take a look at the dismax docs for extra params
 
 
 
  On 27 January 2011 08:52, Isan Fulia isan.fu...@germinait.com
wrote:
 
   Hi all,
   The query for standard request handler is as follows
   field1:(keyword1 OR keyword2) OR field2:(keyword1 OR keyword2)
  OR
   field3:(keyword1 OR keyword2) AND field4:(keyword3 OR
 keyword4)
   AND
   field5:(keyword5)
  
  
   How the same above query can be written for dismax request
  handler
  
   --
   Thanks  Regards,
   Isan Fulia.
  
 



 --
 Thanks  Regards,
 Isan Fulia.



   
  
  
  
   --
   Thanks  Regards,
   Isan Fulia.
  
 




-- 
Thanks  Regards,
Isan Fulia.


Re: Solr for noSQL

2011-01-27 Thread Dennis Gearon
Why not make one's own DIH handler, Lance?

 Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Lance Norskog goks...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thu, January 27, 2011 9:33:25 PM
Subject: Re: Solr for noSQL

There no special connectors available to read from the key-value
stores like memcache/cassandra/mongodb. You would have to get a Java
client library for the DB and code your own dataimporthandler
datasource.  I cannot recommend this; you should make your own program
to read data and upload to Solr with one of the Solr client libraries.

Lance

On 1/27/11, Jianbin Dai j...@huawei.com wrote:
 Hi,



 Do we have data import handler to fast read in data from noSQL database,
 specifically, MongoDB I am thinking to use?

 Or a more general question, how does Solr work with noSQL database?

 Thanks.



 Jianbin






-- 
Lance Norskog
goks...@gmail.com



Re: Solr for noSQL

2011-01-27 Thread Dai Jianbin 00901725
Do we have performance measurements? Would it be much slower compared to other 
DIHs?


 There no special connectors available to read from the key-value
 stores like memcache/cassandra/mongodb. You would have to get a Java
 client library for the DB and code your own dataimporthandler
 datasource.  I cannot recommend this; you should make your own program
 to read data and upload to Solr with one of the Solr client libraries.
 
 Lance
 
 On 1/27/11, Jianbin Dai j...@huawei.com wrote:
  Hi,
 
 
 
  Do we have data import handler to fast read in data from noSQL 
 database, specifically, MongoDB I am thinking to use?
 
  Or a more general question, how does Solr work with noSQL database?
 
  Thanks.
 
 
 
  Jianbin
 
 
 
 
 
 
 -- 
 Lance Norskog
 goks...@gmail.com
 


NOT operator not working

2011-01-27 Thread abhayd

I have a field in the xml file: <DeviceType>Accessory Data / Memory</DeviceType>
The solr schema field is declared as:
<field name="deviceType" type="text" indexed="true" stored="true" />

I am trying to eliminate results by using NOT. For example, I want all
devices for a term except where DeviceType is Accessory*.

So here is what I'm trying:
/solr/select?indent=on&version=2.2&q=(sharp+AND+-deviceType:Access*)&qt=dismax&wt=standard

But for some reason it's giving me all results for "sharp" irrespective of
the deviceType.

It works fine with fq=-deviceType:Accessory, but due to some other
application constraint we want to use
q=(sharp+AND+-deviceType:Access*)

Any thoughts on what I'm doing wrong?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/NOT-operator-not-working-tp2365831p2365831.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: NOT operator not working

2011-01-27 Thread Ahmet Arslan


--- On Fri, 1/28/11, abhayd ajdabhol...@hotmail.com wrote:

 From: abhayd ajdabhol...@hotmail.com
 Subject: NOT operator not working
 To: solr-user@lucene.apache.org
 Date: Friday, January 28, 2011, 8:45 AM
 
 i have a field in the xml file: <DeviceType>Accessory Data / Memory</DeviceType>
 The solr schema field is declared as:
 <field name="deviceType" type="text" indexed="true" stored="true" />
 
 I am trying to eliminate results by using NOT. For example
 I want all
 devices for a term except where DeviceType is
 Accessory*
 
 SO here is what i m trying
 /solr/select?indent=on&version=2.2&q=(sharp+AND+-deviceType:Access*)&qt=dismax&wt=standard
 
 But for some reason its giving me all results for sharp
 irrespective of
 what devicetype is
 
 It works fine with fq=-deviceType:Accessory but due to some
 other
 application constraint we want to use 
 q=(sharp+AND+-deviceType:Access*)

Wildcard queries are not analyzed. For example, if you have a lowercase filter 
at index time, you should lowercase your query manually:
instead of fq=-deviceType:Access* you should use fq=-deviceType:access*
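
Since the analyzer is skipped for wildcard terms, the lowercasing has to happen client-side before the query string is built; a tiny sketch:

```python
def normalize_wildcard(term):
    # Wildcard terms bypass index-time analysis (e.g. LowerCaseFilter),
    # so lowercase them ourselves; '*' and '?' are unaffected by lower().
    return term.lower()

fq = "-deviceType:" + normalize_wildcard("Access*")
assert fq == "-deviceType:access*"
```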





Re: Solr for noSQL

2011-01-27 Thread Gora Mohanty
On Fri, Jan 28, 2011 at 6:00 AM, Jianbin Dai j...@huawei.com wrote:
[...]
 Do we have data import handler to fast read in data from noSQL database,
 specifically, MongoDB I am thinking to use?
[...]

Have you tried the links that a Google search turns up? Some of
them look like pretty good prospects.

Regards,
Gora


Re: Is relevance score related to position of the term?

2011-01-27 Thread Em

Hi,

No, you misunderstood me; I only said that Solr does not care about the
positions *usually*.

Lucene has SpanNearQuery, which considers the position of the query's terms
relative to each other.
Furthermore, there is SpanFirstQuery, which can be used to boost occurrences
of a term at the beginning of a field.

Unfortunately, I do not know whether they are already exposed as a
Solr feature or not.

Perhaps you will need to write your own QueryParserPlugin to make use of
them for your use case.

However, parsers like DisMax do not care whether the matched term is at the
beginning of the field or not.
But you can specify a slop between the terms of phrase queries for boosting;
have a look at the DisMax page on the wiki.

Regards

cyang2010 wrote:
 
 Just a little clarification, when i say position of the term, i mean the
 position of the term within the field.
 
 For example, 
 
 Jamie Lee  -- Lee is the second position of the name field.
 
 Lee Jamie  -- Lee is the first position of the name field in this case.
 
 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-relevance-score-related-to-position-of-the-term-tp2363369p2365863.html
Sent from the Solr - User mailing list archive at Nabble.com.
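
Em's last point about phrase slop corresponds to the dismax pf (phrase fields) and ps (phrase slop) parameters; a sketch of such a request (the host and field names are hypothetical):

```python
from urllib.parse import urlencode

params = {
    "defType": "dismax",
    "q": "jamie lee",
    "qf": "name",      # fields the individual terms must match
    "pf": "name^10",   # boost docs where the terms also appear as a phrase
    "ps": "1",         # allow up to one position of slop inside that phrase
}
url = "http://localhost:8983/solr/select?" + urlencode(params)

assert "pf=name%5E10" in url and "ps=1" in url
```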