Re: Question on wildcard queries, filters, scoring and TooManyClauses exception

2013-08-16 Thread Ian Lea
I can't explain all of it and 3.0 is way old ... you might like to
think about upgrading.

However in your first snippet you don't need the query AND the filter.
 Either one will suffice.  In some circumstances, as you say, filters
are preferable but queries and filters are often interchangeable.

On the rest of it, I don't know what is going on.  What java classes
are you getting back from QueryParser?  Giving it a variable name of
prefixQuery doesn't make it so - what does
prefixQuery.getClass().getName() say?


--
Ian.


On Thu, Aug 15, 2013 at 6:43 PM, Bill Chesky wrote:
> Hello,
>
> I know this is a perennial question here because I've spent a lot of time 
> searching for an answer.  I've seen the discussions about the TooManyClauses 
> exception and I understand generally why you get it.  I see lots of 
> discussion about using filters to avoid it but I still can't get it to work.  
> I think I'm just missing something fundamental.
>
> I'm using Lucene 3.0.
>
> I'm trying to do prefix queries on an index.  I figured there might be times 
> where I might run into the TooManyClauses exception so from reading 
> discussions on the issue I figured I should use a filter.  I found the 
> PrefixFilter class and began experimenting with it.  E.g. this works:
>
> QueryParser queryParser = new QueryParser(Version.LUCENE_30, "my_field",
>     new StandardAnalyzer(Version.LUCENE_30));
> Query prefixQuery = queryParser.parse("t*");
> PrefixFilter prefixFilter = new PrefixFilter(new Term("my_field", "t"));
> indexSearcher.search(prefixQuery, prefixFilter, collector);
>
> This returns about 5000 hits on my index.
>
> But then I discovered that it works just as well without the filter:
>
> QueryParser queryParser = new QueryParser(Version.LUCENE_30, "my_field",
>     new StandardAnalyzer(Version.LUCENE_30));
> Query prefixQuery = queryParser.parse("t*");
> indexSearcher.search(prefixQuery, collector);
>
> Why, I don't know.  Seems like this would get expanded out into 5000 
> BooleanQueries and since my max clause count is still set to the default 1024 
> I should get the exception.  But I didn't.  So maybe I don't need the filter 
> after all?
>
> Next, I need scoring to work.  I read that with wildcard queries all scores 
> are set to 1.0 by default.  But I read you can use the 
> QueryParser.setMultiTermRewriteMethod() method to take scoring into account 
> again.  So I tried:
>
> QueryParser queryParser = new QueryParser(Version.LUCENE_30, "my_field",
>     new StandardAnalyzer(Version.LUCENE_30));
> queryParser.setMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
> Query prefixQuery = queryParser.parse("t*");
> indexSearcher.search(prefixQuery, collector);
>
> Now, I get the TooManyClauses exception.
>
> I tried adding the PrefixFilter back in but with no luck.  Still get the 
> exception.
>
> Again, sorry if this has been discussed before.  Just not seeing an answer to 
> this after much searching and I just don't understand what is going on here.  
> Any help appreciated.  Links welcome.
>
> Bill
>

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



RE: Question on wildcard queries, filters, scoring and TooManyClauses exception

2013-08-16 Thread Bill Chesky
Thanks for the reply, Ian.

> I can't explain all of it and 3.0 is way old ... you might like to
> think about upgrading.

Yes, I agree but since there's a significant code base in place, it's a bigger 
project than I can take on at the moment.

> However in your first snippet you don't need the query AND the filter.
> Either one will suffice.  In some circumstances, as you say, filters
> are preferable but queries and filters are often interchangeable.

Yeah, that occurred to me too.  But how do I use only the filter?  The 
IndexSearcher class (in the version of Lucene I'm using anyway) does not have a 
search method that takes only a Filter.  The closest it has is:

IndexSearcher.search(Weight, Filter, Collector)

The documentation on the Weight parameter is kind of sparse.  In the javadoc 
for that method it just says:

weight - to match documents

Looking at the Weight class, it says to instantiate a Weight instance with 
Query.createWeight(Searcher).  So now I'm back to having to have a query again. 
 

> On the rest of it, I don't know what is going on.  What java classes
> are you getting back from QueryParser?  Giving it a variable name of
> prefixQuery doesn't make it so - what does
> prefixQuery.getClass().getName() say?

It's definitely returning a PrefixQuery.  I checked that early on.

thanks again,

Bill




Wrong documents in results

2013-08-16 Thread Maksym Krasovskiy
Hi!
I have documents with two fields, id and name. I create the index with this code:
Document doc = new Document();
doc.add(new TextField("id", id, Store.YES));
doc.add(new TextField("name", QueryParser.escape(name), Store.YES));
indexWriter.addDocument(doc);

When I search with this code:
QueryParser qp = new QueryParser(LUCENE_VERSION, "name",
    new WhitespaceAnalyzer(LUCENE_VERSION));
getIndexSearcher().search(qp.parse("id:( 134586 or  134583 )"), 10);
I get only 2 results, as expected.

But when I search with the query:
(name:test) and id:( 134586 or  134583 )
I get many results, although I expect only documents with id = 134586 or 134583 
that have test in the name field. Why does Lucene add documents to the search 
results that do not match the search criteria?


--
Krasovskiy Maxim


Re: Wrong documents in results

2013-08-16 Thread Ian Lea
and != AND?  
http://lucene.apache.org/core/4_4_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#AND

It works for "or" rather than "OR" because OR is the default operator, so
the lowercase word is just parsed as another optional term.  If you
had a doc with id="or" you'd find that too, I think.
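
Since the operators are case-sensitive, a cheap sanity check before parsing
is to look for lowercase and/or/not tokens in the query string.  A rough
sketch in plain Java, assuming a hypothetical helper called
suspiciousOperators (it is not part of Lucene, and the whitespace and
parenthesis split only approximates the parser's grammar):

```java
import java.util.ArrayList;
import java.util.List;

public class OperatorCheck {

    /**
     * Returns the lowercase tokens that look like intended boolean operators.
     * Hypothetical helper for illustration; a real term "or" would also be flagged.
     */
    static List<String> suspiciousOperators(String query) {
        List<String> found = new ArrayList<>();
        // Rough token split on whitespace and parentheses, not Lucene's real grammar.
        for (String token : query.split("[\\s()]+")) {
            if (token.equals("and") || token.equals("or") || token.equals("not")) {
                found.add(token);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [and, or]
        System.out.println(suspiciousOperators("(name:test) and id:( 134586 or  134583 )"));
        // prints []
        System.out.println(suspiciousOperators("(name:test) AND id:(134586 OR 134583)"));
    }
}
```

The second query string, with AND and OR in upper case, is the form the
classic query parser treats as boolean operators.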

It looks odd to be escaping the value when you are storing it.  That
may not be necessary, but if it's what you want, fine.


--
Ian.





SPI class of type org.apache.lucene.codecs.Codec error

2013-08-16 Thread Amal Kammoun
Hi,

We are working on a project which uses Lucene 4.2.1 and are facing a
"java.lang.ExceptionInInitializerError". We are using Maven to assemble the
project, and we have a dependency between two projects. When we run the test
within Eclipse it works fine. However, when we incorporate our jar in a
client that is tested outside Eclipse, we get the
java.lang.ExceptionInInitializerError.
We have been trying workarounds since yesterday, and we get the same issue
with both Lucene 4.2.0 and 4.2.1.

Have you ever experienced such an issue with maven? Are the newer Lucene
versions safer from such an issue?

Here is the rest of the error message:
Caused by: java.lang.IllegalArgumentException: A SPI class of type
org.apache.lucene.codecs.Codec with name 'Lucene42' does not exist. You
need to add the corresponding JAR file supporting this SPI to your
classpath.The current classpath supports the following names.

Thank you a lot in advance for your support.
Best regards,


RE: SPI class of type org.apache.lucene.codecs.Codec error

2013-08-16 Thread Uwe Schindler
Hi,

Maven makes it even simpler to handle this! The problem may be the following 
(I am not sure, because I don't know your setup):
It seems that you are using the Maven Shade Plugin to merge all JAR files into 
one big JAR file. During this step, the data in your JAR files may not be 
merged correctly. Lucene JARs also contain metadata and other resources (in 
addition to class files) in their META-INF folders, and not all merge tools 
handle those. They must be copied, and if several META-INF/services files 
with the same name exist, their contents must be merged. The 
Maven-Shade-Plugin can do this for you, see:

http://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html
Especially: 
http://maven.apache.org/plugins/maven-shade-plugin/examples/resource-transformers.html#ServicesResourceTransformer

It is recommended to use the ServicesResourceTransformer option.
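
To make the merging requirement concrete: when two JARs each ship a 
META-INF/services file with the same name, the merged JAR needs the provider 
lines of both, not just whichever file was copied last. The following is only 
a sketch of that idea in plain Java, not the plugin's actual implementation; 
the class ServiceFileMerge and the provider names example.a.CodecA and 
example.b.CodecB are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ServiceFileMerge {

    /**
     * Combines the provider lines of several same-named service files,
     * skipping blank lines, comment lines, and duplicates.
     */
    static Set<String> merge(List<List<String>> files) {
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> lines : files) {
            for (String line : lines) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                    merged.add(trimmed);
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical contents of the same service file found in two different JARs.
        List<String> fromFirstJar = Arrays.asList("# providers in JAR 1", "example.a.CodecA");
        List<String> fromSecondJar = Arrays.asList("example.b.CodecB", "", "example.a.CodecA");
        // prints [example.a.CodecA, example.b.CodecB]
        System.out.println(merge(Arrays.asList(fromFirstJar, fromSecondJar)));
    }
}
```

If a merge tool simply overwrites one file with the other, or drops the 
folder, the lookup by name has nothing to find at runtime.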

Ideally, you should not change, transform, or merge the Lucene JAR files at 
all; just ship them with your project as is. Please keep them separate and 
merge them only for special use cases such as autostarting double-click JAR 
files; otherwise management gets crazy.

In any case, please check your classpath:
- Are the *unmodified* lucene-core.jar files in it?
- Don't use crazy classloader hierarchies. Keep all Lucene code together in one 
classloader (so don't place the Lucene JAR files outside your webapp while the 
code using Lucene is inside the webapp).
- If you create uber-JARs (which is a bad idea in general), use the Maven-Shade 
plugin and configure it correctly. The uber-JAR file must contain a 
"META-INF/services" folder with some org.apache.lucene.codecs.* files.

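One way to run the classpath check above is to list which provider names are 
visible under META-INF/services at runtime. A minimal sketch using only the 
JDK; the class name SpiCheck is made up, and it assumes the thread context 
class loader is the relevant one, which may differ in your deployment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

public class SpiCheck {

    /** Collects the provider lines of every META-INF/services/<serviceName> file on the classpath. */
    static List<String> providers(String serviceName) {
        List<String> names = new ArrayList<>();
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        try {
            Enumeration<URL> urls = cl.getResources("META-INF/services/" + serviceName);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        line = line.trim();
                        // Skip blank lines and whole-line comments.
                        if (!line.isEmpty() && !line.startsWith("#")) {
                            names.add(line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new IllegalStateException("Could not read service files", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // The service name is taken from the error message in this thread.
        System.out.println(providers("org.apache.lucene.codecs.Codec"));
    }
}
```

Run it with the same classpath as the failing client. An empty list suggests 
the service file did not make it into the JAR.
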
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: SPI class of type org.apache.lucene.codecs.Codec error

2013-08-16 Thread Amal Kammoun
Thank you,

We are using Eclipse under Linux, and Java 1.7. Maven Shade is used for
assembling the project (P1), which depends on another project that uses
Lucene (P2).  P2 uses lucene.core, lucene.queryparser, and
lucene.analyzercommon.
Please find enclosed a screenshot of the services of the two JARs.

We used previous versions of Lucene (2.x) with the same assembly
process without issue.

Hope this helps.

best regards,
Amal




RE: SPI class of type org.apache.lucene.codecs.Codec error

2013-08-16 Thread Uwe Schindler
Hi,

There is no screenshot attached to your mail. Please put it somewhere on the 
web and send a link.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
