Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-13 Thread Paul Elschot
On Saturday 13 November 2004 09:16, Sanyi wrote: > > - leave the current implementation, raising an exception; > > - handle the exception and limit the boolean query to the first 1024 > > (or whatever the limit is) terms; > > - select, between the possible terms, only the first 1024 (or what > > e

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-13 Thread Sanyi
> - leave the current implementation, raising an exception; > - handle the exception and limit the boolean query to the first 1024 > (or whatever the limit is) terms; > - select, between the possible terms, only the first 1024 (or what > ever the limit is) more meaningful ones, leaving out all the

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Luke Francl
On Fri, 2004-11-12 at 14:52, Daniel Naber wrote: > There are two different issues: first, reorder the query so that those > terms with fewer matches appear first, because as soon as the first term > with 0 matches occurs, search stops. There will probably be a > not-so-difficult implementation f

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Daniel Naber
On Friday 12 November 2004 21:28, Luke Francl wrote: > > That's the point: there is no query optimizer in Lucene. > > Would it be possible to write one? I would be very interested in this > feature. There are two different issues: first, reorder the query so that those terms with fewer matches ap
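
The reordering idea in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not Lucene code: the class name, the method names, and the `docFreq` map (term to number of matching documents) are all hypothetical stand-ins for whatever the index would report.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: order the required terms of an AND query so that
// the terms with the fewest matching documents come first.
public class ClauseOrderingSketch {

    // Returns a copy of requiredTerms sorted by ascending document frequency.
    // Terms missing from docFreq are treated as having 0 matches.
    public static List<String> orderByDocFreq(List<String> requiredTerms,
                                              Map<String, Integer> docFreq) {
        List<String> ordered = new ArrayList<>(requiredTerms);
        ordered.sort(Comparator.comparingInt(t -> docFreq.getOrDefault(t, 0)));
        return ordered;
    }

    // An AND query cannot match anything if its rarest term has 0 matches,
    // so after ordering only the first term needs to be checked.
    public static boolean canMatch(List<String> requiredTerms,
                                   Map<String, Integer> docFreq) {
        List<String> ordered = orderByDocFreq(requiredTerms, docFreq);
        return ordered.isEmpty() || docFreq.getOrDefault(ordered.get(0), 0) > 0;
    }
}
```

With the rarest term first, a term with 0 matches is seen immediately and the remaining terms never need to be evaluated.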

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Luke Francl
On Thu, 2004-11-11 at 14:48, Daniel Naber wrote: > On Thursday 11 November 2004 20:57, Sanyi wrote: > > > What I'm saying is that there is no reason for the optimizer to expand > > wild* to more than 1024 variations > > That's the point: there is no query optimizer in Lucene. Would it be possibl

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Giulio Cesare Solaroli
Hi all, I am cross-posting my reply to the developer list as well, because I think some of my arguments belong there. I was thinking about somehow extending the PhraseQuery analyzer in order to better handle wildcard expansion. Sanyi's idea to "optimize" the expansion of the terms to include just th

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Sanyi
> It is normally possible to reduce the numbers of such complaints a lot > by imposing a minimum prefix length I've already limited it to a minimum of 5 characters (abcde*). I can still easily find (on the first try) situations where it starts to search for minutes. While another 5 char. partial
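
A minimum prefix length check of the kind discussed here might look like the sketch below. It is an assumption-laden illustration: the class and method names are hypothetical, and it treats only `*` and `?` as wildcard characters, which may not match the query syntax actually in use.

```java
// Hypothetical sketch: reject wildcard patterns whose literal prefix
// (the characters before the first wildcard) is too short.
public class PrefixLengthSketch {

    // Number of characters before the first '*' or '?' in the pattern,
    // or the full length if the pattern contains no wildcard.
    public static int literalPrefixLength(String pattern) {
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '*' || c == '?') {
                return i;
            }
        }
        return pattern.length();
    }

    // Plain terms are always allowed; wildcard patterns need at least
    // minPrefix literal characters before the first wildcard.
    public static boolean isAllowed(String pattern, int minPrefix) {
        int prefix = literalPrefixLength(pattern);
        boolean hasWildcard = prefix < pattern.length();
        return !hasWildcard || prefix >= minPrefix;
    }
}
```

As the message notes, such a check only reduces the problem: a pattern that passes it can still expand to a very large number of terms.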

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Paul Elschot
On Friday 12 November 2004 07:57, Sanyi wrote: > > That's the point: there is no query optimizer in Lucene. > > Sorry, I'm not very much into Lucene's internal Classes, I'm just telling you the viewpoint of a > user. You know my users aren't technicians, so answers like yours won't make them ha

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-11 Thread Sanyi
> That's the point: there is no query optimizer in Lucene. Sorry, I'm not very much into Lucene's internal Classes, I'm just telling you the viewpoint of a user. You know my users aren't technicians, so answers like yours won't make them happy. They will only see that I randomly don't allow the

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-11 Thread Daniel Naber
On Thursday 11 November 2004 20:57, Sanyi wrote: > What I'm saying is that there is no reason for the optimizer to expand > wild* to more than 1024 variations That's the point: there is no query optimizer in Lucene. Regards Daniel -- http://www.danielnaber.de

RE: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-11 Thread Sanyi
Yes, I understand all of this, but I don't want to set it to MaxInt, since it can easily lead to (even accidental) DoS attacks. What I'm saying is that there is no reason for the optimizer to expand wild* to more than 1024 variations when I search for "somerareword AND wild*", since somerareword

RE: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-11 Thread Will Allen
Any wildcard search will automatically expand your query to the number of terms it finds in the index that match the wildcard. For example, wild* would become wild OR wilderness OR wildman etc., one clause for each matching term that exists in your index. Because of this, you quickly reach the 1024
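
The expansion described in this message can be simulated with a small stand-alone sketch. It deliberately does not use Lucene classes: the sorted set of strings stands in for the index's term dictionary, the class and method names are hypothetical, and `IllegalStateException` stands in for the TooManyClauses error named in the subject line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical simulation of prefix expansion with a clause limit.
public class WildcardExpansionSketch {

    // Expands "prefix*" into one clause per index term that starts with
    // the prefix, failing once more than maxClauses terms would be needed.
    public static List<String> expandPrefix(SortedSet<String> indexTerms,
                                            String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // In sorted order, all terms sharing the prefix are contiguous
        // and start at the first term >= prefix.
        for (String term : indexTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= maxClauses) {
                throw new IllegalStateException(
                        "too many clauses, limit is " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(
                Arrays.asList("wild", "wilderness", "wildman", "window"));
        // Prints: wild OR wilderness OR wildman
        System.out.println(String.join(" OR ", expandPrefix(terms, "wild", 1024)));
    }
}
```

The number of clauses depends only on how many index terms share the prefix, not on the other parts of the query, which is why a query such as "somerareword AND wild*" can still hit the limit.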