Hi Tate,
Forgot to ask, what version of Lucene? (IIRC, <= 1.2, means no
maxClauseCount)
sv
On Thu, 29 Apr 2004, Tate Avery wrote:
> Thank you for the response.
>
> I am not using the QueryParser directly... it was just part of my overall
> understanding of how this exception is coming about. Same thing,
> essentially, with the maxClauseCount.
>
>
> Here is some code to illustrate what is confusing me and what I am trying to
> ascertain:
>
> int _numClauses = XXX;
> boolean _required = XXX; // 3 examples of these var settings below
>
> BooleanQuery _query = new BooleanQuery();
>
> for (int _i = 0; _i < _numClauses; _i++)
> {
> _query.add(
> new BooleanClause(
> new TermQuery(new Term("body", "term" + _i)),
> _required,
> false));
> }
>
> Hits _hits = new IndexSearcher(INDEX_DIR).search(_query);
>
>
> 1) With _numClauses=9999 and _required=false (for example), I have no
> problems.
> (This is confusing since 9999 is more than maxClauseCount... but I won't
> complain).
>
> 2) With _numClauses=32 and _required=true, I also have no problems.
>
> 3) With _numClauses=33 and _required=true, I get
> "java.lang.IndexOutOfBoundsException: More than 32 required/prohibited
> clauses in query." as a runtime exception.
>
>
> So, I guess I am trying to ask the following:
>
> Is a query like (T1 AND T2 AND ... AND T32 AND T33) just completely illegal
> for Lucene?
> OR is there some way to extend this limit?
> OR am I missing something that is clouding my understanding?
>
>
>
> Thanks,
> Tate
>
>
>
> -----Original Message-----
> From: Stephane James Vaucher [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 29, 2004 1:10 PM
> To: Lucene Users List; [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: Understanding Boolean Queries
>
>
> On Thu, 29 Apr 2004, Tate Avery wrote:
>
> > Hello,
> >
> > I have been reviewing some of the code related to boolean queries and I
> > wanted to see if my understanding is approximately correct regarding how
> > they are handled and, more importantly, the limitations.
>
> You can always submit requests for enhancements in bugzilla, so as to keep
> track this issue.
>
> > Here is what I have come to understand so far:
> >
> > 1) The QueryParser code generated from javacc will parse my boolean query
> > and determine for each clause whether or not is 'required' (based on a few
> > conditions, but, in short, whether or not it was introduced or followed by
> > 'AND') or 'prohibited' (based, in short, on it being preceded by 'NOT').
>
> Your usage seems pretty particular, why are you using the javacc
> QueryParser?
>
> > 2) As my BooleanQuery is being constructed, it will throw a
> > BooleanQuery.TooManyClauses exception if I exceed
> > BooleanQuery.maxClauseCount (which defaults to 1024).
>
> It's configurable through sys properties or by
> BooleanQuery.setMaxClauseCount(int maxClauseCount)
> >
> > 3) The maxClauseCount threshold appears not to care whether or not my
> > clauses are 'required' or 'prohibited'... only how many of them there are
> in
> > total.
> >
> > 4) My BooleanQuery will prepare its own Scorer instance (i.e.
> > BooleanScorer). And, during this step, it will identify to the scorer
> which
> > clauses are 'required' or 'prohibited'. And, if more than 32 fall into
> this
> > category, a IndexOutOfBoundsException ("More than 32 required/prohibited
> > clauses in query.") is thrown.
> > That's as far as I got.
> > Now, I am a bit confused at this point. Does this mean I can make a
> boolean
> > query consisting of up to 1024 clauses as long as no more than 32 of them
> > are required or prohibited? This doesn't seem right. So, am I missing
> > something in the way I am understanding this.
> > I am (as you may have guessed) generating large boolean queries. And, in
> > some rare cases, I am receiving the exception identified in #4 (above).
> So,
> > I am trying to figure out whether or not I need to change/filter my
> queries
> > in a special way in order to avoid this exception. And, in order to do
> > this, I want to understand how these queries are being handled.
> > Finally, is there something related to the query syntax that could be my
> > mistake? For example, what is the difference between:
> > "A B" AND "C D" AND "D E"
> > ... and...
> > ("A B") AND ("C D") AND ("D E")
> > ... could that be the crux of it?
>
> I can't help you here, and the doc seems rather thin (or nonexistent for
> this class). I don't know the relation between the query and how the
> scorer will process it.
>
> Sorry I can't be of assistance,
> sv
>
> > Thank you for your time,
> > Tate Avery
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]