From Antonio Gulli <[EMAIL PROTECTED]> on 23 Oct 2003:
> >It looks like both parts of the query are executed separately and then
> >they are merged. If Lucene would be able to execute the part of the query
> >with fewer results (text:go) first and then only check if the second part
> >(title:"The Right Way") matches, those queries would be much faster.
> >  
> >
> This should be the standard way to process a conjunctive query.
> See, for instance, "Managing Gigabytes", chapter 4.3.

Perhaps it would be a bit faster, but it also can use much more memory.  If the clause 
with the fewer results still matches a large subset of the collection, then the scores 
and document numbers of all of these matches must be stored, requiring at least eight 
bytes per intermediate match.  Lucene's existing BooleanQuery algorithm operates using 
very little memory.
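
To put a rough number on that memory cost, here is a back-of-the-envelope sketch. It 
assumes each intermediate match is stored as a 4-byte int document number plus a 4-byte 
float score, which is where the eight bytes come from. The class name and the match 
count of ten million are made up for illustration only.

```java
final class IntermediateMatchMemory {
    // One stored match: a 4-byte int document number plus a 4-byte float score.
    static final int BYTES_PER_MATCH = Integer.BYTES + Float.BYTES;

    // Lower bound on the memory needed to buffer all matches of the rarer clause.
    static long bytesNeeded(long matches) {
        return matches * BYTES_PER_MATCH;
    }

    public static void main(String[] args) {
        long matches = 10_000_000L; // hypothetical match count, not a measurement
        System.out.println(bytesNeeded(matches) + " bytes");
    }
}
```

With these assumed numbers the buffer alone is 80,000,000 bytes per query, before any 
overhead from the data structure that holds it.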

And what really is the savings?  One still must enumerate all of the TermDocs or 
TermPositions of the more frequent clause.  So all that one saves is the amount of 
logic in the inner loop.  But Lucene already optimizes this by using a combination of 
a hash-table and bitwise integer operations, so that each step of the inner loop is 
constant time, and not proportional to the number of clauses in the conjunction.
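
The idea of a fixed-size table plus per-clause bits can be sketched roughly as follows. 
This is a simplified illustration and not Lucene's actual BooleanQuery code: the class 
name, the table size of 1024, and the plain int[] postings are all assumptions, and 
scoring is left out. Each clause owns one bit; every posting is recorded with a single 
OR into the slot chosen by the low bits of the document number, and a document matches 
the conjunction when all clause bits are set in its slot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BucketConjunctionSketch {
    static final int TABLE_SIZE = 1024;     // assumed window size, a power of two
    static final int MASK = TABLE_SIZE - 1; // low bits of the doc number pick the slot

    /**
     * postings[c] holds the sorted document numbers matching clause c (at most 32 clauses).
     * Returns the documents below maxDoc that match every clause.
     */
    static List<Integer> matchAll(int[][] postings, int maxDoc) {
        int required = (int) ((1L << postings.length) - 1); // one bit per clause
        int[] bits = new int[TABLE_SIZE];   // fixed memory, independent of match count
        int[] pos = new int[postings.length];
        List<Integer> hits = new ArrayList<>();

        // Process the document space one window of TABLE_SIZE documents at a time.
        for (int base = 0; base < maxDoc; base += TABLE_SIZE) {
            Arrays.fill(bits, 0);
            int end = Math.min(base + TABLE_SIZE, maxDoc);
            for (int c = 0; c < postings.length; c++) {
                int[] docs = postings[c];
                while (pos[c] < docs.length && docs[pos[c]] < end) {
                    // Constant-time step: one table lookup and one bitwise OR.
                    bits[docs[pos[c]] & MASK] |= 1 << c;
                    pos[c]++;
                }
            }
            // A slot with every clause bit set is a match for the whole conjunction.
            for (int slot = 0; slot < end - base; slot++) {
                if (bits[slot] == required) {
                    hits.add(base + slot);
                }
            }
        }
        return hits;
    }
}
```

Note that the table never grows with the number of matches, and the work done per 
posting does not depend on how many clauses the conjunction has.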

I'd be happy to see an alternate, faster implementation that does not use huge amounts 
of memory, but, until then, I'm not convinced this is a better approach.

Doug
