From Antonio Gulli <[EMAIL PROTECTED]> on 23 Oct 2003:
> > It looks like both parts of the query are executed separately and then
> > they are merged. If Lucene would be able to execute the query with
> > fewer results (text:go) first and then only check if the second part
> > (title:"The Right Way") matches, those queries would be much faster.
>
> This should be the standard way to process a conjunctive query.
> For instance, "Managing Gigabytes", chap. 4.3

Perhaps it would be a bit faster, but it can also use much more memory. If the clause with fewer results still matches a large subset of the collection, then the scores and document numbers of all of these matches must be stored, requiring at least eight bytes per intermediate match. Lucene's existing BooleanQuery algorithm operates using very little memory.

And what really are the savings? One must still enumerate all of the TermDocs or TermPositions of the more frequent clause, so all that one saves is the amount of logic in the inner loop. But Lucene already optimizes this by using a combination of a hash table and bitwise integer operations, so that each step of the inner loop is constant time, and not proportional to the number of clauses in the conjunction.

I'd be happy to see an alternate, faster implementation that does not use huge amounts of memory, but, until then, I'm not convinced this is a better approach.

Doug
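
To make the trade-off above concrete, here is a rough Python sketch of the two strategies. It is not Lucene's code: the function names, the (document number, score) posting tuples, and the window size are all assumptions made for illustration. The first function keeps the matches of the rarest clause in memory and filters them against the other clauses, as suggested in the quoted message. The second walks fixed-size windows of document numbers and ORs one bit per clause into a small bucket table, so the work per posting is constant and memory is bounded by the window size rather than by the number of intermediate matches.

```python
from typing import List, Sequence, Tuple

# Hypothetical posting format: (document number, partial score),
# sorted by document number, each document at most once per clause.
Posting = Tuple[int, float]


def conjunction_candidates_first(
    clauses: Sequence[Sequence[Posting]],
) -> Tuple[List[Posting], int]:
    """Evaluate the rarest clause first, then filter by the others.

    Returns the hits and the peak number of intermediate matches held
    in memory (doc number plus score, so roughly 8 bytes each).
    """
    if not clauses:
        return [], 0
    ordered = sorted(clauses, key=len)
    candidates = dict(ordered[0])  # doc -> score, all kept in memory
    peak = len(candidates)
    for postings in ordered[1:]:
        survivors = {}
        # Every posting of the more frequent clause is still enumerated.
        for doc, score in postings:
            if doc in candidates:
                survivors[doc] = candidates[doc] + score
        candidates = survivors
    return sorted(candidates.items()), peak


def conjunction_bitmask_buckets(
    clauses: Sequence[Sequence[Posting]], window: int = 8
) -> List[Posting]:
    """Window-at-a-time conjunction using one bit per clause.

    Memory use is two arrays of `window` entries, whatever the
    number of matches.
    """
    required = (1 << len(clauses)) - 1  # all clause bits set
    cursors = [0] * len(clauses)
    max_doc = max((p[-1][0] for p in clauses if p), default=-1)
    hits: List[Posting] = []
    base = 0
    while base <= max_doc:
        bits = [0] * window
        scores = [0.0] * window
        end = base + window
        for i, postings in enumerate(clauses):
            pos = cursors[i]
            while pos < len(postings) and postings[pos][0] < end:
                doc, score = postings[pos]
                slot = doc - base
                bits[slot] |= 1 << i  # constant-time step per posting
                scores[slot] += score
                pos += 1
            cursors[i] = pos
        # A document matches the conjunction only if every bit is set.
        for slot in range(window):
            if bits[slot] == required:
                hits.append((base + slot, scores[slot]))
        base = end
    return hits
```

Both functions return the same hits for the same input. The difference is in what they hold in memory: the first stores every match of the rarest clause, while the second only ever stores one window of buckets. This toy version scans empty windows too, which a real implementation would want to skip.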