Re: Span Query Performance

Paul Elschot Thu, 06 Jan 2005 06:04:46 -0800

A bit more:

On Thursday 06 January 2005 10:22, Paul Elschot wrote:
> On Thursday 06 January 2005 02:17, Andrew Cunningham wrote:
> > Hi all,
> > 
> > I'm currently doing a query similar to the following:
> > 
> > for w in wordset:
> >     query = w near (word1 V word2 V word3 ... V word1422);
> >     perform query
> > 
> > and I am doing this through SpanQuery.getSpans(), iterating through the
> > spans and counting the matches, which can result in 4782282 matches
> > (essentially I am only after the match count).
> > The query works but the performance can be somewhat slow; so I am
> > wondering:
> > 
...
> > c) Is there a faster method to what I am doing I should consider?
> 
> Preindexing all word combinations that you're interested in.
>


If you know all the words in advance, you could also index a
helper word at the same position as each of those words.
This requires a custom analyzer that inserts the helper word into the
token stream with a zero position increment.
The query then simplifies to:
query = w near helperword
which replaces the 1422-clause disjunction with a single term and
would probably speed things up significantly.
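
To illustrate the idea, here is a minimal sketch in plain Python. It is a
toy model of a positional index, not Lucene code: the names add_helper,
index_positions, count_near and the "_helper_" term are all made up for
this example, and counting position pairs within a slop only approximates
what iterating over spans would report.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical helper term; pick something that cannot occur in real text.
HELPER = "_helper_"


def add_helper(tokens, known_words, helper=HELPER):
    """Yield (term, position_increment) pairs.

    Every token advances the position by 1. After each known word the
    helper is emitted with increment 0, so it shares that word's position.
    """
    for term in tokens:
        yield term, 1
        if term in known_words:
            yield helper, 0


def index_positions(stream):
    """Build a toy positional index: term -> sorted list of positions."""
    positions = defaultdict(list)
    pos = -1
    for term, increment in stream:
        pos += increment
        positions[term].append(pos)
    return positions


def count_near(positions, a, b, slop):
    """Count occurrence pairs of a and b at most `slop` positions apart."""
    b_positions = positions.get(b, [])
    total = 0
    for p in positions.get(a, []):
        lo = bisect_left(b_positions, p - slop)
        hi = bisect_right(b_positions, p + slop)
        total += hi - lo
    return total


tokens = "the quick brown fox jumps over the lazy dog".split()
known = {"fox", "dog"}
index = index_positions(add_helper(tokens, known))

# One lookup against the helper term ...
via_helper = count_near(index, "the", HELPER, 2)
# ... gives the same count as summing over every known word.
via_disjunction = sum(count_near(index, "the", w, 2) for w in known)
print(via_helper, via_disjunction)
```

In this model the helper's position list is the union of the position
lists of all the known words, so a single near lookup does the work of
the whole disjunction. One caveat: if w itself is one of the known words,
it matches its own helper at distance zero, so those hits would have to
be subtracted.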

Regards,
Paul Elschot


