A bit more: On Thursday 06 January 2005 10:22, Paul Elschot wrote: > On Thursday 06 January 2005 02:17, Andrew Cunningham wrote: > > Hi all, > > > > I'm currently doing a query similar to the following: > > > > for w in wordset: > > query = w near (word1 V word2 V word3 ... V word1422); > > perform query > > > > and I am doing this through SpanQuery.getSpans(), iterating through the > > spans and counting > > the matches, which can result in 4782282 matches (essentially I am only > > after the match count). > > The query works but the performance can be somewhat slow; so I am wondering: > > ... > > c) Is there a faster method to what I am doing I should consider? > > Preindexing all word combinations that you're interested in. >
In case you know all the words in advance, you could also index a helper word at the same position as each of those words. This requires a custom analyzer that inserts the helper word in the token stream with a zero position increment. The query then simplifies to: query = w near helperword which would probably speed things up significantly. Regards, Paul Elschot --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]