Jeff Munson wrote:
Single word searches return pretty fast, but when I try phrases,
searching seems to slow considerably. [ ... ]

However, if I use this query, contents:"all parts including picture tube
guaranteed", it returns hits in 2890 millseconds. Other phrases take
longer as well.

You could use an analyzer that inserts bigrams for common terms. Nutch does this. So, if you declare that "all" and "including" are common terms, then this could be tokenized as the following tokens:


0 - all all.parts
1 - parts parts.including
2 - including including.picture
3 - picture
4 - tube
5 - guaranteed

Two tokens at a position indicate where the second has position increment of zero.

Then your phrase search could be converted to:

  "all.parts parts.including including.picture picture tube guaranteed"

which should be much faster, since it has replaced common terms with rare terms.

This approach does make the index larger, and hence makes indexing somewhat slower. So you don't want to declare too many words as common, but a handful can make a big difference if they're used frequently in queries.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to