On Jun 22, 2005, at 4:01 AM, Morus Walter wrote:
Markus Atteneder writes:
It is possible to search with the "*" and "?" wildcards at the end and
in the middle of a search string, but not at the beginning. Is there a
way to do this?
Sure. Simply index reversed words.
The reason QP (the QueryParser) prohibits wildcards at the beginning is
performance. If the pattern has a literal prefix, only the terms starting
with that prefix need to be checked against the wildcard. Without a
prefix, every term in the index has to be examined.
IIRC you can use a wildcard at the beginning if you create the query
through the API, but it will be slow.
So the performant solution is to have an additional field containing
the tokens in reversed character order.
Won't help for *foo* though.
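
A minimal sketch of the reversed-field idea, in plain Python rather than
Lucene, assuming a sorted list stands in for the term dictionary. The
names build_reversed_index and suffix_search are made up for
illustration. A leading-wildcard query such as "*ing" becomes a prefix
lookup for "gni" over the reversed terms:

```python
from bisect import bisect_left

def build_reversed_index(terms):
    """Return a sorted list of (reversed term, original term) pairs."""
    return sorted((t[::-1], t) for t in set(terms))

def suffix_search(index, suffix):
    """Find terms matching "*suffix" via a prefix scan on reversed terms."""
    prefix = suffix[::-1]
    # Jump to the first reversed term that could start with the prefix.
    start = bisect_left(index, (prefix,))
    matches = []
    for rev, term in index[start:]:
        if not rev.startswith(prefix):
            break  # sorted order: no later entry can match
        matches.append(term)
    return sorted(matches)

index = build_reversed_index(["searching", "indexing", "lucene", "string"])
print(suffix_search(index, "ing"))  # ['indexing', 'searching', 'string']
```

Only the terms sharing the reversed prefix are visited, which is the same
saving an ordinary trailing-wildcard query gets from its literal prefix.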
There is a technique from the book Managing Gigabytes that I've
mentioned here before (in February). Here's a snippet from it:
----
...technique I found in the book Managing Gigabytes, making
"*string*" queries drastically more efficient for searching (though
also impacting index size). Take the term "cat". It would be
indexed with all rotated variations with an end of word marker added:
cat$
at$c
t$ca
$cat
The query for "*at*" would be preprocessed and rotated such that the
wildcards are collapsed at the end to search for "at*" as a
PrefixQuery. A wildcard in the middle of a string like "c*t" would
become a prefix query for "t$c*".
----
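
The rotation scheme in the snippet can be sketched as follows. This is
plain Python, not Lucene code, and it assumes terms never contain "$" or
"*". The helper names (rotations, to_prefix, wildcard_search) are
hypothetical, and only the "*" wildcard in the forms "x*y", "x*", "*y"
and "*x*" is handled:

```python
from bisect import bisect_left

END = "$"  # end-of-word marker, assumed absent from all terms

def rotations(term):
    """All rotations of term + END, e.g. cat -> cat$, at$c, t$ca, $cat."""
    marked = term + END
    return [marked[i:] + marked[:i] for i in range(len(marked))]

def build_rotated_index(terms):
    """Sorted (rotation, original term) pairs; one entry per rotation."""
    return sorted((rot, t) for t in set(terms) for rot in rotations(t))

def to_prefix(pattern):
    """Rotate a wildcard pattern so the wildcard ends up at the end."""
    parts = pattern.split("*")
    if len(parts) == 2:  # "head*tail" -> "tail$head*"
        head, tail = parts
        return tail + END + head
    if len(parts) == 3 and not parts[0] and not parts[2] and parts[1]:
        return parts[1]  # "*x*" -> "x*"
    raise ValueError("unsupported pattern: " + pattern)

def wildcard_search(index, pattern):
    """Answer the wildcard pattern with one prefix scan of the index."""
    prefix = to_prefix(pattern)
    start = bisect_left(index, (prefix,))
    found = set()
    for rot, term in index[start:]:
        if not rot.startswith(prefix):
            break
        found.add(term)
    return sorted(found)

index = build_rotated_index(["cat", "cart", "dog", "catalog"])
print(wildcard_search(index, "*at*"))  # ['cat', 'catalog']
print(wildcard_search(index, "c*t"))   # ['cart', 'cat']
```

The index-size cost mentioned in the snippet is visible here: a term of
n characters contributes n + 1 entries.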
Anyone tried this technique with Lucene?
Erik