On Jun 22, 2005, at 4:01 AM, Morus Walter wrote:

Markus Atteneder writes:

It is possible to search with the "*" and "?" wildcards at the end and in the middle of a search string, but not at the beginning. Is there a
way to do this?


Sure. Simply index reversed words.

The reason why QP (the QueryParser) prohibits wildcards at the beginning is performance.
If there is some prefix, only the terms starting with this prefix need to be
checked against the wildcard; without one, every term has to be examined.
IIRC you can use wildcards at the beginning if you create the query using
the API, but it will be slow.

So the performant solution is to have an additional field containing the
tokens in reversed character order.
Won't help for *foo* though.
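
To make the reversed-field idea concrete, here is a minimal sketch in plain Python rather than Lucene code. It assumes a sorted in-memory list stands in for the term dictionary of the extra field; the function names (reverse_token, prefix_matches, leading_wildcard_search) are made up for illustration and are not part of any Lucene API. A query such as "*ing" is reversed to the prefix "gni" and looked up among the reversed terms.

```python
from bisect import bisect_left


def reverse_token(token):
    """Return the token with its characters in reversed order."""
    return token[::-1]


def prefix_matches(sorted_terms, prefix):
    """Collect every term in a sorted list that starts with the prefix.

    bisect jumps to the first candidate, so only terms sharing the
    prefix are visited, which is the point of having a prefix at all.
    """
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


def leading_wildcard_search(terms, pattern):
    """Answer a "*suffix" pattern via a prefix scan over reversed terms."""
    if not pattern.startswith("*") or "*" in pattern[1:]:
        raise ValueError("this sketch only handles a single leading '*'")
    # Hypothetical stand-in for the additional reversed field.
    reversed_terms = sorted(reverse_token(t) for t in terms)
    prefix = reverse_token(pattern[1:])
    hits = prefix_matches(reversed_terms, prefix)
    return sorted(reverse_token(t) for t in hits)
```

For example, leading_wildcard_search(["indexing", "search", "searching", "index"], "*ing") finds "indexing" and "searching" without touching the other terms once the scan leaves the "gni" prefix range. As noted, a pattern like "*foo*" still has no usable prefix in either field.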

There is a technique from the book Managing Gigabytes that I've mentioned here before (in February). Here's a snippet from it:

----
...technique I found in the book Managing Gigabytes, making "*string*" queries drastically more efficient for searching (though also impacting index size). Take the term "cat". It would be indexed with all rotated variations with an end of word marker added:

    cat$
    at$c
    t$ca
    $cat

The query for "*at*" would be preprocessed and rotated such that the wildcards are collapsed at the end to search for "at*" as a PrefixQuery. A wildcard in the middle of a string like "c*t" would become a prefix query for "t$c*".
----
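
The rotation scheme in the snippet can be sketched as follows. This is plain Python, not Lucene code, and only an illustration under some assumptions: "$" is taken as the end-of-word marker as in the example above, a dictionary scanned linearly stands in for a sorted term dictionary (a real index would seek to the prefix instead), and the names rotations, rotate_query, build_index, and search are hypothetical. Only patterns with a single "*", or with one "*" at each end, are handled.

```python
END = "$"  # end-of-word marker, as in the "cat$" example


def rotations(term):
    """Return every rotation of the term with the end marker appended."""
    marked = term + END
    return [marked[i:] + marked[:i] for i in range(len(marked))]


def rotate_query(pattern):
    """Rewrite a wildcard pattern into a prefix over the rotated terms."""
    if len(pattern) > 2 and pattern.startswith("*") and pattern.endswith("*"):
        inner = pattern[1:-1]
        if "*" in inner:
            raise ValueError("unsupported pattern in this sketch")
        return inner  # "*at*" -> prefix "at"
    if pattern.count("*") != 1:
        raise ValueError("unsupported pattern in this sketch")
    head, tail = pattern.split("*")
    return tail + END + head  # "c*t" -> prefix "t$c"


def build_index(terms):
    """Map each rotation to the set of original terms that produce it."""
    index = {}
    for term in terms:
        for rot in rotations(term):
            index.setdefault(rot, set()).add(term)
    return index


def search(index, pattern):
    """Return the original terms whose rotations start with the prefix."""
    prefix = rotate_query(pattern)
    hits = set()
    # Linear scan for brevity; a sorted term list would allow a seek.
    for rot, originals in index.items():
        if rot.startswith(prefix):
            hits |= originals
    return sorted(hits)
```

With index = build_index(["cat", "cot", "coat", "dog", "catalog"]), the query "c*t" is looked up as the prefix "t$c" and "*at*" as the prefix "at". Each term of length n contributes n + 1 rotations, which is where the index size cost mentioned in the snippet comes from.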

Anyone tried this technique with Lucene?

    Erik

