On Thu, 10 May 2012 17:54:35 -0700, Darin McBeath <[email protected]> wrote:

> It of course depends on your index settings, but I seem to recall the 
> following from some investigations 5 or 6 years back (of course some things 
> could have changed since then).
>
> * Trailing wildcard queries: search for a word with a trailing 
> multi‐character wildcard (e.g. “exp*”). You must have at least 3 non‐* 
> characters before the trailing wildcard.  Note: you can run trailing wildcard 
> searches with fewer than 3 leading characters by using “?”. A query for the 
> term “e*” can be written as an OR of queries for “e”, “e?”, “e??”, “e??*”. 
> This rewritten query will yield an accurate unfiltered search, given a word 
> lexicon and trailing wildcard search ON.
>
>
> So, I believe you will have problems with *hits*.  But, maybe one of the 
> resident MarkLogic experts could shed some light.
>
> Darin.
>

I've been busy, Darin. :)

The e* -> e, e?, e??, e??* rewrite is now done automatically.
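
For anyone who wants to see the shape of that rewrite, here is a
minimal sketch in Python. It is only an illustration of the
expansion Darin described, not the server's actual code; the
function name and the min_prefix parameter are made up for the
example.

```python
def rewrite_short_trailing_wildcard(term, min_prefix=3):
    """Expand a short trailing-wildcard term into a list of alternatives to OR.

    Hypothetical helper: mirrors the e* -> e, e?, e??, e??* rewrite
    from the thread. Terms that do not end in '*', that contain other
    wildcards, or that already have min_prefix leading characters are
    returned unchanged.
    """
    if not term.endswith("*"):
        return [term]
    prefix = term[:-1]
    if not prefix or "*" in prefix or "?" in prefix:
        return [term]
    if len(prefix) >= min_prefix:
        return [term]

    missing = min_prefix - len(prefix)
    # The bare prefix, then the prefix padded with 1..missing '?' characters.
    alternatives = [prefix]
    for pad in range(1, missing + 1):
        alternatives.append(prefix + "?" * pad)
    # Finally the padded prefix with the trailing wildcard restored.
    alternatives.append(prefix + "?" * missing + "*")
    return alternatives


print(rewrite_short_trailing_wildcard("e*"))   # ['e', 'e?', 'e??', 'e??*']
print(rewrite_short_trailing_wildcard("exp*")) # ['exp*']
```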

If you turn on 3-character wildcards and positions and put a
word lexicon in place, many wildcard queries can be resolved
accurately through lexicon expansion, possibly combined with
reducing the query to the set of unique 3-character prefixes
and suffixes to make it more efficient (positions are needed
for that reduction).
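
To make the idea concrete, here is a rough Python sketch of
lexicon expansion followed by the prefix/suffix reduction. It is
a toy model under my own assumptions (an in-memory word list
standing in for the word lexicon, made-up function names), not
how the server implements it.

```python
from fnmatch import fnmatchcase


def expand_from_lexicon(pattern, lexicon):
    """Return the lexicon words matching a wildcard pattern.

    '*' matches any run of characters and '?' matches exactly one,
    following fnmatch rules. The lexicon here is just a list of words.
    """
    return sorted(w for w in lexicon if fnmatchcase(w, pattern))


def unique_affixes(words, n=3):
    """Collapse matched words to their unique n-character prefixes and suffixes.

    Words shorter than n characters are skipped in this sketch.
    """
    long_enough = [w for w in words if len(w) >= n]
    prefixes = sorted({w[:n] for w in long_enough})
    suffixes = sorted({w[-n:] for w in long_enough})
    return prefixes, suffixes


# Toy lexicon, invented for the example.
lexicon = ["exhibit", "expand", "expense", "export", "hits", "whitespace"]
matches = expand_from_lexicon("*hit*", lexicon)
print(matches)                  # ['hits', 'whitespace']
print(unique_affixes(matches))  # (['hit', 'whi'], ['ace', 'its'])
```

The point of the second step is that many expanded words tend to
share the same 3-character prefixes and suffixes, so the reduced
set can be much smaller than the full list of matches.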

There are still a few heuristic gaps: since lexicon expansion
can be expensive, we try to avoid it if it looks like there
will be too many matches.

//Mary
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
