On Thu, 10 May 2012 17:54:35 -0700, Darin McBeath <[email protected]> wrote:
> It of course depends on your index settings, but I seem to recall the
> following from some investigations 5 or 6 years back (of course some things
> could have changed since then).
>
> * Trailing wildcard queries: search for a word with a trailing
> multi‐character wildcard (e.g. “exp*”). You must have at least 3 non‐*
> characters before the trailing wildcard. Note: you can run trailing wildcard
> searches with less than 3 leading characters by using “?”. A query for the
> term “e*” can be written as an or of queries for “e”, “e?”, “e??”, “e??*”.
> This rewritten query will yield an accurate unfiltered search, given a word
> lexicon and trailing wildcard search ON.
>
>
> So, I believe you will have problems with *hits*. But, maybe one of the
> resident MarkLogic experts could shed some light.
>
> Darin.
>

I've been busy, Darin. :)

The e* -> e, e?, e??, e??* rewrite is now done automatically. If you turn on
3 character wildcards and positions and put in place a word lexicon, a lot of
wildcard queries can be resolved accurately by using lexicon expansion,
possibly in combination with reducing the query to the set of unique
3-character prefixes and suffixes to make it more efficient (positions needed
here).

There are still a few heuristic gaps: since lexicon expansion can be
expensive, we try to avoid it if it looks like there will be too many matches.

//Mary
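
For readers who want to see the shape of the e* -> e, e?, e??, e??* rewrite and of lexicon expansion, here is a minimal Python sketch. It is an illustration only, not MarkLogic code and not how the server implements it: the function names, the `min_chars` parameter, and the toy lexicon are all hypothetical, and `fnmatchcase` from the Python standard library stands in for the wildcard matcher.

```python
from fnmatch import fnmatchcase


def rewrite_trailing_wildcard(term, min_chars=3):
    """Rewrite a short trailing-wildcard term into an OR-list of patterns.

    Hypothetical helper: 'e*' becomes ['e', 'e?', 'e??', 'e??*'], so every
    pattern that still ends in '*' has at least `min_chars` leading
    characters (counting '?' as a character).
    """
    if not term.endswith("*") or "*" in term[:-1] or "?" in term:
        raise ValueError("expected a plain prefix followed by a single '*'")
    prefix = term[:-1]
    if not prefix:
        raise ValueError("need at least one character before '*'")
    if len(prefix) >= min_chars:
        return [term]  # already long enough, nothing to rewrite
    pad = min_chars - len(prefix)
    # Exact-length alternatives: prefix, prefix?, prefix??, ...
    patterns = [prefix + "?" * i for i in range(pad + 1)]
    # Open-ended alternative with the prefix padded out to min_chars.
    patterns.append(prefix + "?" * pad + "*")
    return patterns


def expand_against_lexicon(patterns, lexicon):
    """Return the sorted lexicon words matching any pattern (lexicon expansion)."""
    return sorted({w for w in lexicon for p in patterns if fnmatchcase(w, p)})


# Toy word lexicon, assumed for the example.
lexicon = ["apple", "be", "e", "ex", "exp", "expand"]
patterns = rewrite_trailing_wildcard("e*")
matches = expand_against_lexicon(patterns, lexicon)
```

With this toy lexicon, `patterns` holds the four alternatives from the quoted note and `matches` holds the words starting with "e". A real lexicon can be far larger, which is why expansion is skipped when too many matches look likely.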
