On Tue, Jan 31, 2012 at 01:19:38PM -0500, Desilets, Alain wrote:
> I was wondering if there was a way to tokenize the string into individual
> characters instead, and whether that is advisable from a performance point
> of view.
You can experiment with changing the 'pattern' argument to RegexTokenizer#new to '.' or '\S'. It will definitely be worse from a performance standpoint: matching a URL will now require a PhraseQuery with one term for each character rather than one term for each component matching \w+ in the URL, and those single-character terms will exist in virtually every document.

Marvin Humphrey
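To see why the term count blows up, here is a rough sketch in Python's re module (just an illustration of the two patterns, not Lucy's actual tokenizer; the example URL is made up):

```python
import re

url = "http://example.com/path"

# \w+ pattern: one term per word-character run in the URL.
word_terms = re.findall(r"\w+", url)
print(word_terms)       # ['http', 'example', 'com', 'path'] -> 4 terms

# \S pattern: one term per non-whitespace character.
char_terms = re.findall(r"\S", url)
print(len(char_terms))  # 23 terms, one per character
```

A PhraseQuery for that URL would then need all 23 single-character terms to match in sequence, and terms like 'e' or 't' occur in nearly every document, so every posting list involved is enormous.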
