> If we change StandardTokenizer in this way then we risk breaking all
> the applications that currently use it and depend on its current
> behaviour.

My personal issue with the StandardTokenizer is that it splits off
single-letter prefixes, as in 't-shirt'. A query for 't-shirt' therefore
also returns documents containing 't. miller's shirt'. I can't imagine how
this behavior could ever be considered useful or depended upon, but I
may be wrong (perhaps someone has an example where it makes sense).
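
To make the effect concrete, here is a minimal Python sketch. It does not
use Lucene and is not the StandardTokenizer grammar: toy_tokenize and
matches_all_terms are hypothetical helpers that merely assume the tokenizer
breaks on hyphens and periods and that every query term must occur in the
document.

```python
import re


def toy_tokenize(text):
    """Lowercase and split on anything that is not a letter or digit.

    A simplified stand-in for the splitting behavior described above,
    not Lucene's actual StandardTokenizer rules.
    """
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def matches_all_terms(query, document):
    """Return True if every token of the query occurs in the document."""
    doc_tokens = set(toy_tokenize(document))
    return all(tok in doc_tokens for tok in toy_tokenize(query))


# 't-shirt' is broken into the single letter 't' plus 'shirt' ...
print(toy_tokenize("t-shirt"))
# ... so it matches a document that merely contains an initial and 'shirt'.
print(matches_all_terms("t-shirt", "t. miller's shirt"))
```

Under these assumptions the query loses the fact that 't' and 'shirt'
belonged to one word, which is why the unrelated document matches.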

--
Eric Jain

