For numeric fields, this will never happen.
For text fields, I could:
 1) just use the first token generated (yuck)
 2) not run it through the analyzer at all (v1.0)
 3) run it through an analyzer specific to range and prefix queries
    (post v1.0)

Since I know the schema, I can pick and choose different methods for
different field types.  Generic Lucene isn't as lucky and has to guess
(hence the ugly try-to-parse-as-a-date code in getRangeQuery).
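A rough sketch of what that schema-aware dispatch could look like.
The schema, FieldType, and getRangeAnalyzer names here are
hypothetical stand-ins, not real Lucene or Solr APIs; RangeQuery and
Term are the stock Lucene 1.4 classes, and analyzeEndpoint is sketched
after the accent example below:

    import java.io.IOException;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.RangeQuery;

    // schema, FieldType, and getRangeAnalyzer are hypothetical
    protected Query buildRangeQuery(String field, String lower,
                                    String upper, boolean inclusive)
            throws IOException {
        FieldType type = schema.getFieldType(field);
        if (type.isNumeric()) {
            // numeric endpoints never tokenize into multiple terms,
            // so use them verbatim
            return new RangeQuery(new Term(field, lower),
                                  new Term(field, upper), inclusive);
        }
        // option 3: normalize text endpoints with a range-specific
        // analyzer (no stop words, no synonym expansion)
        String lo = analyzeEndpoint(type.getRangeAnalyzer(), field, lower);
        String hi = analyzeEndpoint(type.getRangeAnalyzer(), field, upper);
        return new RangeQuery(new Term(field, lo),
                              new Term(field, hi), inclusive);
    }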

An example of why option 3 may be needed: consider the recently posted
ISOLatinFilter that strips accents.  If one indexes text:applé and it
gets indexed as text:apple, then a range query of text:[applé TO
orange] won't find that document unless the endpoint is normalized the
same way.
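Running the endpoint through the same normalization the index saw
fixes this: applé comes back as apple.  A minimal helper against the
Lucene 1.4-era TokenStream API (analyzeEndpoint is my name for it, not
an existing method):

    import java.io.IOException;
    import java.io.StringReader;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;

    // Normalize a range/prefix endpoint with the given analyzer;
    // with an accent-stripping filter in the chain, "applé" comes
    // back as "apple", matching what was actually indexed.
    static String analyzeEndpoint(Analyzer analyzer, String field,
                                  String text) throws IOException {
        TokenStream ts = analyzer.tokenStream(field, new StringReader(text));
        Token token = ts.next();   // first (and ideally only) token
        ts.close();
        return (token == null) ? text : token.termText();
    }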

Of course, you can't just run it through the normal index analyzer
either, since then text:[a TO z] probably won't work ("a" will get
removed as a stop word, etc).  The normal analyzer may also expand
terms into synonyms, which is equally wrong for a range endpoint.
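To see the stop-word problem concretely (StandardAnalyzer's default
stop list includes "a"):

    import java.io.StringReader;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;

    public class StopWordDemo {
        public static void main(String[] args) throws Exception {
            // StandardAnalyzer stop-filters "a", "an", "to", ...
            Analyzer analyzer = new StandardAnalyzer();
            TokenStream ts =
                analyzer.tokenStream("text", new StringReader("a"));
            // prints null: "a" is stopped out, leaving no lower
            // endpoint at all for text:[a TO z]
            System.out.println(ts.next());
        }
    }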

-Yonik


On Apr 5, 2005 3:43 PM, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> 
> On Apr 5, 2005, at 2:49 PM, Yonik Seeley wrote:
> > Just curious.  I plan on overriding the current getRangeQuery() anyway
> > since it currently doesn't run the endpoints through the analyzer.
> 
> What will you do when multiple tokens are returned from the analyzer?
> 
>         Erik
