30 apr 2007 kl. 02.05 skrev Kun Hong:

I'm not sure if you mean that it should treat all repetative tokens as only one token? Then you are better of using a filter when analyzing text you insert to the index: rather than creating one token for each the in "the the the the the the" you only create one. You might also want to use this filter when parsing user queries. (It will be hard to find the band 'the the'.)
I can't just use filters because I have to cater for other titles that are not just stop words, which
should be analyzed normally. (I know this requirement is a bit fussy).

You might want to consider using two fields. One with "normal analysis" and one that filters out everything by titles that contain nothing but one word repeating over and over again. Then it would be sufficient with a single TermQuery to match, and no more need to hack in the extra start- and stop tokens.



--
karl

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to