Re: Confused about non-tokenized fields

Erik Hatcher Fri, 27 May 2005 10:47:27 -0700


On May 27, 2005, at 11:22 AM, Max Pfingsthorn wrote:

Hi!
In my application, I index some strings (like filenames) untokenized, meaning via
doc.add(new Field(FIELD, VALUE, false, true, false));
When I later take a look at it with Luke, I still get tokens of the filenames (like "news" instead of "news-item.xml") in the list of most frequent terms. Shouldn't I get only the complete filenames there?

Perhaps that "news" term is coming from a different field? Are you sure that you're seeing the filename field tokenized? Your usage of the field constructor looks fine to me and should not tokenize.

Also, how do I search case-insensitively over this kind of field?

Lucene is case-sensitive. I suggest lowercasing the field value before indexing, and lowercasing the search term when you query. This is the simplest suggestion, but you may need some other technique, such as having different fields (or different indexes), to deal with case-sensitivity issues.
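
The lowercasing suggestion above can be sketched in plain Java without touching Lucene at all. This is only an illustration: the class LowercaseKeywordIndex and its methods are hypothetical names, and the HashMap is a stand-in for an untokenized field in which each value is stored as one exact term. The point it shows is that the same normalization has to be applied on both the indexing side and the search side.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for an untokenized field: every value is a single exact term.
// This is not Lucene code; it only illustrates normalizing at index and query time.
class LowercaseKeywordIndex {
    // normalized term -> original values that were indexed under it
    private final Map<String, List<String>> postings = new HashMap<>();

    // One normalization routine shared by indexing and searching.
    static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // "Index" a filename under its lowercased form, keeping the original for display.
    void add(String filename) {
        postings.computeIfAbsent(normalize(filename), k -> new ArrayList<>()).add(filename);
    }

    // Lowercase the query the same way, then do an exact term lookup.
    List<String> search(String query) {
        return postings.getOrDefault(normalize(query), new ArrayList<>());
    }

    public static void main(String[] args) {
        LowercaseKeywordIndex index = new LowercaseKeywordIndex();
        index.add("News-Item.xml");
        // Matches regardless of the case used in the query.
        System.out.println(index.search("NEWS-ITEM.XML"));
        // Does not match: the whole filename is one term, so a partial value finds nothing.
        System.out.println(index.search("news"));
    }
}
```

In a real index the original-case value would have to live in a separate stored field if you still want to display it, since the indexed term itself is lowercased.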


    Erik


