On May 27, 2005, at 11:22 AM, Max Pfingsthorn wrote:

Hi!

In my application, I index some strings (like filenames) untokenized, meaning via

doc.add(new Field(FIELD,VALUE,false,true,false));

When I later take a look at it with Luke, I still get tokens of the filenames (like "news" instead of "news-item.xml") in the list of most frequent terms. Shouldn't I get only the
complete filenames there??

Perhaps that "news" term is coming from a different field? Are you sure that you're seeing the filename field tokenized? Your usage of the field constructor looks fine to me and should not tokenize.

Also, how do I search case-insensitive over this kind of field?

Lucene is case-sensitive. I suggest lowercasing the field before indexing, and search using lowercase. This is the simplest suggestion, but you may need to use some other technique such as having different fields (or different indexes) to deal with case- sensitivity issues.

    Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to