build a case insensitive index

Thomas Krämer Thu, 11 Dec 2003 14:05:34 -0800

Hello Lucene users,

I need a document-term matrix to initialize a neural network that I want to use to integrate user feedback into the retrieval process.

Until now, I have been using a slightly modified version of the IndexHTML example class.

How can I create an index of all the terms in a collection without "term" and "Term" being indexed twice?

The example uses a StandardAnalyzer, and its documentation says:

Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter.

So why do I get duplicate entries for terms written in upper and lower case?
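
To make clear what I am after, here is a minimal plain-Java sketch of the matrix I would expect. It does not call any Lucene classes: the class name CaseFoldedTermMatrix and its methods are made up for illustration, and the tokenizer is only a crude stand-in for what a lowercasing analyzer would do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: builds a case-insensitive
// document-term count matrix from raw document strings.
final class CaseFoldedTermMatrix {

    // Split on anything that is not a letter or digit and lowercase each
    // token, so "Term" and "term" end up as the same vocabulary entry.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Map each distinct lowercased term to a column index (alphabetical order).
    static Map<String, Integer> vocabulary(List<String> docs) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String doc : docs) {
            sorted.addAll(tokenize(doc));
        }
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (String term : sorted) {
            vocab.put(term, vocab.size());
        }
        return vocab;
    }

    // One row per document, one column per term, cells hold term counts.
    static int[][] matrix(List<String> docs, Map<String, Integer> vocab) {
        int[][] counts = new int[docs.size()][vocab.size()];
        for (int d = 0; d < docs.size(); d++) {
            for (String term : tokenize(docs.get(d))) {
                counts[d][vocab.get(term)]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        docs.add("Term frequency of a term");
        docs.add("Another Term here");
        Map<String, Integer> vocab = vocabulary(docs);
        int[][] counts = matrix(docs, vocab);
        // Both spellings are counted in the single "term" column.
        System.out.println(vocab.keySet());
        System.out.println("term in doc 0: " + counts[0][vocab.get("term")]);
    }
}
```

In this sketch "Term" and "term" share one column, which is the behaviour I expected from the index as well.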

Regards.

Thomas

