If you cannot use the FrenchAnalyzer, you might instead transform the
input as an additional step at both indexing time and search time.
Use something like a regex that looks for é and always replaces it with
e, applied to text before it goes into the index and to query terms
before you search. Expand this transformation step as needed to cover
other accented characters.
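
As a rough sketch of that transformation step: instead of one regex per accented letter, the version below uses java.text.Normalizer to split each accented character into its base letter plus a combining mark, then strips the marks. This is an assumption on my part rather than anything Lucene-specific: Normalizer requires Java 6 or later, and the class name AccentFolder is made up for illustration. The same method would have to be applied to both the indexed text and the query text.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration; not part of Lucene.
final class AccentFolder {
    private AccentFolder() {}

    // Decompose accented characters (NFD form), then remove the
    // combining marks so only the base letters remain.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The e-acute is written as a Unicode escape so that the
        // source file encoding does not matter.
        System.out.println(fold("m\u00e9tal")); // prints "metal"
        System.out.println(fold("metal"));      // prints "metal"
    }
}
```

Letters that do not decompose into a base letter plus a mark (for example the ligature œ) pass through unchanged, so they would still need an explicit replacement rule.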
You will likely also need to keep the original word somewhere, so I would
suggest adding a second field that is stored but not indexed and holds
the original value of the word. When a document matches your search
criteria, you will then also get the original form of the word back in
your hits object.
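
To make the idea concrete without tying it to a particular Lucene version, here is a small library-independent model: matching is done on the accent-folded form, while the untouched original is kept alongside it and returned on a hit. The class FoldedIndexSketch and its methods are hypothetical, and the folding again assumes java.text.Normalizer (Java 6+). In Lucene itself the folded text would go into the indexed field and the original into the stored, unindexed field.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; this is not the Lucene API.
final class FoldedIndexSketch {
    // Folded term -> original words that folded to it.
    private final Map<String, List<String>> originalsByFoldedTerm =
            new HashMap<String, List<String>>();

    // Same transformation at "index" time and at "search" time.
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Index the folded form, but keep the original for display.
    void add(String original) {
        String key = fold(original);
        List<String> originals = originalsByFoldedTerm.get(key);
        if (originals == null) {
            originals = new ArrayList<String>();
            originalsByFoldedTerm.put(key, originals);
        }
        originals.add(original);
    }

    // Fold the query term, then return the stored originals.
    List<String> search(String queryTerm) {
        List<String> hits = originalsByFoldedTerm.get(fold(queryTerm));
        if (hits == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(hits);
    }
}
```

With this model, adding "métal" and then searching for either "metal" or "métal" returns the original "métal", which is the behaviour asked for in the question below.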
Hope this helps,
Matt
egrand thomas wrote:
Dear all,
I'd like my Lucene searches to be insensitive to (French) accents. For example, considering an indexed term
"métal", I want to get it when searching for "metal" or "métal". I use lucene-2.3.2 and
the searches are performed with: IndexSearcher.search(query,filter,sorter). Another filter is already used together
with a "Sort" object. Furthermore, I cannot use the FrenchAnalyzer as my index does not only contain French
words.
Can anybody help?
Thanks in advance,
Tom