Re: search with accent not match

Mark Miller Wed, 06 Aug 2008 05:28:07 -0700

Check out org.apache.lucene.analysis.ISOLatin1AccentFilter

It will strip diacritics - just be sure to use it at index time andquery time to get what you want. Also, you will no longer be able todifferentiate between the two in your searching (rarely that importantin my opinion, but others certainly disagree).


- Mark

Christophe from paris wrote:

Hello

I'm use FrenchAnalyzer for index

IndexWriter writer = new IndexWriter(pathOfIndex, new FrenchAnalyzer(),
true);
Document = new Document();
doc.add(new
Field("TXT_CHARACT_VALUE",word.toLowerCase(),Field.Store.YES,Field.Index.TOKENIZED));
writer.addDocument(doc);

And search

IndexReader reader = IndexReader.open(pathOfIndex);                     
Searcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new FrenchAnalyzer();                                       
        
QueryParser parser = new QueryParser(field, analyzer);                          
        
Query query = parser.parse(motRecherche);
Hits hits = searcher.search(query);

in my document i have the word "lumiere" and "lumière"

when i search lumière only document match lumière but "lumiere" is not
return

and if search "lumiere" the result is lumiere, lumieres ,lumiére,lumiéres
but not lumière

for a total match i must search "lumiere OR limière"

but is not the best solution



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: search with accent not match

Reply via email to