You certainly can - just create your own Analyzer starting with a copy
of the French one you are using.
Then you just plug in the filter in the order you want it applied:
result = new ISOLatin1AccentFilter(result);
You have to decide for yourself where it goes. If you put it
before the stopword step, more stopwords might be removed than if it
came after. That kind of choice usually comes down to individual
requirements and filter limitations. If your stopword list contains
diacritics and you run the accent filter before applying the stopword
list, some expected stopwords will never be removed, and so on.
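To make the ordering point concrete, here is a small sketch that does not use Lucene at all. It assumes java.text.Normalizer as a rough stand-in for what ISOLatin1AccentFilter does (the real filter uses its own character mapping), and the class and method names are made up for illustration:

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: shows how the relative order of
// stopword removal and accent stripping changes which tokens survive.
class FilterOrderSketch {

    // Stand-in for accent stripping: decompose to NFD, then drop combining marks.
    static String fold(String token) {
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Stop filter first, accent folding second.
    static List<String> stopThenFold(List<String> tokens, Set<String> stopwords) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!stopwords.contains(t)) {
                out.add(fold(t));
            }
        }
        return out;
    }

    // Accent folding first, stop filter second.
    static List<String> foldThenStop(List<String> tokens, Set<String> stopwords) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            String folded = fold(t);
            if (!stopwords.contains(folded)) {
                out.add(folded);
            }
        }
        return out;
    }
}
```

With a stopword list that contains "à", folding first turns the token into "a", which no longer matches the list and so survives. With a list that contains "a" instead, folding first removes the token, while stopping first lets "à" through and then folds it to "a".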
Christophe from paris wrote:
Actually, in my FrenchAnalyzer
I have:
TokenStream result = new StandardTokenizer(reader);
result = new StandardFilter(result);
result = new StopFilter(result, stoptable);
result = new FrenchStemFilter(result, excltable);
result = new LowerCaseFilter(result);
Can I use ISOLatin1AccentFilter in this class for indexing and search?
And if so, where should it go?
markrmiller wrote:
Check out org.apache.lucene.analysis.ISOLatin1AccentFilter
It will strip diacritics - just be sure to use it at index time and
query time to get what you want. Also, you will no longer be able to
differentiate between accented and unaccented forms in your searching
(rarely that important in my opinion, but others certainly disagree).
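As a rough illustration of why the same normalization has to run on both sides, here is a toy in-memory index. It is only a sketch: it does not use Lucene, the class and method names are invented, and java.text.Normalizer is assumed as an approximation of the accent filter.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index, not Lucene: terms are normalized the same way
// when documents are added and when queries are run.
class FoldedIndexSketch {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Lowercase, decompose to NFD, then drop combining marks.
    static String normalize(String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        return Normalizer.normalize(lower, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    // "Index time": store the document id under the normalized term.
    void add(int docId, String term) {
        postings.computeIfAbsent(normalize(term), k -> new TreeSet<>()).add(docId);
    }

    // "Query time": normalize the query term the same way before lookup.
    Set<Integer> search(String term) {
        return postings.getOrDefault(normalize(term), new TreeSet<>());
    }
}
```

If one document is added with "lumiere" and another with "lumière", both end up under the same key, so either spelling in a query finds both. That is also the trade-off: the index can no longer tell the two spellings apart.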
- Mark
Christophe from paris wrote:
Hello
I use FrenchAnalyzer for indexing:
IndexWriter writer = new IndexWriter(pathOfIndex, new FrenchAnalyzer(),
true);
Document doc = new Document();
doc.add(new Field("TXT_CHARACT_VALUE", word.toLowerCase(),
    Field.Store.YES, Field.Index.TOKENIZED));
writer.addDocument(doc);
And for search:
IndexReader reader = IndexReader.open(pathOfIndex);
Searcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new FrenchAnalyzer();
QueryParser parser = new QueryParser(field, analyzer);
Query query = parser.parse(motRecherche);
Hits hits = searcher.search(query);
My documents contain the words "lumiere" and "lumière".
When I search for "lumière", only documents containing "lumière" match;
"lumiere" is not returned.
If I search for "lumiere", the results are lumiere, lumieres, lumiére,
lumiéres, but not lumière.
To match everything I have to search for "lumiere OR lumière",
but that is not the best solution.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]