Yes markrmiller, the order is important, so now I have:

TokenStream result = new StandardTokenizer(reader);
result = new StandardFilter(result);
result = new StopFilter(result, stoptable);
result = new ISOLatin1AccentFilter(result);
result = new FrenchStemFilter(result, excltable);
result = new LowerCaseFilter(result);

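To illustrate why the position of the accent filter matters, here is a small sketch in plain Java. It does not use Lucene: `stripAccents` is a hypothetical stand-in for ISOLatin1AccentFilter built on java.text.Normalizer, `removeStops` is a hypothetical stand-in for StopFilter, and the stop word and tokens are made up for the example. The real filters may behave differently in detail.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FilterOrderDemo {

    // Hypothetical stand-in for an accent filter: decompose each character
    // (NFD), then drop the combining marks that carry the accents.
    public static String stripAccents(String token) {
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Apply stripAccents to every token, keeping the order.
    public static List<String> foldAll(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(stripAccents(t));
        }
        return out;
    }

    // Hypothetical stand-in for a stop filter: drop tokens found in the stop set.
    public static List<String> removeStops(List<String> tokens, Set<String> stops) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!stops.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "lumi\u00e8re" is "lumiere" with a grave accent on the first e.
        // After folding, both spellings become the same term.
        System.out.println(stripAccents("lumi\u00e8re").equals("lumiere"));

        // Made-up stop list holding one accented word ("\u00e9t\u00e9").
        Set<String> stops = new HashSet<>(Arrays.asList("\u00e9t\u00e9"));
        List<String> tokens = Arrays.asList("\u00e9t\u00e9", "chaud");

        // Stop words first, then accents: the accented stop word is removed.
        System.out.println(foldAll(removeStops(tokens, stops)));

        // Accents first, then stop words: "ete" no longer matches the
        // accented entry in the stop list, so it survives.
        System.out.println(removeStops(foldAll(tokens), stops));
    }
}
```

With an accented stop list, folding before the stop step lets the stop word through, which is why the accent filter sits after StopFilter in the chain above. The same folding has to happen at index time and at query time, otherwise the indexed terms and the query terms will not line up.
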
And finally, with ISOLatin1AccentFilter the results are good :) Thank you.
Now on to the Polish search ^^

markrmiller wrote:
>
> You certainly can - just create your own Analyzer starting with a copy
> of the French one you are using.
>
> Then you just plug in the filter in the order you want it applied:
>
> result = new ISOLatin1AccentFilter(result);
>
> You have to decide for yourself where it will come - if you put it
> before the stopword step, more stop words might be removed than if it
> was after - that type of thing usually comes down to individual
> requirements/filter limitations. If your stopword list has diacritics
> and you run the accent filter before applying the stopword list, some
> expected stopwords will never be removed...etc.
>
>
> Christophe from paris wrote:
>> Currently in my FrenchAnalyzer
>>
>> I have:
>>
>> TokenStream result = new StandardTokenizer(reader);
>> result = new StandardFilter(result);
>> result = new StopFilter(result, stoptable);
>> result = new FrenchStemFilter(result, excltable);
>> result = new LowerCaseFilter(result);
>>
>>
>> Can I use ISOLatin1AccentFilter in this class for indexing and search?
>> And if so, where should it go?
>>
>>
>> markrmiller wrote:
>>
>>> Check out org.apache.lucene.analysis.ISOLatin1AccentFilter
>>>
>>> It will strip diacritics - just be sure to use it at index time and
>>> query time to get what you want. Also, you will no longer be able to
>>> differentiate between accented and unaccented forms in your searching
>>> (rarely that important in my opinion, but others certainly disagree).
>>>
>>> - Mark
>>>
>>> Christophe from paris wrote:
>>>
>>>> Hello
>>>>
>>>> I'm using FrenchAnalyzer for indexing:
>>>>
>>>> IndexWriter writer = new IndexWriter(pathOfIndex, new FrenchAnalyzer(),
>>>> true);
>>>> Document doc = new Document();
>>>> doc.add(new
>>>> Field("TXT_CHARACT_VALUE",word.toLowerCase(),Field.Store.YES,Field.Index.TOKENIZED));
>>>> writer.addDocument(doc);
>>>>
>>>> And for search:
>>>>
>>>> IndexReader reader = IndexReader.open(pathOfIndex);
>>>> Searcher searcher = new IndexSearcher(reader);
>>>> Analyzer analyzer = new FrenchAnalyzer();
>>>>
>>>> QueryParser parser = new QueryParser(field, analyzer);
>>>>
>>>> Query query = parser.parse(motRecherche);
>>>> Hits hits = searcher.search(query);
>>>>
>>>> In my documents I have the words "lumiere" and "lumière".
>>>>
>>>> When I search for "lumière", only documents containing "lumière"
>>>> match; "lumiere" is not returned.
>>>>
>>>> And if I search for "lumiere", the results are lumiere, lumieres,
>>>> lumiére, lumiéres but not lumière.
>>>>
>>>> For a total match I must search "lumiere OR lumière",
>>>> but that is not the best solution.
>>>>
>>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]

--
View this message in context: http://www.nabble.com/search-with-accent-not-match-tp18848522p18869247.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.