Yes markrmiller, the order is important.
So now I have:
TokenStream result = new StandardTokenizer(reader);
result = new StandardFilter(result);
result = new StopFilter(result, stoptable);
result = new ISOLatin1AccentFilter(result);
result = new FrenchStemFilter(result, excltable);
result = new LowerCaseFilter(result);
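For anyone reading this later, here is a small standalone sketch of why the position of the accent filter relative to the stop filter matters. It does not use Lucene: `java.text.Normalizer` stands in for ISOLatin1AccentFilter (an assumption, the real filter uses its own character mapping), and the class name, method names and three-word stop list are all hypothetical. In the code, `\u00e8` is è and `\u00e9` is é, written as escapes so the source stays ASCII.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Standalone illustration only: no Lucene classes are used here.
final class FilterOrderSketch {

    // Hypothetical stop list with one accented entry ("\u00e9t\u00e9").
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("le", "la", "\u00e9t\u00e9"));

    // Strip diacritics: decompose to NFD, then drop the combining marks.
    public static String stripAccents(String token) {
        String decomposed = Normalizer.normalize(token, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Stop filter first, accent filter second (the order used above).
    public static List<String> stopThenStrip(List<String> tokens) {
        List<String> out = new ArrayList<String>();
        for (String t : tokens) {
            if (!STOP_WORDS.contains(t)) {
                out.add(stripAccents(t));
            }
        }
        return out;
    }

    // Accent filter first, stop filter second.
    public static List<String> stripThenStop(List<String> tokens) {
        List<String> out = new ArrayList<String>();
        for (String t : tokens) {
            String stripped = stripAccents(t);
            // The accented stop word no longer matches its list entry.
            if (!STOP_WORDS.contains(stripped)) {
                out.add(stripped);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens =
                Arrays.asList("la", "lumi\u00e8re", "\u00e9t\u00e9");
        System.out.println(stopThenStrip(tokens)); // [lumiere]
        System.out.println(stripThenStop(tokens)); // [lumiere, ete]
    }
}
```

With the stop filter first, the accented stop word is removed as expected. With the accent filter first, it slips through as "ete" unless the stop list also holds the unaccented form.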
And finally, with ISOLatin1AccentFilter, the result is good :)
Thank you.
Now on to the polish search ^^
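
To show why the same folding has to happen at both index time and query time, here is a minimal sketch of a toy inverted index. Again this is not Lucene code: the class `FoldedIndexSketch` and its methods are hypothetical, and `java.text.Normalizer` is only an approximation of what ISOLatin1AccentFilter does (for instance, NFD decomposition does not expand ligatures such as œ).

```java
import java.text.Normalizer;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: folded term -> ids of the documents containing it.
final class FoldedIndexSketch {

    private final Map<String, Set<Integer>> postings =
            new HashMap<String, Set<Integer>>();

    // Fold a term: strip diacritics (NFD + drop marks), then lowercase.
    public static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    // Index time: store the document under the folded term.
    public void add(int docId, String term) {
        String key = fold(term);
        Set<Integer> docs = postings.get(key);
        if (docs == null) {
            docs = new TreeSet<Integer>();
            postings.put(key, docs);
        }
        docs.add(docId);
    }

    // Query time: fold the query term the same way before the lookup.
    public Set<Integer> search(String queryTerm) {
        Set<Integer> docs = postings.get(fold(queryTerm));
        return docs == null ? Collections.<Integer>emptySet() : docs;
    }

    public static void main(String[] args) {
        FoldedIndexSketch index = new FoldedIndexSketch();
        index.add(1, "lumiere");
        index.add(2, "lumi\u00e8re"); // grave accent
        index.add(3, "lumi\u00e9re"); // acute accent
        System.out.println(index.search("lumi\u00e8re")); // [1, 2, 3]
        System.out.println(index.search("lumiere"));      // [1, 2, 3]
    }
}
```

Because every spelling collapses to the same key, either query finds all three documents. The trade-off Mark mentioned still applies: accented and unaccented forms can no longer be told apart.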
markrmiller wrote:
>
> You certainly can - just create your own Analyzer starting with a copy
> of the French one you are using.
>
> Then you just plug in the filter in the order you want it applied:
>
> result = new ISOLatin1AccentFilter(result);
>
> You have to decide for yourself where it will come - if you put it
> before the stopword step, more stop words might be removed than if it
> was after - that type of thing usually comes down to individual
> requirements/filter limitations. If your stopword list has diacriticals
> and you run the accent filter before applying the stopword list, some
> expected stopwords will never be removed...etc.
>
>
> Christophe from paris wrote:
>> Currently, in my FrenchAnalyzer
>>
>> I have:
>>
>> TokenStream result = new StandardTokenizer(reader);
>> result = new StandardFilter(result);
>> result = new StopFilter(result, stoptable);
>> result = new FrenchStemFilter(result, excltable);
>> result = new LowerCaseFilter(result);
>>
>>
>> Can I use ISOLatin1AccentFilter in this class for indexing and search?
>> And if so, where should it go?
>>
>>
>> markrmiller wrote:
>>
>>> Check out org.apache.lucene.analysis.ISOLatin1AccentFilter
>>>
>>> It will strip diacritics - just be sure to use it at index time and
>>> query time to get what you want. Also, you will no longer be able to
>>> differentiate between the two in your searching (rarely that important
>>> in my opinion, but others certainly disagree).
>>>
>>> - Mark
>>>
>>> Christophe from paris wrote:
>>>
>>>> Hello
>>>>
>>>> I use FrenchAnalyzer for indexing:
>>>>
>>>> IndexWriter writer = new IndexWriter(pathOfIndex, new FrenchAnalyzer(),
>>>> true);
>>>> Document doc = new Document();
>>>> doc.add(new Field("TXT_CHARACT_VALUE", word.toLowerCase(),
>>>>         Field.Store.YES, Field.Index.TOKENIZED));
>>>> writer.addDocument(doc);
>>>>
>>>> And for searching:
>>>>
>>>> IndexReader reader = IndexReader.open(pathOfIndex);
>>>> Searcher searcher = new IndexSearcher(reader);
>>>> Analyzer analyzer = new FrenchAnalyzer();
>>>>
>>>> QueryParser parser = new QueryParser(field, analyzer);
>>>>
>>>> Query query = parser.parse(motRecherche);
>>>> Hits hits = searcher.search(query);
>>>>
>>>> In my documents I have the words "lumiere" and "lumière".
>>>>
>>>> When I search for "lumière", only documents containing "lumière" match;
>>>> "lumiere" is not returned.
>>>>
>>>> And if I search for "lumiere", the results are lumiere, lumieres,
>>>> lumiére, lumiéres, but not lumière.
>>>>
>>>> For a complete match I have to search for "lumiere OR lumière",
>>>> but that is not the best solution.
>>>>
>>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]
--
View this message in context:
http://www.nabble.com/search-with-accent-not-match-tp18848522p18869247.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.