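Are you sure you are creating the fields with Field.Index.ANALYZED? A field indexed with Field.Index.NOT_ANALYZED is stored as a single term and never passes through the analyzer, so the accent filter would not run at indexing time.

-----Original Message-----
From: Dora [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 25, 2008 12:22 PM
To: java-user@lucene.apache.org
Subject: Re: Indexing accented characters, then searching by any form
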
Karl Wettin wrote:
>
> Try this (dry coded) snippet instead:
>
> StandardAnalyzer objAnalyzer = new StandardAnalyzer() {
>   public TokenStream tokenStream(String fieldName, Reader reader) {
>     return new ISOLatin1AccentFilter(super.tokenStream(fieldName,
> reader));
>   }
> }
>

I tried this, but it does not work as expected.

I am using a utility class with a static method that gives me an analyzer:

    public static Analyzer getAnalyzer() {
        StandardAnalyzer objAnalyzer = new StandardAnalyzer() {
            public TokenStream tokenStream(String fieldName, Reader reader) {
                return new ISOLatin1AccentFilter(super.tokenStream(fieldName, reader));
            }
        };
        return objAnalyzer;
    }

So when I need the analyzer (for indexing or searching) I call UtilityClass.getAnalyzer().

It works for my query parser: the accents are correctly removed when performing the search. If my index contains "cafe", searching for "café" will find the documents containing "cafe".

But when I explore my index with Luke, I can see that the indexer does not use the ISOLatin1AccentFilter (I tested with a breakpoint in the overridden tokenStream method): if the document contains "café", the index will contain "café".

As a consequence, searching for a word with an accent is not possible: the index contains the accent, while the search process removes it. So my index contains "café", but when I search for "café" the filter changes it to "cafe" and I get no hits.

Any clue as to why my filter is not used at indexing time?

--
View this message in context: http://www.nabble.com/Indexing-accented-characters%2C-then-searching-by-any-form-tp15412778p20682548.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

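The mismatch described above (accents stripped from the query but not from the indexed terms) can be reproduced without Lucene. The following is only a rough sketch: it uses java.text.Normalizer as a stand-in for ISOLatin1AccentFilter, and a plain Set<String> as a toy "index". The class AccentFoldingDemo and its methods fold, index, and search are hypothetical names made up for this illustration, not part of Lucene.

```java
import java.text.Normalizer;
import java.util.HashSet;
import java.util.Set;

class AccentFoldingDemo {

    // Stand-in for the accent filter (an assumption, not Lucene code):
    // decompose each character, then drop the combining marks.
    static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Toy "index": a set of terms, folded or left untouched.
    static Set<String> index(boolean foldAtIndexTime, String... words) {
        Set<String> terms = new HashSet<>();
        for (String word : words) {
            terms.add(foldAtIndexTime ? fold(word) : word);
        }
        return terms;
    }

    // The query side always folds, as the query parser in the thread does.
    static boolean search(Set<String> terms, String query) {
        return terms.contains(fold(query));
    }

    public static void main(String[] args) {
        String cafe = "caf\u00e9"; // "café", written with an escape to avoid encoding issues

        // Filter skipped at indexing time: the index holds "café", the query becomes "cafe".
        System.out.println(search(index(false, cafe), cafe));

        // Filter applied on both sides: index and query both hold "cafe".
        System.out.println(search(index(true, cafe), cafe));
    }
}
```

The first lookup misses and the second one hits, which matches the symptom seen in Luke: the analysis chain has to be the same on the indexing side and the query side for accented terms to match.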
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]