LowerCaseFilter should be able to be configured to use a specific locale.
-------------------------------------------------------------------------
                 Key: LUCENE-1581
                 URL: https://issues.apache.org/jira/browse/LUCENE-1581
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Digy


//Since I am a .Net programmer, the sample code is in C#, but I don't think it will be a problem to understand it. //

Assume an input text like "İ" and an analyzer like the one below:

{code}
public class SomeAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
    {
        TokenStream t = new SomeTokenizer(reader);
        t = new Lucene.Net.Analysis.ASCIIFoldingFilter(t);
        t = new LowerCaseFilter(t);
        return t;
    }
}
{code}

ASCIIFoldingFilter will return "I", and LowerCaseFilter will then return "i" (if the locale is "en-US") or "ı" (if the locale is "tr-TR"). In the Turkish case the token is no longer ASCII, so it would have to be passed through another instance of ASCIIFoldingFilter.

Calling LowerCaseFilter before ASCIIFoldingFilter would be one solution, but a better approach would be to add a new constructor to LowerCaseFilter that forces it to use a specific locale. Lines marked /* +++ */ are the additions:

{code}
public sealed class LowerCaseFilter : TokenFilter
{
    /* +++ */ System.Globalization.CultureInfo CultureInfo = System.Globalization.CultureInfo.CurrentCulture;

    public LowerCaseFilter(TokenStream input) : base(input)
    {
    }

    /* +++ */ public LowerCaseFilter(TokenStream input, System.Globalization.CultureInfo CultureInfo) : base(input)
    /* +++ */ {
    /* +++ */     this.CultureInfo = CultureInfo;
    /* +++ */ }

    public override Token Next(Token result)
    {
        result = Input.Next(result);
        if (result != null)
        {
            char[] buffer = result.TermBuffer();
            int length = result.termLength;
            for (int i = 0; i < length; i++)
                /* +++ */ buffer[i] = System.Char.ToLower(buffer[i], CultureInfo);
            return result;
        }
        else
            return null;
    }
}
{code}

DIGY
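Since the issue is filed against Lucene - Java, here is a rough Java sketch of the same behaviour, independent of Lucene. LocaleLowerCaser is a hypothetical stand-in for the filter, not an existing Lucene class. It uses String.toLowerCase(Locale) because Character.toLowerCase(char) takes no locale argument; the two constructors only mirror the shape of the proposal above.

```java
import java.util.Locale;

/** Hypothetical stand-in for the proposed filter; not part of Lucene. */
public class LocaleLowerCaser {
    private final Locale locale;

    /** Mirrors the existing constructor: falls back to the platform default locale. */
    public LocaleLowerCaser() {
        this(Locale.getDefault());
    }

    /** Mirrors the proposed constructor: the caller pins the locale explicitly. */
    public LocaleLowerCaser(Locale locale) {
        this.locale = locale;
    }

    /** Lower-cases a term using the locale chosen at construction time. */
    public String lowerCase(String term) {
        return term.toLowerCase(locale);
    }

    public static void main(String[] args) {
        LocaleLowerCaser english = new LocaleLowerCaser(Locale.ENGLISH);
        LocaleLowerCaser turkish = new LocaleLowerCaser(Locale.forLanguageTag("tr-TR"));

        // "I" is the token ASCIIFoldingFilter produces from "İ" in the example above.
        System.out.println(english.lowerCase("I").equals("i"));      // true
        System.out.println(turkish.lowerCase("I").equals("\u0131")); // true: dotless ı
    }
}
```

With the locale pinned at construction time, the same analyzer chain gives the same tokens regardless of the default locale of the machine it runs on.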