http://issues.apache.org/bugzilla/show_bug.cgi?id=28182

[PATCH] Never write an Analyzer again

           Summary: [PATCH] Never write an Analyzer again
           Product: Lucene
           Version: CVS Nightly - Specify date in submission
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Enhancement
          Priority: Other
         Component: Analysis
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

Hi All,

I got sick of writing Analyzers, so I have reworked some of the Analyzer and Filter code by making TokenStream an interface (along with Tokenizer and TokenFilter). I then created a BaseAnalyzer class on which you set a tokenizer and a list of TokenFilters. The tokenStream() method applies the tokenizer and then loops over the list of TokenFilters, applying each one in order and returning the last one, just as I am sure you have done many a time before.

One requirement for this to work is that the Filters and Tokenizers must allow any state information to be re-initialized through the init() method on TokenStream.

I also created AbstractTokenizer and AbstractTokenFilter, which are trivial implementations of Tokenizer and TokenFilter respectively.

I have made all existing tokenizers and filters backwards compatible. Let me know whether you like or dislike it and what changes you would like me to make.

I ran all regression tests and they all passed. I also wrote a TestBaseAnalyzer to test my new Analyzer. See the test for usage of the Analyzer. I haven't done a full-scale indexing test on it yet, but will soon.
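
For readers without the patch to hand, the chaining idea described above can be sketched roughly as follows. This is not the attached patch and does not use Lucene's real classes: every type here (TokenStream, Tokenizer, TokenFilter, BaseAnalyzerSketch and the three example components) is a hypothetical stand-in, tokens are plain Strings, and the init() signatures are assumptions, since the message does not show them.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-ins, NOT Lucene's API. Tokens are plain Strings here.
interface TokenStream {
    String next() throws IOException; // returns null when the stream is exhausted
}

interface Tokenizer extends TokenStream {
    void init(Reader input); // assumed signature: (re)bind to a new Reader
}

interface TokenFilter extends TokenStream {
    void init(TokenStream input); // assumed signature: (re)bind to an upstream stream
}

// Splits on whitespace.
final class WhitespaceTokenizerSketch implements Tokenizer {
    private Reader input;

    public void init(Reader input) { this.input = input; }

    public String next() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace((char) c)) {
                if (sb.length() > 0) break; // end of the current token
            } else {
                sb.append((char) c);
            }
        }
        return sb.length() > 0 ? sb.toString() : null;
    }
}

// Lower-cases every token.
final class LowerCaseFilterSketch implements TokenFilter {
    private TokenStream input;

    public void init(TokenStream input) { this.input = input; }

    public String next() throws IOException {
        String t = input.next();
        return t == null ? null : t.toLowerCase(Locale.ROOT);
    }
}

// Drops tokens found in a stop-word set.
final class StopFilterSketch implements TokenFilter {
    private final Set<String> stopWords;
    private TokenStream input;

    StopFilterSketch(Set<String> stopWords) { this.stopWords = stopWords; }

    public void init(TokenStream input) { this.input = input; }

    public String next() throws IOException {
        String t;
        while ((t = input.next()) != null) {
            if (!stopWords.contains(t)) return t;
        }
        return null;
    }
}

// Configurable analyzer: one tokenizer followed by an ordered list of filters.
final class BaseAnalyzerSketch {
    private Tokenizer tokenizer;
    private final List<TokenFilter> filters = new ArrayList<>();

    void setTokenizer(Tokenizer tokenizer) { this.tokenizer = tokenizer; }

    void setFilters(List<? extends TokenFilter> newFilters) {
        filters.clear();
        filters.addAll(newFilters);
    }

    // Re-initialise the tokenizer, then wrap it with each filter in order
    // and return the outermost stream.
    TokenStream tokenStream(Reader reader) {
        tokenizer.init(reader);
        TokenStream current = tokenizer;
        for (TokenFilter filter : filters) {
            filter.init(current);
            current = filter;
        }
        return current;
    }
}

final class AnalyzerDemo {
    // Drains the analyzer's stream for the given text into a list.
    static List<String> tokens(BaseAnalyzerSketch analyzer, String text) {
        try {
            List<String> out = new ArrayList<>();
            TokenStream ts = analyzer.tokenStream(new StringReader(text));
            for (String t = ts.next(); t != null; t = ts.next()) out.add(t);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BaseAnalyzerSketch analyzer = new BaseAnalyzerSketch();
        analyzer.setTokenizer(new WhitespaceTokenizerSketch());
        analyzer.setFilters(Arrays.asList(
                new LowerCaseFilterSketch(),
                new StopFilterSketch(Collections.singleton("the"))));
        System.out.println(tokens(analyzer, "The Quick Fox")); // prints [quick, fox]
    }
}
```

Because this sketch reuses the same tokenizer and filter instances on every tokenStream() call, all per-stream state has to be reset in init(), which is presumably why the re-initialization requirement exists. It also means a single analyzer instance in this sketch is not safe to share between threads; whether the actual patch has the same property cannot be told from the message.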