[ https://issues.apache.org/jira/browse/LUCENE-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Busch resolved LUCENE-233.
----------------------------------
Resolution: Duplicate
Assignee: (was: Lucene Developers)
See LUCENE-210.
> [PATCH] analyzer refactoring based on CVS HEAD from 6/21/2004
> -------------------------------------------------------------
>
> Key: LUCENE-233
> URL: https://issues.apache.org/jira/browse/LUCENE-233
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis
> Affects Versions: CVS Nightly - Specify date in submission
> Environment: Operating System: All
> Platform: All
> Reporter: Rasik Pandey
> Priority: Minor
> Attachments: analysis.zip
>
>
> Hello,
> As mentioned in previous exchanges, notably with Grant Ingersoll, I added some
> new classes to the "analysis" package to meet the requirements of the feature
> request in Bugzilla (http://issues.apache.org/bugzilla/show_bug.cgi?id=28182)
> and did some refactoring while I was under the hood. This is an overview of
> the hierarchies per my changes:
> -Analyzer
> --CustomAnalyzer (new abstract class largely based on Grant's BaseAnalyzer)
> --AbstractAnalyzer (new abstract class)
> ---RussianAnalyzer
> ---GermanAnalyzer
> ---etc.
> -Tokenizer
> --CloneableTokenizer (new abstract class)
> ---StandardTokenizer
> ---CharTokenizer
> ---CJKTokenizer
> ---etc.
> -TokenFilter
> --CloneableTokenFilter (new abstract class)
> ---AbstractStemFilter (new abstract class)
> ----RussianStemFilter
> ----GermanStemFilter
> ----etc.
> -Stemmer (very simple new interface used in AbstractStemFilter)
> --PorterStemmer
> --RussianStemmer
> --etc.
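
For readers without the attachment, the Stemmer / AbstractStemFilter split in the hierarchy above might look roughly like the sketch below. Everything in it is an assumption: the stem(String) signature, the constructor shape, and the toy TrailingSStemmer and ToyStemFilter classes are hypothetical, and a plain Iterator<String> stands in for Lucene's TokenStream so the example compiles on its own. The actual code in analysis.zip may differ.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the "very simple" Stemmer interface; the real
// method name and signature in analysis.zip are not known.
interface Stemmer {
    String stem(String term);
}

// Hypothetical stand-in for AbstractStemFilter. A plain Iterator<String>
// replaces Lucene's TokenStream so the sketch has no external dependencies.
abstract class AbstractStemFilter implements Iterator<String> {
    private final Iterator<String> input;
    private final Stemmer stemmer;

    protected AbstractStemFilter(Iterator<String> input, Stemmer stemmer) {
        this.input = input;
        this.stemmer = stemmer;
    }

    @Override
    public boolean hasNext() {
        return input.hasNext();
    }

    // The shared filtering logic lives here: pull a term, stem it, pass it on.
    @Override
    public String next() {
        return stemmer.stem(input.next());
    }
}

// Toy stemmer that strips one trailing "s"; not a real stemming algorithm.
final class TrailingSStemmer implements Stemmer {
    @Override
    public String stem(String term) {
        return term.length() > 1 && term.endsWith("s")
                ? term.substring(0, term.length() - 1)
                : term;
    }
}

// A language-specific subclass only has to choose its stemmer.
final class ToyStemFilter extends AbstractStemFilter {
    ToyStemFilter(Iterator<String> input) {
        super(input, new TrailingSStemmer());
    }
}

final class StemFilterDemo {
    // Runs every term through the filter and collects the stemmed output.
    static List<String> stemAll(List<String> terms) {
        List<String> out = new ArrayList<>();
        Iterator<String> filter = new ToyStemFilter(terms.iterator());
        while (filter.hasNext()) {
            out.add(filter.next());
        }
        return out;
    }
}
```

The point of such a layout would be that RussianStemFilter, GermanStemFilter and the rest share one filtering loop and differ only in the Stemmer they supply.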
> In the attached zip file there are 3 diff files (core.analysis,
> sandbox.analysis, and sandbox.analysis.snowball) and a zip containing the new
> classes for org.apache.lucene.analysis in the Lucene core. I tried to minimize
> the irrelevant code changes (e.g. style, spaces, etc.) in the diffs while
> conforming to the code formatting guidelines outlined by Otis. I think there
> were a number of classes in the "analysis" package that didn't conform, so
> these diffs may have a lot of noise, as I reformatted those classes with my
> IDE, sorry :( . If the diffs are too painful then let me know and I'll try to
> prune them.
> If there is a TODO list specific to Analyzers, are the below items on that
> list?
> 1) move German and Russian packages to sandbox (I think this is on the Lucene
> TODO list)
> 2) Analyzer class renaming such that dynamic configuration could return
> classes like Analyzer_ru, Analyzer_de, Analyzer_fr, etc. based on the class
> naming scheme "Analyzer_{Locale.toString}"
> 3) Documentation
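
As a rough illustration of item 2, a lookup based on the "Analyzer_{Locale.toString}" naming scheme could be sketched as follows. The AnalyzerNames class, its method names, and the org.example package used in the usage note are hypothetical. The fallback from a specific name such as Analyzer_de_DE to Analyzer_de is also an assumption and is not part of the proposal quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; not part of Lucene or of the attached patch.
final class AnalyzerNames {

    // Builds candidate class names from most to least specific, e.g.
    // "pkg.Analyzer_de_DE" then "pkg.Analyzer_de" for Locale.GERMANY.
    static List<String> candidates(String packageName, Locale locale) {
        List<String> names = new ArrayList<>();
        String suffix = locale.toString();
        while (!suffix.isEmpty()) {
            names.add(packageName + ".Analyzer_" + suffix);
            int cut = suffix.lastIndexOf('_');
            suffix = cut < 0 ? "" : suffix.substring(0, cut);
        }
        return names;
    }

    // Returns the first candidate class that can be loaded, or the given
    // fallback class if none of the candidate names exists.
    static Class<?> resolve(String packageName, Locale locale, Class<?> fallback) {
        for (String name : candidates(packageName, locale)) {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                // Not found under this name; try the next, less specific one.
            }
        }
        return fallback;
    }
}
```

A caller might write AnalyzerNames.resolve("org.example", Locale.GERMANY, SomeDefaultAnalyzer.class), where SomeDefaultAnalyzer is a placeholder for whatever default the configuration prefers, and then instantiate the returned class reflectively.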
> Questions, comments, feedback, criticisms are all welcome......
> Regards,
> RBP
> PS - Thanks Grant!