[ 
https://issues.apache.org/jira/browse/LUCENE-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch resolved LUCENE-233.
----------------------------------

    Resolution: Duplicate
      Assignee:     (was: Lucene Developers)

See LUCENE-210.

> [PATCH] analyzer refactoring based on CVS HEAD from 6/21/2004
> -------------------------------------------------------------
>
>                 Key: LUCENE-233
>                 URL: https://issues.apache.org/jira/browse/LUCENE-233
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: CVS Nightly - Specify date in submission
>         Environment: Operating System: All
> Platform: All
>            Reporter: Rasik Pandey
>            Priority: Minor
>         Attachments: analysis.zip
>
>
> Hello,
> As mentioned in previous exchanges, notably with Grant Ingersoll, I added some
> new classes to the "analysis" package to meet the requirements of the feature
> request in Bugzilla (http://issues.apache.org/bugzilla/show_bug.cgi?id=28182)
> and did some refactoring while I was under the hood. This is an overview of
> the hierarchies after my changes:
> -Analyzer
> --CustomAnalyzer (new abstract class largely based on Grant's BaseAnalyzer)
> --AbstractAnalyzer (new abstract class)
> ---RussianAnalyzer
> ---GermanAnalyzer
> ---etc.
> -Tokenizer
> --CloneableTokenizer (new abstract class)
> ---StandardTokenizer
> ---CharTokenizer
> ---CJKTokenizer
> ---etc.
> -TokenFilter
> --CloneableTokenFilter (new abstract class)
> ---AbstractStemFilter (new abstract class)
> ----RussianStemFilter
> ----GermanStemFilter
> ----etc.
> -Stemmer (very simple new interface used in AbstractStemFilter)
> --PorterStemmer
> --RussianStemmer
> --etc.
> In the attached zip file there are 3 diff files (core.analysis,
> sandbox.analysis, and sandbox.analysis.snowball) and a zip containing the new
> classes for org.apache.lucene.analysis in the Lucene core. I tried to minimize
> the irrelevant code changes (e.g. style, spaces, etc.) in the diffs while
> conforming to the code formatting guidelines outlined by Otis. I think a
> number of classes in the "analysis" package didn't conform, so these diffs
> may have a lot of noise because I reformatted those classes with my IDE,
> sorry :( . If the diffs are too painful, let me know and I'll try to prune
> them.
> If there is a TODO list specific to Analyzers, are the items below on that
> list?
> 1) move German and Russian packages to sandbox (I think this is on the Lucene 
> TODO list)
> 2) Analyzer class renaming such that dynamic configuration could return 
> classes like Analyzer_ru, Analyzer_de, Analyzer_fr, etc. based on the class 
> naming scheme "Analyzer_{Locale.toString}"
> 3) Documentation
> Questions, comments, feedback, and criticisms are all welcome.
> Regards,
> RBP
> PS - Thanks Grant!
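
The description above says only that Stemmer is a "very simple new interface used in AbstractStemFilter"; the actual signatures live in analysis.zip and are not shown here. The following is a minimal sketch of how such a split could look, under stated assumptions: the `stem(String)` method, the `StemFilterSketch` and `ToyStemFilter` classes, and the use of `Iterator<String>` as a stand-in for Lucene's TokenStream are all hypothetical and chosen only so the example runs without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical shape of the Stemmer interface; the real one may differ.
interface Stemmer {
    String stem(String term);
}

// Stand-in for AbstractStemFilter. Iterator<String> replaces Lucene's
// TokenStream so that the sketch is self-contained.
abstract class StemFilterSketch implements Iterator<String> {
    private final Iterator<String> input;
    private final Stemmer stemmer;

    protected StemFilterSketch(Iterator<String> input, Stemmer stemmer) {
        this.input = input;
        this.stemmer = stemmer;
    }

    @Override
    public boolean hasNext() {
        return input.hasNext();
    }

    @Override
    public String next() {
        // The shared filter logic lives here; subclasses only pick a Stemmer.
        return stemmer.stem(input.next());
    }
}

// Toy stemmer for illustration only: strips a single trailing "s".
class TrailingSStemmer implements Stemmer {
    @Override
    public String stem(String term) {
        if (term.length() > 1 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }
}

// A language-specific filter reduces to a constructor that supplies its stemmer.
class ToyStemFilter extends StemFilterSketch {
    ToyStemFilter(Iterator<String> input) {
        super(input, new TrailingSStemmer());
    }
}

class StemFilterDemo {
    static List<String> stemAll(List<String> terms) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = new ToyStemFilter(terms.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [analyzer, token, filter]
        System.out.println(stemAll(Arrays.asList("analyzers", "token", "filters")));
    }
}
```

The point of the design, as far as the hierarchy suggests, is that RussianStemFilter, GermanStemFilter, and similar classes would share one filtering loop and differ only in the Stemmer they plug in.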

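Item 2 of the list above proposes the naming scheme "Analyzer_{Locale.toString}" so that configuration code can resolve an analyzer class from a locale. A rough sketch of that lookup might look like the following; the `AnalyzerNames` class, its methods, and the package name passed in are assumptions made for illustration, not part of the patch.

```java
import java.util.Locale;

class AnalyzerNames {
    // Builds the class name from the proposed scheme, e.g.
    // "org.example.analysis.Analyzer_de" for Locale.GERMAN.
    static String classNameFor(String packageName, Locale locale) {
        return packageName + ".Analyzer_" + locale.toString();
    }

    // Loads and instantiates the class by name through its no-argument
    // constructor. Throws ClassNotFoundException (a subclass of
    // ReflectiveOperationException) if no analyzer exists for the locale.
    static Object load(String packageName, Locale locale)
            throws ReflectiveOperationException {
        Class<?> cls = Class.forName(classNameFor(packageName, locale));
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) {
        // prints org.example.analysis.Analyzer_de
        System.out.println(classNameFor("org.example.analysis", Locale.GERMAN));
        // prints org.example.analysis.Analyzer_fr_FR
        System.out.println(classNameFor("org.example.analysis", Locale.FRANCE));
    }
}
```

Note that `Locale.toString()` includes the country when one is set (`fr_FR` rather than `fr`), so a scheme like this would need a fallback from the full locale to the bare language code if only language-level analyzers exist.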

