[
https://issues.apache.org/jira/browse/LUCENE-7762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950690#comment-15950690
]
Uwe Schindler edited comment on LUCENE-7762 at 3/31/17 11:09 AM:
-----------------------------------------------------------------
I agree with Robert. When I implemented CustomAnalyzer, my "larger plan" was
already to remove all hardcoded Analyzer "examples" from the source code. This
would also reduce the size of the analysis jars and the number of classes that
confuse users. My idea is to expose the current Analyzers as static final
"constants" in some "utility" class, one for each language (e.g., lazily
initialized in {{Analyzers.get(Locale.ENGLISH)}} with a Java 8 lambda that
returns a CustomAnalyzer; {{Locale}} was just an idea, the key could also be
an enum).
Users who want analyzers with custom stopwords and so on can use the builder
pattern of CustomAnalyzer. Setting one up then looks like configuring an
analyzer in Solr or Elasticsearch.
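
To make the idea of a lazily initialized per-language registry more concrete, here is a minimal sketch. It is only an illustration of the pattern, not a proposed patch: the {{Analyzers}} class is hypothetical and does not exist in Lucene, and {{SketchAnalyzer}} is a stand-in for {{org.apache.lucene.analysis.Analyzer}} so that the example compiles without Lucene on the classpath. In real code each lambda would presumably return the result of a {{CustomAnalyzer.builder()}} chain; the chain descriptions below are placeholders, not the actual filter chains of the existing analyzers.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Stand-in for org.apache.lucene.analysis.Analyzer (assumption: used only so
// this sketch runs without Lucene).
final class SketchAnalyzer {
    final String chain;

    SketchAnalyzer(String chain) {
        this.chain = chain;
    }
}

// Hypothetical utility class; neither the name nor the API exists in Lucene.
final class Analyzers {
    // One factory lambda per language. Registering a lambda builds nothing.
    private static final Map<Locale, Supplier<SketchAnalyzer>> FACTORIES;

    // Analyzers that have actually been requested, built at most once each.
    private static final Map<Locale, SketchAnalyzer> CACHE = new ConcurrentHashMap<>();

    static {
        Map<Locale, Supplier<SketchAnalyzer>> m = new HashMap<>();
        // Placeholder chains; a real lambda would call CustomAnalyzer.builder()...build().
        m.put(Locale.ENGLISH, () -> new SketchAnalyzer("english chain (placeholder)"));
        m.put(Locale.GERMAN, () -> new SketchAnalyzer("german chain (placeholder)"));
        FACTORIES = Collections.unmodifiableMap(m);
    }

    private Analyzers() {}

    // Returns the shared analyzer for the locale, building it on first use.
    static SketchAnalyzer get(Locale locale) {
        Supplier<SketchAnalyzer> factory = FACTORIES.get(locale);
        if (factory == null) {
            throw new IllegalArgumentException("No analyzer registered for " + locale);
        }
        return CACHE.computeIfAbsent(locale, l -> factory.get());
    }

    // Number of analyzers built so far; useful for observing the laziness.
    static int builtCount() {
        return CACHE.size();
    }
}
```

With this shape, repeated calls to {{Analyzers.get(Locale.ENGLISH)}} return the same instance, and an analyzer for a language that is never requested is never constructed. An enum key would work the same way, with the lambda stored in each enum constant.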
was (Author: thetaphi):
I agree with Robert. When I implemented CustomAnalyzer, my "larger plan" was
already to remove all hardcoded Analyzer "examples" from the source code. This
would also reduce the size of the analysis jars and the number of classes that
confuse users. My idea is to expose the current Analyzers as static final
"constants" in some "utility" class, one for each language (e.g., lazily
initialized in {{Analyzers.get(Locale.ENGLISH)}} with a Java 8 lambda
or something similar; {{Locale}} was just an idea, the key could also be an
enum).
Users who want analyzers with custom stopwords and so on can use the builder
pattern of CustomAnalyzer. Setting one up then looks like configuring an
analyzer in Solr or Elasticsearch.
> Add EnglishAnalyzer.setMaxTokenLength
> -------------------------------------
>
> Key: LUCENE-7762
> URL: https://issues.apache.org/jira/browse/LUCENE-7762
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: master (7.0), 6.6
>
>
> I think EnglishAnalyzer should also let you change the default (255) max
> token length of the StandardTokenizer it's invoking.
> I will also fold the javadoc fixes from LUCENE-7760 here.