[
https://issues.apache.org/jira/browse/LUCENE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848225#comment-16848225
]
Tomoko Uchida commented on LUCENE-8778:
---------------------------------------
I updated the pull request.
* Service lookup is performed on the case-insensitive map keys (as before).
The original names are preserved in an auxiliary Set for reference. I also added
a check to make sure that the lookup map and the original name set have the
same size.
* Restrict the characters that can be used in SPI names: only letters,
digits, and underscores are allowed. (The last one is added for possible future
uses.)
* Document the case-insensitive lookup in each Javadoc tag (I took a
screenshot). It's a bit redundant, but at least it is not likely to be
overlooked.
!Screenshot from 2019-05-25 23-25-24.png!
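For reference, the first two points can be sketched roughly as below. This is only a minimal illustration, not the code in the pull request: the class name {{SpiNameRegistry}}, its methods, the ASCII-only reading of "letters", and the choice to reject duplicate names are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical registry used for illustration; not the actual service loader class. */
final class SpiNameRegistry<S> {
  // Assumption: "letters" means ASCII letters; digits and underscores are also allowed.
  private static final Pattern VALID_NAME = Pattern.compile("^[a-zA-Z0-9_]+$");

  // Lookup map keyed by the lower-cased name, so lookup is case-insensitive.
  private final Map<String, Class<? extends S>> services = new LinkedHashMap<>();
  // Auxiliary set that keeps the names exactly as they were registered.
  private final Set<String> originalNames = new LinkedHashSet<>();

  void register(String name, Class<? extends S> clazz) {
    if (!VALID_NAME.matcher(name).matches()) {
      throw new IllegalArgumentException("Illegal SPI name: " + name);
    }
    String key = name.toLowerCase(Locale.ROOT);
    // Assumption: names that differ only by case are rejected as duplicates.
    if (services.containsKey(key)) {
      throw new IllegalArgumentException("Duplicate SPI name: " + name);
    }
    services.put(key, clazz);
    originalNames.add(name);
    // Sanity check: both collections must describe the same set of services.
    if (services.size() != originalNames.size()) {
      throw new IllegalStateException("Lookup map and original name set are out of sync");
    }
  }

  Class<? extends S> lookupClass(String name) {
    Class<? extends S> clazz = services.get(name.toLowerCase(Locale.ROOT));
    if (clazz == null) {
      throw new IllegalArgumentException("Unknown SPI name: " + name);
    }
    return clazz;
  }

  Set<String> availableNames() {
    return Collections.unmodifiableSet(originalNames);
  }

  public static void main(String[] args) {
    SpiNameRegistry<CharSequence> registry = new SpiNameRegistry<>();
    registry.register("whiteSpace", String.class);
    // The lookup ignores case, while the original spelling is kept for reference.
    System.out.println(registry.lookupClass("WHITESPACE").getSimpleName());
    System.out.println(registry.availableNames());
  }
}
```

Lower-casing with {{Locale.ROOT}} keeps the lookup keys independent of the default locale of the JVM.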
I would like to delay allowing "multiple names" or "aliases", because I don't
want to implement a feature that might never be used. If the Elasticsearch team
or someone else is interested in using the analysis service loader, I think the
modification is easy and we can work together then.
Can you please review the latest changes in the service loader class? Here are
the diffs:
[bf6fc2b|https://github.com/apache/lucene-solr/pull/654/commits/bf6fc2b4cc3db2848e2f79cfbb1fa917a834cf06],
[dab1f5a|https://github.com/apache/lucene-solr/pull/654/commits/dab1f5a9a8cd36ead1272ee99ef51200600a3b3b]
> Define analyzer SPI names as static final fields and document the names in
> Javadocs
> -----------------------------------------------------------------------------------
>
> Key: LUCENE-8778
> URL: https://issues.apache.org/jira/browse/LUCENE-8778
> Project: Lucene - Core
> Issue Type: Task
> Components: modules/analysis
> Reporter: Tomoko Uchida
> Priority: Minor
> Attachments: ListAnalysisComponents.java, SPINamesGenerator.java,
> Screenshot from 2019-04-26 02-17-48.png, Screenshot from 2019-05-25
> 23-25-24.png, TestSPINames.java
>
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> Each built-in analysis component (factory of tokenizer / char filter / token
> filter) has an SPI name, but currently this is not documented anywhere.
> The goals of this issue:
> * Define SPI names as a static final field for each analysis component so that
> users can get the component by name (via the {{NAME}} static field). This also
> provides compile time safety.
> * Officially document the SPI names in Javadocs.
> * Add proper source validation rules to ant {{validate-source-patterns}}
> target so that we can make sure that all analysis components have correct
> field definitions and documentation
> and,
> * Look up SPI names via the new {{NAME}} fields instead of deriving them from
> class names.
> (Just for quick reference) we now have:
> * *19* Tokenizers ({{TokenizerFactory.availableTokenizers()}})
> * *6* CharFilters ({{CharFilterFactory.availableCharFilters()}})
> * *118* TokenFilters ({{TokenFilterFactory.availableTokenFilters()}})
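As a rough illustration of the last goal in the description above (reading the SPI name from a {{NAME}} field rather than deriving it from the class name), here is a minimal, self-contained sketch. The nested {{WhitespaceTokenizerFactory}} class and the {{lookupSpiName}} helper are hypothetical stand-ins written for this example; they are not the Lucene classes or the code in the pull request.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class SpiNameLookup {

  /** Hypothetical factory standing in for a real analysis component. */
  static final class WhitespaceTokenizerFactory {
    /** SPI name declared as a constant, so callers can reference it at compile time. */
    public static final String NAME = "whitespace";
  }

  /** Reads the SPI name from a public static final String field called NAME. */
  static String lookupSpiName(Class<?> clazz) {
    try {
      Field field = clazz.getField("NAME");
      int modifiers = field.getModifiers();
      if (Modifier.isStatic(modifiers)
          && Modifier.isFinal(modifiers)
          && field.getType() == String.class) {
        return (String) field.get(null);
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(
          "Could not read the NAME field of " + clazz.getName(), e);
    }
    throw new IllegalStateException(
        clazz.getName() + " has no public static final String NAME field");
  }

  public static void main(String[] args) {
    // Compile-time access through the constant and reflective access agree.
    System.out.println(WhitespaceTokenizerFactory.NAME);
    System.out.println(lookupSpiName(WhitespaceTokenizerFactory.class));
  }
}
```

A class without a suitable {{NAME}} field fails fast here with an {{IllegalStateException}}, which is one possible way to surface a missing definition at load time.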