[ https://issues.apache.org/jira/browse/SOLR-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904584#comment-16904584 ]
Tomoko Uchida commented on SOLR-13593:
--------------------------------------

The ICU factory "name" argument was changed to "form" on the master branch, so the factories can now be looked up by name (with the "form" attribute specifying the normalization form), like this:
{code:xml}
<fieldType name="text_ws_icucf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter name="icuNormalizer2" form="nfkc"/>
    <tokenizer name="whitespace"/>
  </analyzer>
</fieldType>
<fieldType name="text_ws_icutf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer name="whitespace"/>
    <filter name="icuNormalizer2" form="nfkc"/>
  </analyzer>
</fieldType>
{code}
The corresponding field types using "class" are:
{code:xml}
<fieldType name="text_ws_icucf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.ICUNormalizer2CharFilterFactory" form="nfkc"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_ws_icutf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ICUNormalizer2FilterFactory" form="nfkc" mode="compose"/>
  </analyzer>
</fieldType>
{code}
This works for me, and the branch passed all tests. I will merge all the changes to the master branch soon.

> Allow to specify analyzer components by their SPI names in schema definition
> ----------------------------------------------------------------------------
>
>                 Key: SOLR-13593
>                 URL: https://issues.apache.org/jira/browse/SOLR-13593
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>            Reporter: Tomoko Uchida
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Now each analysis factory has an explicitly documented SPI name, which is stored
> in the static "NAME" field (LUCENE-8778).
> Solr uses factories' simple class name in schema definition (like
> class="solr.WhitespaceTokenizerFactory"), but we should be able to also use
> more concise SPI names (like name="whitespace").
> e.g.:
> {code:xml}
> <fieldtype name="myfieldtype" class="solr.TextField">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
>     <filter class="solr.PorterStemFilterFactory" />
>   </analyzer>
> </fieldtype>
> {code}
> would be
> {code:xml}
> <fieldtype name="myfieldtype" class="solr.TextField">
>   <analyzer>
>     <tokenizer name="whitespace"/>
>     <filter name="keywordMarker" protected="protwords.txt" />
>     <filter name="porterStem" />
>   </analyzer>
> </fieldtype>
> {code}
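
For readers following the thread, the name-based lookup discussed in this issue can be pictured with a small Java sketch. This is not Solr's or Lucene's loader code: the class SpiNameLookupSketch, the REGISTRY map and the resolve method are hypothetical, the registry holds only the name/class pairs quoted in this message, and rejecting an element that carries both attributes or neither is a rule chosen for the sketch, not confirmed Solr behavior. The point it illustrates is that names are scoped by element kind, so "icuNormalizer2" maps to a different factory on a charFilter than on a filter.

```java
import java.util.Map;

public class SpiNameLookupSketch {
    // Hypothetical registry, one map per schema element kind. It contains only
    // the name/class pairs that appear in this thread.
    static final Map<String, Map<String, String>> REGISTRY = Map.of(
        "charFilter", Map.of("icuNormalizer2", "solr.ICUNormalizer2CharFilterFactory"),
        "tokenizer", Map.of("whitespace", "solr.WhitespaceTokenizerFactory"),
        "filter", Map.of(
            "icuNormalizer2", "solr.ICUNormalizer2FilterFactory",
            "keywordMarker", "solr.KeywordMarkerFilterFactory",
            "porterStem", "solr.PorterStemFilterFactory"));

    /** Returns the factory class for an element, given its "name" or "class" attribute. */
    static String resolve(String kind, String name, String className) {
        if ((name == null) == (className == null)) {
            // Sketch-only rule: exactly one of the two attributes must be present.
            throw new IllegalArgumentException(
                "specify exactly one of 'name' or 'class' for " + kind);
        }
        if (className != null) {
            return className; // class-based definitions pass through unchanged
        }
        Map<String, String> byName = REGISTRY.get(kind);
        String found = (byName == null) ? null : byName.get(name);
        if (found == null) {
            throw new IllegalArgumentException(
                "no " + kind + " registered under name '" + name + "'");
        }
        return found;
    }

    public static void main(String[] args) {
        // Same name, different element kind, different factory.
        System.out.println(resolve("charFilter", "icuNormalizer2", null));
        System.out.println(resolve("filter", "icuNormalizer2", null));
        // A class-based element is returned as written.
        System.out.println(resolve("tokenizer", null, "solr.WhitespaceTokenizerFactory"));
    }
}
```

In this sketch, the name-based and class-based field types quoted in the comment resolve to the same factory classes, which is the equivalence the comment describes.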