[
https://issues.apache.org/jira/browse/SOLR-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ryan McKinley updated SOLR-314:
-------------------------------
Attachment: SOLR-314-StoreAnalysis.patch
This adds the StoreAnalysisProcessor to the default chain. It is skipped
unless the request includes the parameter "store.analysis=true".
It chooses the field type based on a field param:
f.fieldname.analyze=FieldTypeName
I'm not totally happy with the field names. Suggestions?
- - - - -
The one big issue I'm not sure how to deal with is stitching a multi-valued
request into a single TokenStream.
Consider the input
<add> <doc>
<field name="feature">aaa bbb ccc</field>
<field name="feature">bbb ccc ddd</field>
</doc></add>
As is, if the FieldType has a 'RemoveDuplicates' filter, it won't remove the
duplicates between the fields, because each input field gets its own Reader.
Any ideas for a way around this?
Can I extract the Tokenizer explicitly?
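To make the problem concrete, here is a minimal sketch in plain Java that uses no Lucene or Solr classes. Whitespace splitting stands in for the Tokenizer and a LinkedHashSet stands in for the duplicate-removing filter; both are assumptions for illustration and are simpler than a real TokenStream filter. The class and method names (StitchSketch, analyzePerValue, analyzeStitched) are hypothetical and not part of the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StitchSketch {

    // Stand-in for a Tokenizer: split one field value on whitespace.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Current behaviour: every value is analyzed on its own, so the
    // duplicate filter only ever sees the tokens of a single value.
    static List<String> analyzePerValue(List<String> values) {
        List<String> out = new ArrayList<>();
        for (String value : values) {
            Set<String> seen = new LinkedHashSet<>(tokenize(value));
            out.addAll(seen);
        }
        return out;
    }

    // Desired behaviour: one duplicate filter shared by all values,
    // as if they had been stitched into a single stream.
    static List<String> analyzeStitched(List<String> values) {
        Set<String> seen = new LinkedHashSet<>();
        for (String value : values) {
            seen.addAll(tokenize(value));
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> feature = Arrays.asList("aaa bbb ccc", "bbb ccc ddd");
        System.out.println(analyzePerValue(feature));  // [aaa, bbb, ccc, bbb, ccc, ddd]
        System.out.println(analyzeStitched(feature));  // [aaa, bbb, ccc, ddd]
    }
}
```

The only difference between the two methods is the scope of the "seen" set, which is what a stitched TokenStream would have to provide for the real filter.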
> Store Analyzed token text from an incoming SolrInputDocument
> ------------------------------------------------------------
>
> Key: SOLR-314
> URL: https://issues.apache.org/jira/browse/SOLR-314
> Project: Solr
> Issue Type: New Feature
> Components: update
> Reporter: Ryan McKinley
> Attachments: SOLR-314-StoreAnalysis.patch
>
>
> This is an UpdateRequestProcessor that runs incoming fields through a field
> Analyzer and stores each output token as a field value.
> For example, if you have a field type defined:
> <fieldType name="text_ws" class="solr.TextField" >
> <analyzer>
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> </analyzer>
> </fieldType>
> And send a request:
> /update?store.analysis=true&f.feature.analysis=text_ws
> <add> <doc>
> <field name="feature">aaa bbb ccc</field>
> </doc></add>
> The returned document will look like:
> <doc>
> <arr name="feature">
> <str>aaa</str>
> <str>bbb</str>
> <str>ccc</str>
> </arr>
> </doc>