[
https://issues.apache.org/jira/browse/SOLR-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ryan McKinley updated SOLR-1690:
--------------------------------
Description:
Sometimes it is nice to group structured data into a single field.
This (rough) patch takes JSON input and indexes tokens based on the key-value
pairs in the JSON.
{code:xml|title=schema.xml}
<!-- JSON Field Type -->
<fieldtype name="json" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.JSONKeyValueTokenizerFactory" keepArray="true" hierarchicalKey="false"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>
{code}
Given the text:
{code}
{ "hello": "world", "rank":5 }
{code}
the field is indexed as two tokens:
|| term position | 1 | 2 |
|| term text | hello:world | rank:5 |
|| term type | word | word |
|| source start,end | 12,17 | 27,28 |
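
To make the token table concrete, here is a rough Python sketch of the behavior described above. It is not the patch's Java code: the function names json_key_value_tokens and normalize_query are made up for illustration, nested objects and arrays are simply skipped (the actual semantics of keepArray and hierarchicalKey are not spelled out in this issue), and source offsets are not tracked. The strip/lower calls stand in for TrimFilterFactory and LowerCaseFilterFactory.

```python
import json


def json_key_value_tokens(text):
    """Return one 'key:value' token per top-level scalar pair.

    Illustrative only: nested objects and arrays are skipped, which is an
    assumption of this sketch and not documented behavior of the patch.
    """
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object at the top level")
    tokens = []
    for key, value in obj.items():
        if isinstance(value, (dict, list)):
            continue  # nested values are out of scope for this sketch
        # Strings are used as-is; numbers, booleans and null keep their JSON spelling.
        value_text = value if isinstance(value, str) else json.dumps(value)
        # Emulate the trim and lowercase filters of the index analyzer.
        tokens.append(f"{key}:{value_text}".strip().lower())
    return tokens


def normalize_query(term):
    """Emulate the query analyzer: whole input as one token, trimmed and lowercased."""
    return term.strip().lower()


tokens = json_key_value_tokens('{ "hello": "world", "rank":5 }')
print(tokens)  # ['hello:world', 'rank:5']
print(normalize_query("  Rank:5 ") in tokens)  # True
```

Under these assumptions, a query term such as "Rank:5" normalizes to the same text as the indexed token, which is why the query analyzer uses a keyword tokenizer followed by the same two filters.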
was:
Sometimes it is nice to group structured data into a single field.
This (rough) patch takes JSON input and indexes tokens based on the key-value
pairs in the JSON.
For example, the text:
{code}
{ "hello": "world", "rank":5 }
{code}
gets indexed as two tokens:
|| term position | 1 | 2 |
|| term text | hello:world | rank:5 |
|| term type | word | word |
|| source start,end | 12,17 | 27,28 |
> JSONKeyValueTokenizerFactory -- JSON Tokenizer
> ----------------------------------------------
>
> Key: SOLR-1690
> URL: https://issues.apache.org/jira/browse/SOLR-1690
> Project: Solr
> Issue Type: New Feature
> Components: Schema and Analysis
> Reporter: Ryan McKinley
> Priority: Minor
> Attachments: noggit-1.0-A1.jar,
> SOLR-1690-JSONKeyValueTokenizerFactory.patch
>
>