Irina Gorbunova created SOLR-5422:
-------------------------------------
Summary: Support mask for dynamic fields in the language detection
processor
Key: SOLR-5422
URL: https://issues.apache.org/jira/browse/SOLR-5422
Project: Solr
Issue Type: Improvement
Reporter: Irina Gorbunova
h3. User Story
I need to stem multilingual documents for indexing.
I have several fields to stem, and I use the update request processor with
*langid.map.individual.fl* because I need to detect the language individually for
every field. I have trouble with multivalued fields. There is a field *tag*.
At first I made this field multivalued, because my documents can have several
tags.
But the processor did not detect the language separately for each *tag* value in the following case:
{code}
"document" : {
...
"tag" : ["spanish", "español"]
...
}
{code}
So I changed my schema and made the *tag* field dynamic:
{code}
"document" : {
...
"tag_1" : "spanish",
"tag_2" : "español"
...
}
{code}
But the language detection processor ignores fields like tag_*.
The number of tags per document is not limited, so I can't define
*langid.map.individual.fl* as tag_1, tag_2, ..., tag_37, because there can be
a tag_38 field in the document.
*So I think it would be a useful improvement if the language detection processor
supported definitions like*
{code}
<langid.fl>blah*, *blahblah</langid.fl>
<langid.map.fl>blah*, *blahblah</langid.map.fl>
<langid.map.individual.fl>blah*, *blahblah</langid.map.individual.fl>
{code}
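To illustrate what such a mask could mean in practice, here is a minimal sketch of expanding wildcard definitions against the field names actually present in a document. It is not code from Solr: the class {{FieldGlob}} and its methods {{toPattern}} and {{expand}} are hypothetical names, and I assume that only {{*}} acts as a wildcard and that definitions are separated by commas or whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr: resolves masks such as "tag_*"
// to the concrete field names found in a document.
class FieldGlob {

    // Turn a mask into a regex. Only '*' is a wildcard (assumption);
    // every other character is matched literally.
    static Pattern toPattern(String glob) {
        String[] parts = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString());
    }

    // Return the document's field names that match at least one mask,
    // in the order in which they appear in the document.
    static List<String> expand(String spec, Collection<String> fieldNames) {
        List<Pattern> patterns = new ArrayList<>();
        for (String token : spec.trim().split("[,\\s]+")) {
            if (!token.isEmpty()) {
                patterns.add(toPattern(token));
            }
        }
        List<String> matched = new ArrayList<>();
        for (String name : fieldNames) {
            for (Pattern p : patterns) {
                if (p.matcher(name).matches()) {
                    matched.add(name);
                    break; // avoid adding the same field twice
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id", "tag_1", "tag_2", "tag_38", "title");
        // Prints [tag_1, tag_2, tag_38]
        System.out.println(expand("tag_*", fields));
    }
}
```

With something along these lines, a definition like tag_* would pick up tag_38 without it being listed explicitly, and each matched field could then be detected and mapped individually.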
Alternatively, there could be a way to tell Solr: "I want you to detect the language of
my multivalued field separately for every value."