[ 
https://issues.apache.org/jira/browse/SOLR-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400096#comment-16400096
 ] 

Jan Høydahl commented on SOLR-11774:
------------------------------------

This was broken by SOLR-3381, which was introduced in Solr 5.0. The problem is 
that the method {{detectLanguage(String text)}} was replaced with 
{{detectLanguage(SolrInputDocument doc)}}, and the one place where detection 
happened per individual field was changed from detecting on the text of that 
one field to detecting on the whole document 
([https://github.com/apache/lucene-solr/blob/03095ce4d20060a1c63570d8a5214e9858693080/solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java#L243]),
 which means that all fields get the same treatment.
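
To illustrate the difference, here is a minimal, self-contained sketch. It is 
not the Solr code: the class {{LangIdSketch}}, the toy stop-word detector, the 
field names and the sample values are all hypothetical and only model 
"detect per field" versus "detect once on the whole document".

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Solr.
class LangIdSketch {

    // Toy detector (assumption): counts a few German and English stop words.
    static String detectLanguage(String text) {
        String t = " " + text.toLowerCase(Locale.ROOT) + " ";
        int de = count(t, " der ") + count(t, " und ") + count(t, " die ");
        int en = count(t, " the ") + count(t, " and ") + count(t, " of ");
        return de > en ? "de" : "en";
    }

    // Counts occurrences of needle, allowing matches to share a space.
    static int count(String haystack, String needle) {
        int n = 0;
        for (int i = haystack.indexOf(needle); i >= 0;
             i = haystack.indexOf(needle, i + 1)) {
            n++;
        }
        return n;
    }

    // Intended behaviour: each field is detected on its own text.
    static Map<String, String> mapPerField(Map<String, String> doc) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            out.put(e.getKey() + "_" + detectLanguage(e.getValue()), e.getValue());
        }
        return out;
    }

    // Behaviour described above: one detection over all fields combined,
    // so every field is mapped with the same language suffix.
    static Map<String, String> mapWholeDocument(Map<String, String> doc) {
        String lang = detectLanguage(String.join(" ", doc.values()));
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : doc.entrySet()) {
            out.put(e.getKey() + "_" + lang, e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "Der Herr der Ringe und die zwei Welten");
        doc.put("summary", "The Lord of the Rings and the Two Worlds");
        System.out.println("per field:      " + mapPerField(doc).keySet());
        System.out.println("whole document: " + mapWholeDocument(doc).keySet());
    }
}
```

With this toy input, the per-field variant maps the two fields to different 
language suffixes, while the whole-document variant gives both fields the 
same suffix, which matches the symptom in the report.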

> langid.map.individual won't work with langid.map.keepOrig
> ---------------------------------------------------------
>
>                 Key: SOLR-11774
>                 URL: https://issues.apache.org/jira/browse/SOLR-11774
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: contrib - LangId
>    Affects Versions: 6.5
>            Reporter: Marco Remy
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Tried to get language detection to work.
> *Setting:*
> {code:xml}
> <processor 
> class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
>       <str name="langid.fl">title,author</str>
>       <str name="langid.langsField">detected_languages</str>
>       <str name="langid.whitelist">de,en</str>
>       <str name="langid.fallback">txt</str>
>       <bool name="langid.map">true</bool>
>       <bool name="langid.map.individual">true</bool>
>       <bool name="langid.map.keepOrig">true</bool>
>     </processor>
> {code}
> Main purpose:
> * Map fields individually
> * Keep the original field
> But the fields won't be mapped individually. They are all mapped to a single 
> detected language. After some hours of investigation I finally found the 
> reason: *The option langid.map.keepOrig breaks the individual mapping 
> function.* Only if it is disabled are the fields mapped as expected.
> - Regards



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

