[jira] [Comment Edited] (SOLR-13255) LanguageIdentifierUpdateProcessor broken for documents sent with SolrJ/javabin
[ https://issues.apache.org/jira/browse/SOLR-13255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771824#comment-16771824 ]

Noble Paul edited comment on SOLR-13255 at 2/19/19 11:12 AM:
-------------------------------------------------------------

Yes, this is a blocker for 8.0. There is a regression that makes URPs fail.

was (Author: noble.paul):
Yes, this is a blocker for 8.0

> LanguageIdentifierUpdateProcessor broken for documents sent with SolrJ/javabin
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-13255
>                 URL: https://issues.apache.org/jira/browse/SOLR-13255
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: contrib - LangId
>    Affects Versions: 7.7
>            Reporter: Andreas Hubold
>            Assignee: Noble Paul
>            Priority: Blocker
>             Fix For: 8.0, 7.7.1
>
>         Attachments: SOLR-13255.patch, SOLR-13255.patch
>
>
> 7.7 changed the object type of string field values that are passed to
> UpdateRequestProcessor implementations from java.lang.String to
> ByteArrayUtf8CharSequence. SOLR-12992 was mentioned on solr-user as the cause.
> The LangDetectLanguageIdentifierUpdateProcessor still expects String values,
> does not work for CharSequences, and logs warnings instead. For example:
> {noformat}
> 2019-02-14 13:14:47.537 WARN (qtp802600647-19) [ x:studio] o.a.s.u.p.LangDetectLanguageIdentifierUpdateProcessor Field name_tokenized not a String value, not including in detection
> {noformat}
> I'm not sure, but there could be further places where the changed type for
> string values needs to be handled. (Our custom UpdateRequestProcessors have
> been broken as well since 7.7, and it would be great to have a proper upgrade
> note as part of the release notes.)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
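Editor's illustration of the type change described in the issue above. This is a minimal sketch in plain Java, not code from the attached patch. The class `FieldValueText`, the helper `asText`, and the `Utf8Bytes` type are hypothetical names; `Utf8Bytes` only stands in for a non-String CharSequence such as ByteArrayUtf8CharSequence and does not reproduce Solr's class. It shows why a check against String rejects such a value while a check against CharSequence accepts it.

```java
import java.nio.charset.StandardCharsets;

public class FieldValueText {

    /** Hypothetical stand-in for a CharSequence that is not a java.lang.String. */
    static final class Utf8Bytes implements CharSequence {
        private final String decoded;

        Utf8Bytes(byte[] utf8) {
            this.decoded = new String(utf8, StandardCharsets.UTF_8);
        }

        @Override public int length() { return decoded.length(); }
        @Override public char charAt(int index) { return decoded.charAt(index); }
        @Override public CharSequence subSequence(int start, int end) {
            return decoded.subSequence(start, end);
        }
        @Override public String toString() { return decoded; }
    }

    /**
     * Returns the text of a field value, or null if the value is not textual.
     * String implements CharSequence, so plain Strings are still accepted.
     */
    static String asText(Object value) {
        if (value instanceof CharSequence) {
            return value.toString();
        }
        return null;
    }

    public static void main(String[] args) {
        Object value = new Utf8Bytes("Hallo Welt".getBytes(StandardCharsets.UTF_8));
        System.out.println(value instanceof String); // prints false
        System.out.println(asText(value));           // prints Hallo Welt
        System.out.println(asText(42));              // prints null
    }
}
```

A custom processor that previously cast field values to String could, under these assumptions, route them through a helper like `asText` instead.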
[jira] [Comment Edited] (SOLR-13255) LanguageIdentifierUpdateProcessor broken for documents sent with SolrJ/javabin
[ https://issues.apache.org/jira/browse/SOLR-13255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771196#comment-16771196 ]

Jason Gerlowski edited comment on SOLR-13255 at 2/18/19 4:24 PM:
-----------------------------------------------------------------

bq. it would be great to have a proper upgrade note as part of the release notes

Hey [~ahubold], I'm working on "Upgrade Notes" for the next release of our ref-guide, and I wanted them to include this issue. I included a short paragraph over on SOLR-13256. Since you mentioned you were interested in seeing this get documented, I wanted to give you a heads up. Feel free to chime in over there about anything I got wrong or any suggestions you might have.

was (Author: gerlowskija):
bq. it would be great to have a proper upgrade note as part of the release notes

Hey [~ahubold], I'm working on "Upgrade Notes" for users for the next release of our ref-guide, and I wanted them to include this issue. I included a short paragraph over on SOLR-13256. Since you mentioned you were interested in seeing this get documented, I wanted to give you a heads up. Feel free to chime in over there about anything I got wrong or any suggestions you might have.

> LanguageIdentifierUpdateProcessor broken for documents sent with SolrJ/javabin
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-13255
>                 URL: https://issues.apache.org/jira/browse/SOLR-13255
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: contrib - LangId
>    Affects Versions: 7.7
>            Reporter: Andreas Hubold
>            Priority: Major
>             Fix For: 8.0, 7.7.1
>
>         Attachments: SOLR-13255.patch
[jira] [Comment Edited] (SOLR-13255) LanguageIdentifierUpdateProcessor broken for documents sent with SolrJ/javabin
[ https://issues.apache.org/jira/browse/SOLR-13255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769161#comment-16769161 ]

Jan Høydahl edited comment on SOLR-13255 at 2/15/19 10:12 AM:
--------------------------------------------------------------

Attached a raw, untested patch for langid for branch_7_7. Due to a refactor, the bug will be different in 8.x: it will probably just silently fail to detect any languages, since the list of String fields is determined through instanceof String. The patch for 8.x and master will thus need to fix SolrInputDocumentReader instead.

I think that for 8.0 we should add an UPGRADE NOTE about this breaking change...

was (Author: janhoy):
Attached a raw, untested patch for langid for branch_7_7. Due to a refactor, the patch will be different for master and 8x, where we'll need to fix SolrInputDocumentReader instead, which also does instanceof String.

I think that for 8.0 we should add an UPGRADE NOTE about this breaking change...

> LanguageIdentifierUpdateProcessor broken for documents sent with SolrJ/javabin
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-13255
>                 URL: https://issues.apache.org/jira/browse/SOLR-13255
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: contrib - LangId
>    Affects Versions: 7.7
>            Reporter: Andreas Hubold
>            Priority: Major
>             Fix For: 8.0, 7.7.1
>
>         Attachments: SOLR-13255.patch
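Editor's illustration of the silent failure mentioned in the comment above, where text fields are selected through instanceof String. This is a sketch under stated assumptions, not the code of SolrInputDocumentReader: the class `TextFieldSelection` and both methods are hypothetical, the document is modelled as a plain map, and a StringBuilder stands in for any CharSequence value that is not a String.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TextFieldSelection {

    /** Selects only java.lang.String values; other CharSequence values are skipped silently. */
    static List<String> selectStringsOnly(Map<String, Object> doc) {
        List<String> texts = new ArrayList<>();
        for (Object value : doc.values()) {
            if (value instanceof String) {
                texts.add((String) value);
            }
        }
        return texts;
    }

    /** Selects every CharSequence value, which also covers plain Strings. */
    static List<String> selectCharSequences(Map<String, Object> doc) {
        List<String> texts = new ArrayList<>();
        for (Object value : doc.values()) {
            if (value instanceof CharSequence) {
                texts.add(value.toString());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", 7);
        // StringBuilder is a CharSequence but not a String (assumed stand-in).
        doc.put("name_tokenized", new StringBuilder("Hallo Welt"));

        System.out.println(selectStringsOnly(doc));   // prints []
        System.out.println(selectCharSequences(doc)); // prints [Hallo Welt]
    }
}
```

With the String-only check, no warning is raised and no text reaches detection, which matches the "silently fail" behaviour the comment anticipates for 8.x.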