[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983677#comment-16983677 ]
Ryan Rockenbaugh commented on SOLR-13963:
-----------------------------------------

[~dsmiley] I was getting ready to submit this weekend. I have a very straightforward test case; I am just trying to refine it so it is easy to document and test. Because it is a performance issue, I was thinking it must just create additional Java objects (2 extra for each value). The case we found it in had 5 fields with 30000 integers in each field (Strings also cause the same issue). It's been a while since I've worked with a Java profiler, so I need to brush up on that.

> JavaBinCodec has concurrent modification of CharArr resulting in corrupt
> intranode updates
> ------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13963
>                 URL: https://issues.apache.org/jira/browse/SOLR-13963
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 8.1
>            Reporter: Colvin Cowie
>            Assignee: Noble Paul
>            Priority: Blocker
>             Fix For: 8.3.1
>
>         Attachments: JavaBinCodec.java, SOLR-13963.patch, SOLR-13963.patch
>
>
> Discussed on the mailing list "Possible data corruption in JavaBinCodec in
> Solr 8.3 during distributed update?"
>
> In summary, after moving to 8.3 we had a consistent (but non-deterministic)
> set of failing tests where the data being sent in intranode requests was
> _sometimes_ corrupted. For example, if the well-formed data was
> _'fieldName':"this is a long string"_
> the error we saw from Solr might be
> unknown field _+'fieldNamis a long string"+_
>
> The change that indirectly caused this issue to materialize was from
> SOLR-13682, which meant that
> org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) would call
> org.apache.solr.common.SolrInputField.getValue() rather than
> org.apache.solr.common.SolrInputField.getRawValue() as it had before.
>
> getRawValue for a string calls
> org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr() which in this
> context calls
> org.apache.solr.common.util.JavaBinCodec.getStringProvider()
>
> JavaBinCodec has a CharArr, _arr_, which is modified in two different
> locations, only one of which is protected with a synchronized block.
>
> getStringProvider() synchronizes on _arr_:
>
> [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966]
>
> but _readStr() doesn't:
>
> [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930]
>
> The two methods are called concurrently, but weren't prior to SOLR-13682.
>
> Adding a synchronized block into _readStr() around the modification of _arr_
> fixes the problem as far as I can see.
>
> Also, the problem does not seem to occur when using the dynamic schema mode
> of autoCreateFields=true in the updateRequestProcessorChain.
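
For illustration only, the locking pattern described in the issue can be sketched roughly as below. This is not the real JavaBinCodec and not the attached patch: the class SharedBufferCodec, its methods, and the StringBuilder standing in for the CharArr field _arr_ are all hypothetical. The sketch assumes the fix amounts to guarding every mutation of the shared buffer with the same lock, as the description suggests.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a codec that reuses one character buffer.
// It only mirrors the shape of the problem; it is not Solr code.
class SharedBufferCodec {
    // Plays the role of the shared CharArr "arr" from the issue description.
    private final StringBuilder arr = new StringBuilder();

    // First path: guarded by a lock on the shared buffer.
    String provideString(byte[] utf8) {
        synchronized (arr) {
            return decode(utf8);
        }
    }

    // Second path: takes the same lock, which is the fix the issue proposes.
    // Without this block, two threads could interleave writes into arr.
    String readStr(byte[] utf8) {
        synchronized (arr) {
            return decode(utf8);
        }
    }

    // Resets and refills the shared buffer; callers must hold the lock on arr.
    private String decode(byte[] utf8) {
        arr.setLength(0);
        arr.append(new String(utf8, StandardCharsets.UTF_8));
        return arr.toString();
    }

    // Exercises both paths from several threads and counts mismatched results.
    static int countCorruptions(int threads, int iterations) throws InterruptedException {
        SharedBufferCodec codec = new SharedBufferCodec();
        String a = "'fieldName':\"this is a long string\"";
        String b = "'other':\"short\"";
        byte[] aBytes = a.getBytes(StandardCharsets.UTF_8);
        byte[] bBytes = b.getBytes(StandardCharsets.UTF_8);
        AtomicInteger corrupt = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final boolean useReadStr = (t % 2 == 0);
            pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    String got = useReadStr ? codec.readStr(aBytes) : codec.provideString(bBytes);
                    String want = useReadStr ? a : b;
                    if (!want.equals(got)) {
                        corrupt.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return corrupt.get();
    }
}
```

Calling SharedBufferCodec.countCorruptions(4, 2000) runs both paths concurrently against one instance. Removing the synchronized block from readStr in this sketch reintroduces the kind of unguarded write the description points at, although whether a given run actually shows corruption depends on thread timing.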