Colvin Cowie created SOLR-13963:
-----------------------------------

             Summary: JavaBinCodec has concurrent modification of CharArr 
resulting in corrupt intranode updates
                 Key: SOLR-13963
                 URL: https://issues.apache.org/jira/browse/SOLR-13963
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 8.3
            Reporter: Colvin Cowie


Discussed on the mailing list "Possible data corruption in JavaBinCodec in Solr 
8.3 during distributed update?"

 

In summary, after moving to 8.3 we had a consistent (but non-deterministic) set 
of failing tests where the data being sent in intranode requests was 
_sometimes_ corrupted. For example, if the well-formed data was
_'fieldName':"this is a long string"_
then the error we saw from Solr might be
unknown field _+'fieldNamis a long string"+_
 
The change that indirectly caused this issue to materialize was from 
SOLR-13682, which meant that 
org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) would call 
org.apache.solr.common.SolrInputField.getValue() rather than 
org.apache.solr.common.SolrInputField.getRawValue() as it had before.
 
getRawValue for a string calls 
org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr() which in this 
context calls
org.apache.solr.common.util.JavaBinCodec.getStringProvider()

 
JavaBinCodec has a CharArr, _arr_, which is modified in two different 
locations, only one of which is protected by a synchronized block.
 
getStringProvider() synchronizes on _arr_:
[https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966]
 
but _readStr() doesn't:
[https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930]
 
The two methods are called concurrently, but weren't prior to SOLR-13682.
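
To illustrate the hazard, here is a minimal sketch of two callers sharing one scratch buffer. It is not the Solr code: the class SharedScratchDemo and its methods fill() and materialize() are hypothetical, and a StringBuilder stands in for CharArr. The interleaving is written out by hand in a single thread so the outcome is repeatable. In the real bug the interleaving depends on thread timing and can leave a partial mix of both strings.

```java
// Hypothetical stand-in for a codec with one shared scratch buffer.
// StringBuilder plays the role of CharArr; none of this is Solr code.
class SharedScratchDemo {
    private final StringBuilder arr = new StringBuilder();

    // First half of a decode: clear the shared buffer and fill it.
    void fill(String decoded) {
        arr.setLength(0);
        arr.append(decoded);
    }

    // Second half of a decode: turn the buffer contents into a String.
    String materialize() {
        return arr.toString();
    }

    public static void main(String[] args) {
        SharedScratchDemo codec = new SharedScratchDemo();

        // Caller A starts decoding a field name...
        codec.fill("fieldName");
        // ...caller B's decode runs in between, reusing the same buffer...
        codec.fill("this is a long string");
        // ...and caller A finishes, but now reads B's data.
        String seenByA = codec.materialize();

        System.out.println("A expected 'fieldName' but got: '" + seenByA + "'");
    }
}
```

Nothing in fill() or materialize() stops a second caller from resetting the buffer between the two halves of the first caller's decode, which is the kind of overlap an unsynchronized modification of _arr_ allows.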
 
Adding a synchronized block into _readStr() around the modification of _arr_ 
fixes the problem as far as I can see.
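
A minimal sketch of that kind of fix, assuming a simplified codec rather than the actual JavaBinCodec source: both code paths take the same monitor on the shared buffer for the whole reset, fill, and read sequence. The class SynchronizedScratchDemo and the methods viaProvider(), viaReadStr(), and countMismatches() are hypothetical names, and a StringBuilder again stands in for CharArr.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the Solr implementation.
class SynchronizedScratchDemo {
    private final StringBuilder arr = new StringBuilder();

    // Analogue of the path that already held the lock.
    String viaProvider(String decoded) {
        synchronized (arr) {
            arr.setLength(0);
            arr.append(decoded);
            return arr.toString();
        }
    }

    // Analogue of the path that lacked the lock; it now uses the same monitor.
    String viaReadStr(String decoded) {
        synchronized (arr) {
            arr.setLength(0);
            arr.append(decoded);
            return arr.toString();
        }
    }

    // Runs both paths from two threads and counts results that differ
    // from what the caller passed in.
    static int countMismatches(int iterations) {
        SynchronizedScratchDemo codec = new SynchronizedScratchDemo();
        AtomicInteger mismatches = new AtomicInteger();

        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (!"fieldName".equals(codec.viaProvider("fieldName"))) {
                    mismatches.incrementAndGet();
                }
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                String value = "this is a long string";
                if (!value.equals(codec.viaReadStr(value))) {
                    mismatches.incrementAndGet();
                }
            }
        });

        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(100_000));
    }
}
```

Because each method holds the lock from the reset through to toString(), neither thread can observe the other's partial write. Removing the synchronized block from either method in this sketch reintroduces the race.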



