[ 
https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980860#comment-16980860
 ] 

Colvin Cowie commented on SOLR-13963:
-------------------------------------

I've attached a patch that fixes it, and I've included a new test that 
reproduces the problem without the fix...

I don't know enough about how Solr's tests are structured to know the best 
way to write a test for this, so I've just done something that works.

If there is a better way to do it, or a different coding style is preferred, 
I'm open to it being done differently.

> JavaBinCodec has concurrent modification of CharArr resulting in corrupt 
> intranode updates
> -------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13963
>                 URL: https://issues.apache.org/jira/browse/SOLR-13963
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 8.3
>            Reporter: Colvin Cowie
>            Priority: Major
>         Attachments: SOLR-13963.patch
>
>
> Discussed on the mailing list "Possible data corruption in JavaBinCodec in 
> Solr 8.3 during distributed update?"
>  
> In summary, after moving to 8.3 we had a consistent (but non-deterministic) 
> set of failing tests where the data being sent in intranode requests was 
> _sometimes_ corrupted. For example, if the well-formed data was
>  _'fieldName':"this is a long string"_
>  the error we saw from Solr might be
>  unknown field _+'fieldNamis a long string"+_ 
>   
>  The change that indirectly caused this issue to materialize was from 
> SOLR-13682, which meant that 
> org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) would call 
> org.apache.solr.common.SolrInputField.getValue() rather than 
> org.apache.solr.common.SolrInputField.getRawValue() as it had before.
>   
>  getRawValue for a string calls 
> org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr() which in this 
> context calls
>  org.apache.solr.common.util.JavaBinCodec.getStringProvider()
>  
>  JavaBinCodec has a CharArr, _arr_, which is modified in two different 
> locations, only one of which is protected with a synchronized block.
>   
>  getStringProvider() synchronizes on _arr_:
>  
> [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966]
>   
>  but _readStr() doesn't:
>  
> [https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930]
>   
>  The two methods are now called concurrently, but weren't prior to SOLR-13682.
>   
>  Adding a synchronized block into _readStr() around the modification of _arr_ 
> fixes the problem as far as I can see.
>  
> Also, the problem does not seem to occur when using the dynamic schema mode 
> of autoCreateFields=true in the updateRequestProcessorChain.
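
To illustrate the shape of the race and of the fix described above, here is a 
minimal, self-contained sketch. It is not Solr's code: SharedBufferCodec, 
getString() and decode() are hypothetical names, and a StringBuilder stands in 
for the CharArr field _arr_. The only point it makes is that every method 
touching the shared buffer has to take the same lock.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a codec that reuses one scratch buffer (not JavaBinCodec itself). */
final class SharedBufferCodec {
    // Shared scratch buffer, playing the role of the CharArr field "arr".
    private final StringBuilder arr = new StringBuilder();

    /** Stands in for the path that already synchronized on the buffer. */
    public String getString(byte[] utf8) {
        synchronized (arr) {
            return decode(utf8);
        }
    }

    /** Stands in for _readStr(): the fix is to take the same lock here too. */
    public String readStr(byte[] utf8) {
        // Without this block, a concurrent getString() could reset the
        // buffer while this call is still filling or reading it.
        synchronized (arr) {
            return decode(utf8);
        }
    }

    // Must only be called while holding the lock on arr.
    private String decode(byte[] utf8) {
        arr.setLength(0);
        for (char c : new String(utf8, StandardCharsets.UTF_8).toCharArray()) {
            arr.append(c);
        }
        return arr.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        SharedBufferCodec codec = new SharedBufferCodec();
        String field = "'fieldName':\"this is a long string\"";
        String other = "some other value";
        byte[] fieldBytes = field.getBytes(StandardCharsets.UTF_8);
        byte[] otherBytes = other.getBytes(StandardCharsets.UTF_8);
        AtomicInteger corrupted = new AtomicInteger();

        // One thread per method, both hammering the same codec instance.
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                if (!field.equals(codec.readStr(fieldBytes))) {
                    corrupted.incrementAndGet();
                }
            }
        });
        Thread provider = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                if (!other.equals(codec.getString(otherBytes))) {
                    corrupted.incrementAndGet();
                }
            }
        });
        reader.start();
        provider.start();
        reader.join();
        provider.join();
        System.out.println("corrupted results: " + corrupted.get());
    }
}
```

Removing the synchronized block from readStr() in this sketch reintroduces the 
kind of interleaving described in the issue, although, as with the original 
failures, whether a given run actually shows corruption is not deterministic.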


