[ 
https://issues.apache.org/jira/browse/SOLR-9405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418238#comment-15418238
 ] 

Edward Ribeiro commented on SOLR-9405:
--------------------------------------

Hi guys, I *guess* this patch doesn't solve the issue at hand. 

TL;DR: the solution is to declare stateWatchers as {{stateWatchers = 
ConcurrentHashMap.newKeySet();}}  at L#150. That is, no need to modify 
{{getStateWatchers}}.

Please, let me explain.

First, the error is happening here:
{code}
  /* package-private for testing */
 1. Set<CollectionStateWatcher> getStateWatchers(String collection) {
 2.   CollectionWatch watch = collectionWatches.get(collection);
 3.   if (watch == null)
 4.     return null;
 5.   return new HashSet<>(watch.stateWatchers);
 6. }
{code}

That is, {{ZkStateReader.getStateWatchers}} creates a new {{HashSet}} 
instance by copying an existing collection: {{CollectionWatch#stateWatchers}}. As 
we can see in ZkStateReader, {{watch#stateWatchers}} is also a {{HashSet}}.


If we look into the {{HashSet}}/{{AbstractCollection}} source code, we see 
that the constructor called at line 5 (above) basically calls the {{addAll}} 
method, passing along the collection given to the constructor. {{addAll}} then 
iterates over that collection, adding its elements one by one to the 
new set. See the stack trace provided in this issue:

{code}
at java.util.HashMap$KeyIterator.next(java.base@9-ea/HashMap.java:1513) // RUNNING THE FOR-EACH LOOP
at java.util.AbstractCollection.addAll(java.base@9-ea/AbstractCollection.java:351) // DELEGATES TO ADDALL
at java.util.HashSet.<init>(java.base@9-ea/HashSet.java:119) // THE CONSTRUCTOR
{code}

In a nutshell, while we are populating the new {{HashSet}} instance at line 5 
of the snippet above, another thread modifies {{stateWatchers}} concurrently. 
The source set's iterator detects the change and throws the 
{{ConcurrentModificationException}}.
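
To make the mechanism concrete, here is a minimal, single-threaded sketch of the same fail-fast behaviour. It is not the ZkStateReader code: the class and method names ({{FailFastDemo}}, {{iterationFailsAfterModification}}) are made up for illustration, and the "other thread" is simulated by modifying the set by hand in the middle of an iteration, which makes the outcome deterministic.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical demo class, not part of Solr.
class FailFastDemo {

    /** Returns true if iterating a HashSet fails after the set is modified mid-iteration. */
    static boolean iterationFailsAfterModification() {
        Set<String> source = new HashSet<>();
        source.add("a");
        source.add("b");

        // This is what addAll does internally when copying: iterate the source set.
        Iterator<String> it = source.iterator();
        it.next();

        // Simulates another thread registering a watcher while the copy is in progress.
        source.add("c");

        try {
            it.next(); // the iterator notices the structural modification
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iterationFailsAfterModification());
    }
}
```

In the real failure the modification comes from a different thread, so it only shows up intermittently, which fits the fact that Jenkins caught it rather than every test run.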

*The proposed patch doesn't solve the issue because even if 
{{collectionWatches.compute}} is atomic, none of the {{Sets}} in use in 
ZkStateReader is thread-safe.*
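
As a rough sketch of the alternative from the TL;DR, the snippet below backs the set with {{ConcurrentHashMap.newKeySet()}} and copies it repeatedly while a second thread keeps adding and removing elements. Everything except the {{newKeySet()}} call is assumed for illustration: the class name {{ConcurrentKeySetDemo}}, the {{copyWhileMutating}} method, and the use of {{Integer}} elements in place of {{CollectionStateWatcher}}.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of Solr.
class ConcurrentKeySetDemo {

    /** Copies the set 'copies' times while a writer thread mutates it; returns the copies completed. */
    static int copyWhileMutating(int copies) {
        // Stand-in for CollectionWatch#stateWatchers, declared as suggested in the TL;DR.
        Set<Integer> stateWatchers = ConcurrentHashMap.newKeySet();

        Thread writer = new Thread(() -> {
            int i = 0;
            while (!Thread.currentThread().isInterrupted()) {
                stateWatchers.add(i);
                stateWatchers.remove((i + 500) % 1000);
                i = (i + 1) % 1000;
            }
        });
        writer.start();

        int completed = 0;
        try {
            for (int n = 0; n < copies; n++) {
                // Same copy as line 5 of getStateWatchers; the iterator here is weakly consistent.
                Set<Integer> snapshot = new HashSet<>(stateWatchers);
                if (snapshot.size() <= 1000) {
                    completed++;
                }
            }
        } finally {
            writer.interrupt();
            try {
                writer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(copyWhileMutating(10_000));
    }
}
```

The copy may or may not include elements added while it was running, but the iterators of {{ConcurrentHashMap}} views are documented not to throw {{ConcurrentModificationException}}, so {{getStateWatchers}} itself would not need to change.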

I wrote a test program to demonstrate my speculation: 
https://gist.github.com/eribeiro/4141df2d02c62d7370101bc4349cd8c4

Finally, sorry if I misunderstood the problem, and let me know if what I wrote 
made any sense. :) 

Cheers!

> ConcurrentModificationException in ZkStateReader.getStateWatchers
> -----------------------------------------------------------------
>
>                 Key: SOLR-9405
>                 URL: https://issues.apache.org/jira/browse/SOLR-9405
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 6.1
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Alan Woodward
>             Fix For: 6.2, master (7.0)
>
>         Attachments: SOLR-9405.patch
>
>
> Jenkins found this: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1432/
> {code}
> Stack Trace:
> java.util.ConcurrentModificationException
>         at __randomizedtesting.SeedInfo.seed([FA459DF725097EFF:A77E52876204E1C1]:0)
>         at java.util.HashMap$HashIterator.nextNode(java.base@9-ea/HashMap.java:1489)
>         at java.util.HashMap$KeyIterator.next(java.base@9-ea/HashMap.java:1513)
>         at java.util.AbstractCollection.addAll(java.base@9-ea/AbstractCollection.java:351)
>         at java.util.HashSet.<init>(java.base@9-ea/HashSet.java:119)
>         at org.apache.solr.common.cloud.ZkStateReader.getStateWatchers(ZkStateReader.java:1279)
>         at org.apache.solr.common.cloud.TestCollectionStateWatchers.testSimpleCollectionWatch(TestCollectionStateWatchers.java:116)
> {code}


