[
https://issues.apache.org/jira/browse/SOLR-9405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418238#comment-15418238
]
Edward Ribeiro commented on SOLR-9405:
--------------------------------------
Hi guys, I *guess* this patch doesn't solve the issue at hand.
TL;DR: the solution is to declare stateWatchers as {{stateWatchers =
ConcurrentHashMap.newKeySet();}} at L#150. That is, no need to modify
{{getStateWatchers}}.
Please, let me explain.
First, the error is happening here:
{code}
/* package-private for testing */
1. Set<CollectionStateWatcher> getStateWatchers(String collection) {
2. CollectionWatch watch = collectionWatches.get(collection);
3. if (watch == null)
4. return null;
5. return new HashSet<>(watch.stateWatchers);
6. }
{code}
That is, {{ZkStateReader.getStateWatchers}} creates a new {{HashSet}}
instance by passing it an existing collection: {{CollectionWatch#stateWatchers}}. As
we can see in ZkStateReader, {{watch#stateWatchers}} is also a {{HashSet}}.
If we look into the {{HashSet}}/{{AbstractCollection}} source code, we see
that the constructor used at line 5 (above) calls the {{addAll}}
method with the collection passed to the constructor. {{addAll}} then
iterates over that collection, adding its elements one by one to the
new set. This matches the stack trace provided in this issue:
{code}
at java.util.HashMap$KeyIterator.next(java.base@9-ea/HashMap.java:1513) // RUNNING THE FOR-EACH LOOP
at java.util.AbstractCollection.addAll(java.base@9-ea/AbstractCollection.java:351) // DELEGATES TO ADDALL
at java.util.HashSet.<init>(java.base@9-ea/HashSet.java:119) // THE CONSTRUCTOR
{code}
In a nutshell, while we are populating the new
{{HashSet}} instance at line 5 of the {{getStateWatchers}} snippet above, another
thread modifies {{stateWatchers}} concurrently. The iterator detects the change
and throws the {{ConcurrentModificationException}}.
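To make the failure mode concrete, here is a minimal single-threaded sketch, not the actual ZkStateReader code. It copies a plain {{HashSet}} element by element, as {{addAll}} does, and mutates the source once in the middle of the copy. The class name, the method name and the "late-watcher" element are hypothetical; the in-loop {{add}} only stands in for the second thread, so the exception is reproduced deterministically instead of depending on timing.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;

class CopyWhileMutatingDemo {

    // Copies source element by element, the way AbstractCollection.addAll
    // iterates its argument, and mutates source once mid-copy.
    // Returns true if the source iterator threw ConcurrentModificationException.
    static boolean copyThrowsCme(Set<String> source) {
        Set<String> copy = new HashSet<>();
        try {
            for (String watcher : source) {
                copy.add(watcher);
                // Hypothetical stand-in for another thread registering a watcher.
                source.add("late-watcher");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> stateWatchers = new HashSet<>();
        stateWatchers.add("w1");
        stateWatchers.add("w2");
        System.out.println("CME thrown: " + copyThrowsCme(stateWatchers));
    }
}
```

With at least two elements in the source set, the first structural change bumps the set's modification count, and the next call to the iterator's {{next()}} fails fast. In the real code the change comes from a different thread, so the failure is intermittent.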
*The proposed patch doesn't solve the issue because even if
{{collectionWatches.compute}} is atomic, none of the {{Sets}} in use in
ZkStateReader is thread-safe.*
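A rough sketch of the change I am suggesting, assuming the set simply holds the watchers (plain strings are used here instead of {{CollectionStateWatcher}}, and the class and method names are hypothetical). The live set is created with {{ConcurrentHashMap.newKeySet()}}, whose iterators are weakly consistent and do not throw {{ConcurrentModificationException}}, so the unchanged {{new HashSet<>(...)}} copy can run while another thread adds and removes entries.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentKeySetDemo {

    // Mirrors the shape of getStateWatchers: snapshot the live set into a new HashSet.
    static Set<String> snapshot(Set<String> stateWatchers) {
        return new HashSet<>(stateWatchers);
    }

    public static void main(String[] args) throws InterruptedException {
        // Thread-safe set backed by a ConcurrentHashMap.
        Set<String> stateWatchers = ConcurrentHashMap.newKeySet();

        // Writer thread keeps adding and removing watchers.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                stateWatchers.add("w" + (i % 1000));
                stateWatchers.remove("w" + ((i + 500) % 1000));
            }
        });
        writer.start();

        // Reader keeps copying the set while the writer is still running.
        int snapshots = 0;
        while (writer.isAlive()) {
            snapshot(stateWatchers);
            snapshots++;
        }
        writer.join();
        System.out.println("snapshots taken without exception: " + snapshots);
    }
}
```

Each snapshot may or may not include entries added while the copy was in progress, which should be acceptable for a method that only returns a point-in-time view.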
I wrote a test program to demonstrate my speculation:
https://gist.github.com/eribeiro/4141df2d02c62d7370101bc4349cd8c4
Finally, sorry if I misunderstood the problem, and let me know if what I wrote
made any sense. :)
Cheers!
> ConcurrentModificationException in ZkStateReader.getStateWatchers
> -----------------------------------------------------------------
>
> Key: SOLR-9405
> URL: https://issues.apache.org/jira/browse/SOLR-9405
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 6.1
> Reporter: Shalin Shekhar Mangar
> Assignee: Alan Woodward
> Fix For: 6.2, master (7.0)
>
> Attachments: SOLR-9405.patch
>
>
> Jenkins found this: http://jenkins.thetaphi.de/job/Lucene-Solr-6.x-Linux/1432/
> {code}
> Stack Trace:
> java.util.ConcurrentModificationException
> at __randomizedtesting.SeedInfo.seed([FA459DF725097EFF:A77E52876204E1C1]:0)
> at java.util.HashMap$HashIterator.nextNode(java.base@9-ea/HashMap.java:1489)
> at java.util.HashMap$KeyIterator.next(java.base@9-ea/HashMap.java:1513)
> at java.util.AbstractCollection.addAll(java.base@9-ea/AbstractCollection.java:351)
> at java.util.HashSet.<init>(java.base@9-ea/HashSet.java:119)
> at org.apache.solr.common.cloud.ZkStateReader.getStateWatchers(ZkStateReader.java:1279)
> at org.apache.solr.common.cloud.TestCollectionStateWatchers.testSimpleCollectionWatch(TestCollectionStateWatchers.java:116)
> {code}