[
https://issues.apache.org/jira/browse/SOLR-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Houston Putman resolved SOLR-16257.
-----------------------------------
Fix Version/s: 9.1
main (10.0)
Assignee: Houston Putman
Resolution: Fixed
> Possible Race condition in ZkStateReader between fields
> watchedCollectionStates and collectionWatches
> -----------------------------------------------------------------------------------------------------
>
> Key: SOLR-16257
> URL: https://issues.apache.org/jira/browse/SOLR-16257
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 8.8, main (10.0)
> Reporter: Patson Luk
> Assignee: Houston Putman
> Priority: Major
> Fix For: 9.1, main (10.0)
>
> Time Spent: 13h 20m
> Remaining Estimate: 0h
>
> We have observed certain weird behavior with our Solr that the cluster state
> for certain collections seem to be "stuck" in certain stale states. While
> inspecting the current code logic, it's found that certain race condition can
> arise and lead to inconsistent states of `collectionWatches` and
> `watchedCollectionStates`
>
> By looking at the current design, it appears that those 2 fields:
> * {{collectionWatches -}} for "watching" ZK updates on collections (via
> notification). Such map has the collection name (ie org) as the key and value
> is {{CollectionWatch}} which holds another set of
> {{{}DocCollectionWatcher{}}}s in this case (ie {{{}ConcurrentHashMap<String,
> CollectionWatch<DocCollectionWatcher>> collectionWatches{}}})
> * {{watchedCollectionStates}} - as {{{}ConcurrentHashMap<String,
> DocCollection>{}}}, which key is collection name and value is the
> {{DocCollection}} value triggered by previous watch event handled by
> {{ZkStateReader$StateWatcher}} (either by the fetch on first watcher
> registration or by notification from ZK on state changes)
> On the design level, such 2 fields should be "in-sync" , ie if a collection
> is no longer tracked in {{collectionWatches}} then it should not have any
> entry in {{watchedCollectionStates}} neither.
>
> However such guarantee is not a strong one as we can see there are various
> code pieces that try to remove entries from {{watchedCollectionStates}} if
> somehow the collection is no longer in {{collectionWatches}} for example in
> [here|https://github.com/fullstorydev/lucene-solr/blob/b8a2e1574d71094fcf81e6d693634b9ced8f4956/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L2137]
> , in particular this appears to address certain race condition with
> {{{}unregisterCore{}}}. The code that removes registered DocCollectionWatcher
> also tries to ensure both maps are in sync as in
> [here|https://github.com/apache/solr/blob/56842452026fb5f61fbe439a2f86adc08297c69c/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L1916]
> *The core of the issue appears to be that there's an assumption when the last
> {{DocCollectionWatcher}} is removed from the {{{}CollectionWatch{}}}, both
> the {{watchedCollectionStates}} and {{collectionWatches}} should be purged of
> the watched collection.* Hence the {{clusterState}} should have the
> {{LazyCollectionRef}} instead, which {{DocCollection get(boolean
> allowCached)}} invocation should return the correct {{DocCollection}} state
> on {{{}allowCached=false{}}}. Unfortunately, there could still be possible
> race condition as far as there are 2 separate maps. One possible race
> condition is demonstrated in later section
> h3. PR proposal
> An idea is to Merge {{DocCollectionWater}} into {{CollectionWatch}} so we
> would not run into race condition that the collection key appears in one but
> not the other. Please see the PR here https://github.com/apache/solr/pull/909
> and would love to gather some feedbacks and thoughts!
>
> h3. Steps to reproduce a race condition
> Spin up 2 nodes, only one node should serve the test collection, the other
> node should be made the overseer/leader of the cluster. Debug on the overseer
> node
> # From the IDE, ensure all breakpoints suspend "thread" only, not "all"
> (that's for intelliJ)
> # All breakpoints are in {{ZkStateReader}}. Set breakpoint at line
> [here|https://github.com/apache/solr/blob/56842452026fb5f61fbe439a2f86adc08297c69c/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L1965],
> put condition {{Thread.currentThread().getName().startsWith("zkCallback")}}
> . This pauses when zk notification comes in and right before it checks
> whether the collection is already in {{watchedCollectionStates}} in method
> {{updateWatchedCollection}}
> # Set breakpoint at
> [here|https://github.com/apache/solr/blob/56842452026fb5f61fbe439a2f86adc08297c69c/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L1829],
> which throws {{TimeoutException}} after the latch timeout, for example from
> call of (inside {{{}ZkStateWatcher#waitForState{}}})
> # now start debugging the overseer node
> # Issue a split shard request to the overseer. For example
> http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=test&shard=shard1
> # Eventually a thread will stop the first breakpoint within
> {{updateWatchedCollection}}, the thread name should be something like
> {{zkCallback-127-thread-2}}
> # Might have to wait for 320 secs (timeout on
> {{CollectionHandlingUtils.waitForNewShard}}), another thread should stop at
> 2nd breakpoint throwing {{TimeoutException}}, the thread name should be
> something similar to {{OverseerThreadFactory...}}
> # Add breakpoint at
> {{removeDocCollectionWatcher}}[here|https://github.com/apache/solr/blob/56842452026fb5f61fbe439a2f86adc08297c69c/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L1925],
> that removes the entry from {{watchedCollectionStates}} but not yet purge
> the {{collectionWatches}}
> # Resume this {{OverseerThreadFactory}} thread that's going to throw
> {{TimeoutException}}, it should stop the new breakpoint in
> {{removeDocCollectionWatcher}}.
> # Switch back to thread {{zkCallback...}}, resume that thread, it should
> find that the {{watchedCollectionStates}} is empty, hence trying to insert a
> {{CollectionRef}} into it
> # Switch back to {{OverseerThreadFactory}} thread, step a few steps, it will
> purge the {{collectionWatches}} but the {{watchedCollectionStates}} will
> still have one entry in it.
> # If we inspect the {{zkStateReader.clusterState}} at this point, we will
> notice that the collection will have a non lazy CollectionRef for the test
> collection, while the {{collectionWatches}} is empty but
> {{watchedCollectionStates}} would still have the collection state in there
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]