[
https://issues.apache.org/jira/browse/SOLR-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254288#comment-15254288
]
Shalin Shekhar Mangar commented on SOLR-9030:
---------------------------------------------
It exists to ensure that we do not update/overwrite a cluster state if we have
no idea of its previous znode version. Also, the default value of the znode
version in a DocCollection is -1, which ZK treats as "match any version". If
left unchecked, ZK will overwrite the value in the state without the CAS
checks that we rely on.
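
A minimal sketch of the version check described above, using an in-memory stand-in rather than real ZooKeeper or Solr code. The class `VersionedNode`, its methods, and the use of `IllegalStateException` for the guard are all hypothetical names chosen for illustration; the only behaviour carried over from ZK is that a conditional write with version -1 matches any version.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory stand-in for a znode; not Solr or ZooKeeper code.
final class VersionedNode {
    // A version of -1 means "match any version", i.e. an unconditional write.
    static final int ANY_VERSION = -1;

    // Raised when the caller's expected version does not match the stored one.
    static final class BadVersion extends Exception {
        BadVersion(int expected, int actual) {
            super("expected version " + expected + " but was " + actual);
        }
    }

    private byte[] data = new byte[0];
    private int version = 0;

    // Conditional write: succeeds only if expectedVersion matches the current
    // version, except that ANY_VERSION always matches.
    synchronized int setData(byte[] newData, int expectedVersion) throws BadVersion {
        if (expectedVersion != ANY_VERSION && expectedVersion != version) {
            throw new BadVersion(expectedVersion, version);
        }
        data = newData.clone();
        return ++version;
    }

    // Guarded write: refuses to write at all when the caller never learned the
    // version, so an unset -1 cannot silently bypass the compare-and-set.
    synchronized int guardedSetData(byte[] newData, int knownVersion) throws BadVersion {
        if (knownVersion == ANY_VERSION) {
            throw new IllegalStateException("refusing to write without a known version");
        }
        return setData(newData, knownVersion);
    }

    synchronized int version() {
        return version;
    }

    synchronized String dataAsString() {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

In this sketch, `setData(bytes, -1)` always succeeds regardless of what another writer did in the meantime, while `guardedSetData(bytes, -1)` fails fast, which is the role the check plays here.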
bq. And shouldn't we expect that that can happen and deal with it
appropriately? (A retry or something?)
Yes, and it does recover automatically. A BadVersionException causes the
complete cluster state to be re-fetched from ZK, and the operation is then
retried. In production environments, the BadVersionException itself is not a
problem, but the overwriting of state can be.
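
The re-fetch-and-retry recovery can be sketched roughly as follows. This is an illustration under stated assumptions, not the Overseer's actual code: `RefetchAndRetry`, `Store`, `Snapshot`, `StaleVersionException` and the `maxAttempts` parameter are hypothetical names, and the store is a plain in-memory object standing in for ZK.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch; names are illustrative and not taken from Solr.
final class RefetchAndRetry {
    // Stand-in for a version conflict such as BadVersionException.
    static final class StaleVersionException extends Exception {
    }

    // Immutable view of the state together with the version it was read at.
    static final class Snapshot {
        final String state;
        final int version;

        Snapshot(String state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    // In-memory stand-in for the versioned state kept in ZK.
    static final class Store {
        private String state = "";
        private int version = 0;

        synchronized Snapshot fetch() {
            return new Snapshot(state, version);
        }

        synchronized void compareAndSet(String newState, int expectedVersion)
                throws StaleVersionException {
            if (expectedVersion != version) {
                throw new StaleVersionException();
            }
            state = newState;
            version++;
        }
    }

    // Applies the change against the cached snapshot. On a version conflict,
    // the stale view is discarded, the state is re-fetched, and the same
    // change is retried, up to maxAttempts attempts in total.
    static Snapshot update(Store store, Snapshot cached, UnaryOperator<String> change,
                           int maxAttempts) throws StaleVersionException {
        Snapshot current = cached;
        for (int attempt = 1; ; attempt++) {
            String newState = change.apply(current.state);
            try {
                store.compareAndSet(newState, current.version);
                return new Snapshot(newState, current.version + 1);
            } catch (StaleVersionException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                current = store.fetch();
            }
        }
    }
}
```

The point of the sketch is that a conflict costs only an extra read and a retry, and the concurrent writer's change is preserved, whereas an unconditional write would have discarded it.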
> The 'downnode' command can trip asserts in ZkStateWriter or cause
> BadVersionException in Overseer
> -------------------------------------------------------------------------------------------------
>
> Key: SOLR-9030
> URL: https://issues.apache.org/jira/browse/SOLR-9030
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Reporter: Shalin Shekhar Mangar
> Fix For: master, 6.1
>
>
> While working on SOLR-9014 I came across a strange test failure.
> {code}
> [junit4] ERROR 16.9s | AsyncCallRequestStatusResponseTest.testAsyncCallStatusResponse <<<
> [junit4] > Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=46, name=OverseerStateUpdate-95769832112259076-127.0.0.1:51135_z_oeg%2Ft-n_0000000000, state=RUNNABLE, group=Overseer state updater.]
> [junit4] > at __randomizedtesting.SeedInfo.seed([91F68DA7E10807C3:CBF7E84BCF328A1A]:0)
> [junit4] > Caused by: java.lang.AssertionError
> [junit4] > at __randomizedtesting.SeedInfo.seed([91F68DA7E10807C3]:0)
> [junit4] > at org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:231)
> [junit4] > at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:240)
> [junit4] > at java.lang.Thread.run(Thread.java:745)
> {code}
> The underlying problem can manifest either by tripping the above assert or
> as a BadVersionException. I found that this was introduced in SOLR-7281,
> where a new 'downnode' command was added.