[
https://issues.apache.org/jira/browse/CASSANDRA-16545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315860#comment-17315860
]
Yifan Cai edited comment on CASSANDRA-16545 at 4/6/21, 9:56 PM:
----------------------------------------------------------------
PR: https://github.com/apache/cassandra/pull/954
CI:
https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=CASSANDRA-16545%2Ftrunk
The patch is largely a refactor that passes the same {{ReplicationStrategy}} object
to the replicaLayout construction, the replicaPlan construction and the CL liveness validation.
A test is added (in the [first commit|https://github.com/apache/cassandra/pull/954/commits/8d921c5d311c6e97d1f757af64a2e65a84b419ef])
to show that a false unavailable can be thrown when creating the replicaPlan.
The [second
commit|https://github.com/apache/cassandra/pull/954/commits/1b935280e09869736f334f67a72ed778ccfcdec7]
makes sure the same RS object is used for peer selection and the CL liveness
check, which avoids the race.
However, the {{blockFor}} calculation can still use a different RS object, so
the coordinator may block for a different condition than the one it originally
calculated. The remaining 2 commits address this problem.
The highlights of the patch:
* ReplicaLayout and ReplicaPlan now keep a reference to the replication
strategy snapshot. The snapshot is now used for peer selection, liveness
validation and blockFor calculation.
* Keyspace is no longer used to validate CL liveness, which avoids a
potential race. The validation uses the replication strategy instead.
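
As a rough illustration of the first highlight, here is a minimal sketch of a plan object that pins the strategy snapshot it was built from, so peer selection, liveness validation and blockFor all derive from one view. {{PlanSketch}}, {{StrategySnapshot}}, {{Plan}} and the quorum rule are hypothetical stand-ins chosen for the example, not the actual classes or logic in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PlanSketch {
    /** Hypothetical immutable strategy snapshot; not Cassandra's actual class. */
    static final class StrategySnapshot {
        final List<String> replicas;

        StrategySnapshot(List<String> replicas) {
            this.replicas = new ArrayList<>(replicas);
        }

        // Assumption for the example: a simple quorum over the replica count.
        int quorum() {
            return replicas.size() / 2 + 1;
        }
    }

    /** Hypothetical plan that keeps a reference to the snapshot it was built from. */
    static final class Plan {
        final StrategySnapshot snapshot;
        final List<String> contacts = new ArrayList<>();

        Plan(StrategySnapshot snapshot, List<String> liveNodes) {
            this.snapshot = snapshot;
            // Peer selection: live replicas according to the pinned snapshot.
            for (String replica : snapshot.replicas)
                if (liveNodes.contains(replica))
                    contacts.add(replica);
        }

        // blockFor is derived from the same snapshot used for peer selection.
        int blockFor() {
            return snapshot.quorum();
        }

        // Liveness validation also uses the pinned snapshot.
        boolean hasSufficientLiveReplicas() {
            return contacts.size() >= blockFor();
        }
    }

    public static void main(String[] args) {
        StrategySnapshot before = new StrategySnapshot(Arrays.asList("n1", "n2", "n3"));
        Plan plan = new Plan(before, Arrays.asList("n1", "n2"));

        // A strategy created later does not change what the existing plan blocks for.
        StrategySnapshot after = new StrategySnapshot(Arrays.asList("n1", "n2", "n3", "n4", "n5"));
        System.out.println(plan.blockFor() + " vs " + after.quorum());
        System.out.println(plan.hasSufficientLiveReplicas());
    }
}
```

Because the plan never goes back to a shared mutable field, the condition it validates at creation time is the same one it later blocks for.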
cc: [~aleksey][~cnlwsu]
> Cluster topology change may produce false unavailable for queries
> -----------------------------------------------------------------
>
> Key: CASSANDRA-16545
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16545
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination
> Reporter: Yifan Cai
> Assignee: Yifan Cai
> Priority: Normal
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When the coordinator processes a query, it first gets the
> {{ReplicationStrategy}} (RS) from the keyspace to decide the peers to
> contact. It then gets the RS a second time to perform the liveness check for the
> requested CL.
> The RS is a volatile field in Keyspace, and it is possible that those 2
> getter calls return different RS values in the presence of cluster topology
> changes, e.g. adding a node.
> In such a scenario, the check at the second step can throw an unexpected
> unavailable, even though, from the perspective of the query, the cluster can
> satisfy the CL.
> We should use a consistent view of the RS during the peer selection and the CL
> liveness check. In other words, both steps should reference the same RS
> object. This is also clearer and easier for clients to reason about: such
> queries behave as if they were made before the topology change.
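
The race described above can be reproduced deterministically with a minimal sketch. {{SnapshotDemo}}, {{Holder}}, {{Strategy}} and the quorum rule are hypothetical stand-ins for the example (they are not Cassandra's classes), and the {{betweenReads}} hook simulates a topology change landing between the two steps.

```java
import java.util.Arrays;
import java.util.List;

public class SnapshotDemo {
    /** Hypothetical immutable strategy; stands in for a replication strategy object. */
    static final class Strategy {
        final List<String> replicas;

        Strategy(List<String> replicas) {
            this.replicas = replicas;
        }

        // Assumption for the example: a simple quorum over the replica count.
        int quorum() {
            return replicas.size() / 2 + 1;
        }
    }

    /** Hypothetical holder with a volatile field that may be swapped at any time. */
    static final class Holder {
        volatile Strategy strategy;

        Holder(Strategy strategy) {
            this.strategy = strategy;
        }
    }

    private static int countLive(Strategy s, List<String> live) {
        int n = 0;
        for (String replica : s.replicas)
            if (live.contains(replica))
                n++;
        return n;
    }

    // Racy: reads the volatile field twice, so the two steps may see different strategies.
    static boolean racyCheck(Holder h, List<String> live, Runnable betweenReads) {
        int contacts = countLive(h.strategy, live); // first read: peer selection
        betweenReads.run();                         // simulated topology change
        return contacts >= h.strategy.quorum();     // second read: liveness check
    }

    // Consistent: reads the field once and uses that snapshot for both steps.
    static boolean consistentCheck(Holder h, List<String> live, Runnable betweenReads) {
        Strategy snapshot = h.strategy;
        int contacts = countLive(snapshot, live);
        betweenReads.run();
        return contacts >= snapshot.quorum();
    }

    public static void main(String[] args) {
        List<String> live = Arrays.asList("n1", "n2");
        Strategy bigger = new Strategy(Arrays.asList("n1", "n2", "n3", "n4", "n5"));

        Holder h1 = new Holder(new Strategy(Arrays.asList("n1", "n2", "n3")));
        System.out.println(racyCheck(h1, live, () -> h1.strategy = bigger));       // false

        Holder h2 = new Holder(new Strategy(Arrays.asList("n1", "n2", "n3")));
        System.out.println(consistentCheck(h2, live, () -> h2.strategy = bigger)); // true
    }
}
```

In the racy variant, two live replicas are selected against the three-replica strategy but validated against the five-replica one, which yields the false unavailable. Reading the field once removes the mismatch.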