[ 
https://issues.apache.org/jira/browse/CASSANDRA-16545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17315860#comment-17315860
 ] 

Yifan Cai edited comment on CASSANDRA-16545 at 4/6/21, 9:56 PM:
----------------------------------------------------------------

PR: https://github.com/apache/cassandra/pull/954
CI: 
https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=CASSANDRA-16545%2Ftrunk

The patch is largely a refactor that passes the same {{ReplicationStrategy}} (RS) 
object to replicaLayout construction, replicaPlan construction and CL liveness 
validation. 
The [first commit|https://github.com/apache/cassandra/pull/954/commits/8d921c5d311c6e97d1f757af64a2e65a84b419ef]
 adds a test proving that a false unavailable can be thrown when creating 
the replicaPlan. 
The [second commit|https://github.com/apache/cassandra/pull/954/commits/1b935280e09869736f334f67a72ed778ccfcdec7]
 makes sure the same RS object is used for peer selection and the CL liveness 
check, which avoids the race. 
However, the {{blockFor}} calculation can still use a different RS object, so 
the coordinator may block for a different condition than the one it originally 
calculated. The remaining 2 commits address that problem. 

The highlights of the patch:
* ReplicaLayout and ReplicaPlan now keep a reference to the replication 
strategy snapshot. The snapshot is used for peer selection, liveness 
validation and the blockFor calculation. 
* CL liveness validation no longer goes through Keyspace, which removes the 
potential race. It uses the replication strategy snapshot instead. 
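
To illustrate the first highlight, here is a minimal sketch of a plan object that captures one strategy snapshot at creation and answers every later question from it. All names ({{StrategySnapshot}}, {{PlanSketch}}, {{quorumBlockFor}}) are hypothetical stand-ins and not the classes in the PR, and the "quorum plus pending" rule is a simplification assumed only for this example.

```java
// Illustrative sketch only: hypothetical types, not the real Cassandra classes.
final class StrategySnapshot {
    final int replicationFactor;
    final int pendingReplicas;

    StrategySnapshot(int replicationFactor, int pendingReplicas) {
        this.replicationFactor = replicationFactor;
        this.pendingReplicas = pendingReplicas;
    }

    // Simplified rule assumed for this sketch: quorum of RF plus pending replicas.
    int quorumBlockFor() {
        return replicationFactor / 2 + 1 + pendingReplicas;
    }
}

final class PlanSketch {
    private final StrategySnapshot strategy; // captured once, never re-read from the keyspace
    private final int contacts;

    private PlanSketch(StrategySnapshot strategy, int contacts) {
        this.strategy = strategy;
        this.contacts = contacts;
    }

    static PlanSketch create(StrategySnapshot strategy, int liveEndpoints) {
        // Peer selection uses the snapshot...
        int contacts = Math.min(liveEndpoints,
                                strategy.replicationFactor + strategy.pendingReplicas);
        // ...and liveness validation uses the very same snapshot.
        if (contacts < strategy.quorumBlockFor())
            throw new IllegalStateException("unavailable: " + contacts
                                            + " live, need " + strategy.quorumBlockFor());
        return new PlanSketch(strategy, contacts);
    }

    // blockFor is derived from the captured snapshot, so it matches what was validated.
    int blockFor() {
        return strategy.quorumBlockFor();
    }

    int contacts() {
        return contacts;
    }
}
```

With this shape, a snapshot created by a later topology change does not alter the {{blockFor}} of a plan that already exists, since the plan never looks the strategy up again.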

cc: [~aleksey][~cnlwsu]


> Cluster topology change may produce false unavailable for queries
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-16545
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16545
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the coordinator processes a query, it first gets the 
> {{ReplicationStrategy}} (RS) from the keyspace to decide which peers to 
> contact. It then gets the RS again to perform the liveness check for the 
> requested CL. 
> The RS is a volatile field in Keyspace, and those 2 getter calls can 
> return different RS values in the presence of cluster topology 
> changes, e.g. adding a node. 
> In such a scenario, the check at the second step can throw an unexpected 
> unavailable, even though from the perspective of the query the cluster can 
> satisfy the CL. 
> We should use a consistent view of the RS during peer selection and the CL 
> liveness check. In other words, both steps should reference the same RS 
> object. This is also clearer and easier for clients to reason about: such 
> queries behave as if they were made before the topology change. 
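
The race in the description can be reproduced deterministically in a small sketch. Everything below is a hypothetical stand-in ({{RsSnapshot}}, {{KeyspaceSketch}}, {{CoordinatorSketch}}), not the actual Cassandra code. The {{betweenReads}} hook is an assumption added only to simulate a topology change landing between the two getter calls.

```java
// Illustrative sketch only: hypothetical types, not the real Cassandra classes.
final class RsSnapshot {
    final int liveReplicas; // live replicas this snapshot would select as contacts
    final int blockFor;     // acks the requested CL needs under this snapshot

    RsSnapshot(int liveReplicas, int blockFor) {
        this.liveReplicas = liveReplicas;
        this.blockFor = blockFor;
    }
}

final class KeyspaceSketch {
    // Swapped wholesale on topology change, like the volatile field described above.
    private volatile RsSnapshot replicationStrategy;

    KeyspaceSketch(RsSnapshot initial) {
        this.replicationStrategy = initial;
    }

    RsSnapshot getReplicationStrategy() {
        return replicationStrategy;
    }

    void onTopologyChange(RsSnapshot next) {
        this.replicationStrategy = next;
    }
}

final class CoordinatorSketch {
    // Racy shape: two independent reads of the volatile field.
    static boolean racyIsAvailable(KeyspaceSketch ks, Runnable betweenReads) {
        int contacts = ks.getReplicationStrategy().liveReplicas;  // step 1: peer selection
        betweenReads.run();                                       // topology change may land here
        return contacts >= ks.getReplicationStrategy().blockFor;  // step 2: liveness check
    }

    // Consistent shape: read once and use the same object for both steps.
    static boolean consistentIsAvailable(KeyspaceSketch ks, Runnable betweenReads) {
        RsSnapshot rs = ks.getReplicationStrategy();
        int contacts = rs.liveReplicas;
        betweenReads.run();
        return contacts >= rs.blockFor;
    }
}
```

If the strategy is swapped between the two reads, the racy variant compares contacts chosen under the old snapshot against a requirement from the new one, while the consistent variant evaluates the query entirely under the snapshot it started with.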


