[ 
https://issues.apache.org/jira/browse/CASSANDRA-15350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962278#comment-16962278
 ] 

Yifan Cai commented on CASSANDRA-15350:
---------------------------------------

In the current {{cas}} implementation, a WriteTimeoutException 
({{WriteType.CAS}}) is thrown in the following scenarios:
 * The overall {{cas}} operation times out.
 * The PREPARE phase times out.
 ** Multiple retries fail and the operation eventually times out.
 ** RPC requests to nodes time out (networking).
 ** Multiple proposers contend. Each proposer gets promises from a majority and 
pre-empts the other proposers from proceeding to the PROPOSE phase. When the other 
proposers (believing they are still the winners, when in fact they are not) send 
their proposals, they get rejections from *ALL* acceptors. The contention continues 
until time runs out.
 ** A repair attempt is made in this phase, and either step of it can time out:
 *** Proposing to replay the previously accepted update times out.
 *** Committing the update times out.
 * The PROPOSE phase times out.
 ** RPC requests to nodes time out (networking).
 ** The proposal is sent to *ALL* acceptors and the proposer waits:
 *** If successful, i.e. a majority accepts, we are good.
 *** If *all* acceptors reject, it is safe for the proposer to re-submit the 
proposal with a higher ballot.
 *** {color:#ff8b00}If some but *not a quorum* accept, the proposal may or may 
not be replayed by new proposers. (Uncertainty){color}
 **** If a new proposer reaches the acceptors that accepted the old 
proposal, it replays that proposal, provided it is the most recent in-progress one.
 **** If the new proposer does not reach those acceptors, it is free to choose 
its own value, which may leave the earlier proposal no longer eligible for replay.
 * The COMMIT phase times out.
 ** Applying the update times out. Note that this is a normal write, so the 
WriteType is {{SIMPLE}} instead of {{CAS}}.
 ** The only exception is when the timeout comes from the repair attempt in the 
PREPARE phase. In that case, the WriteType is overridden to {{CAS}}.

Most of the timeouts in the list are genuine. The exception is the case colored 
in {color:#ff8b00}orange{color}, where the outcome of the write is uncertain.
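
To make the three PROPOSE-phase outcomes above concrete, here is a minimal sketch in Java. It is not taken from the Cassandra code base: the class {{ProposeOutcomeSketch}}, the enum {{Outcome}} and the method {{classify}} are hypothetical names chosen for illustration, and the quorum is assumed to be a simple majority of the replicas. The case where no accepts were seen and some acceptors did not answer is treated as a plain timeout here, which is a simplification.

```java
// Hypothetical illustration only; none of these names exist in Cassandra.
class ProposeOutcomeSketch {

    enum Outcome {
        ACCEPTED,        // a majority accepted: the proposal succeeded
        REJECTED_BY_ALL, // every acceptor rejected: safe to retry with a higher ballot
        UNCERTAIN,       // some, but fewer than a quorum, accepted: may or may not be replayed
        TIMED_OUT        // no accepts seen and some acceptors never answered (simplified)
    }

    // Classifies the responses a proposer has collected when it stops waiting.
    // replicas: number of acceptors the proposal was sent to
    // accepts:  number of acceptors that accepted
    // rejects:  number of acceptors that rejected
    static Outcome classify(int replicas, int accepts, int rejects) {
        if (replicas <= 0 || accepts < 0 || rejects < 0 || accepts + rejects > replicas) {
            throw new IllegalArgumentException("inconsistent response counts");
        }
        int quorum = replicas / 2 + 1; // assumption: quorum is a simple majority
        if (accepts >= quorum) {
            return Outcome.ACCEPTED;
        }
        if (rejects == replicas) {
            return Outcome.REJECTED_BY_ALL;
        }
        if (accepts > 0) {
            return Outcome.UNCERTAIN;
        }
        return Outcome.TIMED_OUT;
    }

    public static void main(String[] args) {
        // Three acceptors, so the assumed quorum is two.
        System.out.println(classify(3, 2, 1));
        System.out.println(classify(3, 0, 3));
        System.out.println(classify(3, 1, 2));
        System.out.println(classify(3, 0, 1));
    }
}
```

Only the {{UNCERTAIN}} branch corresponds to the orange case: the proposer cannot tell whether a later proposer will replay its value, so reporting it as an ordinary timeout hides that distinction from the client.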

> Add CAS “uncertainty” and “contention” messages that are currently propagated 
> as a WriteTimeoutException.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15350
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15350
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Feature/Lightweight Transactions
>            Reporter: Alex Petrov
>            Priority: Normal
>              Labels: client-impacting, protocolv5
>
> Right now, CAS uncertainty introduced in 
> https://issues.apache.org/jira/browse/CASSANDRA-6013 is propagated as a 
> WriteTimeout. One of the conditions under which it manifests is when there’s at 
> least one acceptor that has accepted the value, which means that this value 
> _may_ still get accepted during a later round, despite the proposer failure. 
> A similar problem happens with CAS contention, which is also indistinguishable 
> from a “regular” timeout, even though it is visible in metrics correctly.


