[
https://issues.apache.org/jira/browse/CASSANDRA-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045523#comment-17045523
]
Alex Petrov commented on CASSANDRA-15352:
-----------------------------------------
[~jjirsa] Right now, if there's an exception on the replica side, there's no
mechanism that would:
a) propagate the details of this failure/error from the replica to the
coordinator and, subsequently, to the client
b) help avoid waiting for enough replicas to respond before we can time out /
fail the request on the coordinator side
c) in transient replication, trigger rapid write protection as soon as a
replica fails
I think a) is good for usability/visibility, since it makes it clear exactly
why the query has failed. The thinking behind b) was that it can reduce
coordinator load, since the coordinator doesn't have to wait for replicas that
have already failed, but I admit this might be such a rare condition that it is
unlikely to matter. c) is good because we can trigger rapid write protection
and make a write to the transient replica as soon as the coordinator learns
about the exception on the replica, instead of waiting for a timeout.
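A minimal sketch of a) and b) on the coordinator side, assuming replicas
report failures explicitly. ReplicaResponseTracker, its method names and the
string reasons are hypothetical and are not Cassandra's actual handler classes.
It only illustrates that the request can be failed as soon as the required
number of acks is no longer reachable, with the per-replica reasons kept for
the client:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical coordinator-side bookkeeping for one request (illustration only).
final class ReplicaResponseTracker {
    enum Outcome { PENDING, SUCCEEDED, FAILED }

    private final int totalReplicas;
    private final int blockFor; // acks required by the consistency level
    private final Set<String> acked = new HashSet<>();
    private final Map<String, String> failureReasons = new HashMap<>();

    ReplicaResponseTracker(int totalReplicas, int blockFor) {
        if (blockFor < 1 || blockFor > totalReplicas)
            throw new IllegalArgumentException("blockFor must be in [1, totalReplicas]");
        this.totalReplicas = totalReplicas;
        this.blockFor = blockFor;
    }

    // A replica applied the request successfully.
    Outcome onAck(String replica) {
        acked.add(replica);
        return outcome();
    }

    // A replica reported an exception instead of staying silent until the timeout.
    Outcome onFailure(String replica, String reason) {
        failureReasons.put(replica, reason);
        return outcome();
    }

    Outcome outcome() {
        if (acked.size() >= blockFor)
            return Outcome.SUCCEEDED;
        // Replicas that have acked or may still ack; failed ones are excluded.
        int stillPossible = totalReplicas - failureReasons.size();
        if (stillPossible < blockFor)
            return Outcome.FAILED; // no need to wait for the timeout
        return Outcome.PENDING;
    }

    // Per-replica reasons that could be passed on to the client (point a).
    Map<String, String> failureReasons() {
        return Collections.unmodifiableMap(failureReasons);
    }
}
```

With three replicas and blockFor = 2, the second reported failure already
yields FAILED, so the coordinator does not have to wait for the third replica
or for the timeout.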
[~benedict] you're right; not sure why I called it "cheap quorums". What I
meant was "rapid write protection". Right now, when a replica doesn't respond
during a write, we trigger a write to a transient replica. Even if the write
request has failed on the full replica with an exception, information about
this isn't propagated to the coordinator. If we can notify the coordinator
about the exception on the full replica, we can potentially trigger the write
to a transient replica earlier.
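To illustrate the idea, here is a rough sketch in which the extra write can be
triggered either by the existing timeout path or by a failure notice from the
full replica, whichever comes first. TransientWriteTrigger and its methods are
hypothetical names, and sending at most one extra write per request is a
simplifying assumption rather than a description of the current code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: decides when to send the extra write to a transient replica.
final class TransientWriteTrigger {
    private final AtomicBoolean sent = new AtomicBoolean(false);
    private final Runnable sendToTransientReplica;

    TransientWriteTrigger(Runnable sendToTransientReplica) {
        this.sendToTransientReplica = sendToTransientReplica;
    }

    // Existing behaviour: the full replica did not respond in time.
    boolean onFullReplicaTimeout() {
        return fireOnce();
    }

    // Proposed behaviour: the full replica reported an exception, so there is
    // no reason to wait for the timeout.
    boolean onFullReplicaFailure() {
        return fireOnce();
    }

    // Sends the extra write at most once, whichever signal arrives first.
    // Returns true only for the call that actually sent it.
    private boolean fireOnce() {
        if (sent.compareAndSet(false, true)) {
            sendToTransientReplica.run();
            return true;
        }
        return false;
    }
}
```

If the failure notice arrives before the timeout, the write to the transient
replica goes out immediately and the later timeout becomes a no-op.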
> Replica failure propagation to coordinator and client
> -----------------------------------------------------
>
> Key: CASSANDRA-15352
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15352
> Project: Cassandra
> Issue Type: New Feature
> Components: Messaging/Internode
> Reporter: Alex Petrov
> Priority: Normal
>
> We should add early reporting of replica-side errors, since currently we just
> time out requests. On the normal read-write path this is not that important,
> but this is a protocol change we will need to improve cheap quorums for
> transient replication. This might have a positive impact on the regular
> read-write path, since we’ll be aborting queries early instead of timing them
> out. It can also be useful for nodes that are failing or going away (which is
> also one of the changes we’re planning to implement).
> We do have means for propagating errors, both in the client protocol through
> <reasonmap> and internode through FAILURE_RSP, so we do not have to extend
> the protocol to implement this change. It is still a change in protocol
> behavior, though, since we’ll be sending a message where we would usually
> silently time out.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)