[
https://issues.apache.org/jira/browse/CASSANDRA-15352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17042515#comment-17042515
]
Benedict Elliott Smith edited comment on CASSANDRA-15352 at 2/22/20 11:11 AM:
------------------------------------------------------------------------------
When the query is already known to have failed, but the client simply waits
until timeout to receive notification (of a timeout, not failure); and, in
particular, cases where a replica has failed to serve its part of the work but
does not report this back to the coordinator, I think?
Though admittedly I'm unclear on its interaction with cheap quorums (which
cannot reasonably be informed by replica information, since the cheap quorum is
only needed if the replica doesn't respond, though there may be a subset of
cases where the replica is unable to perform the work but is able to respond).
was (Author: benedict):
When the query is already known to have failed, but the client simply waits
until timeout to receive notification (of a timeout, not failure)
> Replica failure propagation to coordinator and client
> -----------------------------------------------------
>
> Key: CASSANDRA-15352
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15352
> Project: Cassandra
> Issue Type: New Feature
> Components: Messaging/Internode
> Reporter: Alex Petrov
> Priority: Normal
>
> We should add early reporting of replica-side errors, since currently we just
> time out requests. On the normal read-write path this is not that important,
> but it is a protocol change we will need in order to improve cheap quorums for
> transient replication. It might also have a positive impact on the regular
> read-write path, since we’ll be aborting queries early instead of timing them
> out. It can be useful for failing / going-away nodes (which is also one of the
> changes we’re planning to implement).
> We do have means for propagating errors, both in the client protocol through
> <reasonmap> and in internode messaging through FAILURE_RSP, so we do not have
> to extend the protocol to implement this change. It is still a change in
> protocol behavior, though, since we’ll be sending a message where we would
> usually silently time out.
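
The early-abort behavior described above can be illustrated with a minimal
sketch. It is not Cassandra code: ResponseTracker, Outcome, replicas and
block_for are hypothetical names chosen for illustration, and the sketch assumes
that a replica which cannot do its part sends an explicit failure reply instead
of staying silent. Under that assumption the coordinator can report a failure as
soon as the required number of acknowledgements is out of reach, rather than
waiting for the timeout.

```python
from enum import Enum


class Outcome(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILURE = "failure"  # known early, from explicit replica failure replies
    TIMEOUT = "timeout"  # some replicas stayed silent until the deadline


class ResponseTracker:
    """Hypothetical coordinator-side bookkeeping for a single request."""

    def __init__(self, replicas, block_for):
        self.replicas = replicas    # number of replicas contacted
        self.block_for = block_for  # acknowledgements required (e.g. a quorum)
        self.successes = 0
        self.failures = 0

    def outcome(self):
        if self.successes >= self.block_for:
            return Outcome.SUCCESS
        # Replicas that have not failed can no longer reach block_for,
        # so the request is known to have failed without waiting.
        if self.replicas - self.failures < self.block_for:
            return Outcome.FAILURE
        return Outcome.PENDING

    def on_success(self):
        self.successes += 1
        return self.outcome()

    def on_failure(self):
        # Assumed: the replica replied with an explicit failure message.
        self.failures += 1
        return self.outcome()

    def on_timeout(self):
        # Only an undecided request turns into a timeout.
        current = self.outcome()
        return Outcome.TIMEOUT if current is Outcome.PENDING else current
```

With three replicas and block_for=2, a second on_failure() call already yields
Outcome.FAILURE; without explicit failure replies the same request would stay
pending until on_timeout() and surface to the client as a timeout.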