[ https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Gallardo updated CASSANDRA-15642:
---------------------------------------
    Description: 
As a follow-up to some exploration I have done for CASSANDRA-15543, I noticed 
the following behavior in both {{ReadCallback}} and 
{{AbstractWriteResponseHandler}}:
 - await responses
 - when the required number of responses has come back: unblock the wait
 - when a single failure happens: unblock the wait
 - when unblocked, check whether the failure counter is > 0 and, if so, 
return an error message based on the {{failures}} map that has been filled

The errors that can result from this behavior are a ReadTimeout, a 
ReadFailure, a WriteTimeout or a WriteFailure.

In case of a Write/ReadFailure, the user will get back an error looking like 
the following:

"Failure: Received X responses, and Y failures"

(If the behavior I describe is incorrect, please correct me.)

This causes a usability problem. Since the handler will fail and throw an 
exception as soon as 1 failure happens, the error message that is returned to 
the user may not be accurate.

(note: I am not entirely sure of the behavior in case of timeouts for now)

At, say, CL = QUORUM = 3, the failed request may complete first, then a 
successful one completes, and then another fails. If the exception is thrown 
fast enough, the error message could say 
 "Failure: Received 0 response, and 1 failure at CL = 3"

This (1) doesn't make a lot of sense, because the CL doesn't match the counts 
reported, and (2) is incorrect: we did receive a successful response, only it 
came after the initial failure.
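
The scenario above can be replayed deterministically with a small sketch. This 
is not the Cassandra implementation: the class {{EarlyUnblockSketch}}, the 
method {{report}} and its parameters are hypothetical names, and the message 
format only mimics the one quoted in this description. It assumes the wait is 
released by the first failure or by reaching the required number of successes.

```java
import java.util.List;

public class EarlyUnblockSketch {
    /**
     * Replays replica responses in arrival order (true = success,
     * false = failure) and stops counting as soon as the wait would be
     * unblocked. Hypothetical model, not Cassandra code.
     */
    static String report(List<Boolean> arrivals, int blockFor) {
        int received = 0;
        int failures = 0;
        for (boolean ok : arrivals) {
            if (ok) received++; else failures++;
            // Unblock on enough successes, or on the very first failure.
            if (received >= blockFor || failures > 0) break;
        }
        return failures > 0
            ? "Failure: Received " + received + " responses, and " + failures + " failures"
            : "Success: Received " + received + " responses";
    }

    public static void main(String[] args) {
        // A failure arrives first, then a success, then another failure.
        System.out.println(report(List.of(false, true, false), 3));
        // prints "Failure: Received 0 responses, and 1 failures"
    }
}
```

The success and the second failure never show up in the message, because 
counting stopped at the first failure.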

From that logic, I think it is safe to assume that the information returned in 
the error message cannot be trusted in case of a failure. We can only know 
that at least 1 node has failed, or not if the response is successful.

I am suggesting that, for a big improvement in usability, the ReadCallback and 
AbstractWriteResponseHandler wait for all responses to come back before 
unblocking the wait, or let it time out. This way, users will be able to 
have some trust in the numbers returned to them. We would also be able to 
return more information this way.

Right now, an error that happens first prevents a timeout from happening, 
because the request fails immediately, and so it potentially hides problems 
with other replicas. If we were to wait for all responses, we might get a 
timeout; in that case we would also be able to tell whether failures happened 
*before* that timeout, and have a more complete view, whereas today you can't 
detect both situations.
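
As a rough illustration of the suggestion, the sketch below counts every 
response that arrived before the deadline instead of stopping at the first 
failure. Everything in it is an assumption made for illustration: the class 
{{WaitForAllSketch}}, the method {{report}}, its parameters, and the rule that 
a request with enough successes is reported as a success even if some replicas 
never answered.

```java
import java.util.List;

public class WaitForAllSketch {
    /**
     * Summarizes all responses that arrived before the timeout
     * (true = success, false = failure). 'contacted' is the number of
     * replicas the request was sent to. Hypothetical model, not Cassandra code.
     */
    static String report(List<Boolean> arrivedBeforeTimeout, int contacted, int blockFor) {
        int received = 0;
        int failures = 0;
        for (boolean ok : arrivedBeforeTimeout) {
            if (ok) received++; else failures++;
        }
        String counts = "Received " + received + " responses, and " + failures + " failures";
        if (received >= blockFor) return "Success: " + counts;
        // Fewer answers than replicas contacted: the deadline expired first.
        boolean timedOut = received + failures < contacted;
        return (timedOut ? "Timeout: " : "Failure: ") + counts;
    }

    public static void main(String[] args) {
        // All three replicas answered: failure, success, failure.
        System.out.println(report(List.of(false, true, false), 3, 3));
        // prints "Failure: Received 1 responses, and 2 failures"

        // The third replica never answered before the deadline.
        System.out.println(report(List.of(false, true), 3, 3));
        // prints "Timeout: Received 1 responses, and 1 failures"
    }
}
```

In the second call the message reports both the timeout and the failure that 
preceded it, which is the combined view described above.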

  was:
As a follow up to some exploration I have done for CASSANDRA-15543, I realized 
the following behavior in both {{ReadCallback}} and {{AbstractWriteHandler}}:
 - await for responses
 - when all required number of responses have come back: unblock the wait
 - when a single failure happens: unblock the wait
 - when unblocked, look to see if the counter of failures is > 1 and if so 
return an error message based on the {{failures}} map that's been filled

Error messages that can result from this behavior can be a ReadTimeout, a 
ReadFailure, a WriteTimeout or a WriteFailure.

In case of a Write/ReadFailure, the user will get back an error looking like 
the following:

"Failure: Received X responses, and Y failures"

(if this behavior I describe is incorrect, please correct me)

This causes a usability problem. Since the handler will fail and throw an 
exception as soon as 1 failure happens, the error message that is returned to 
the user is not accurate.

(note: I am not entirely sure of the behavior around timeouts for now)

At, say, CL = QUORUM = 3, the failed request may complete first, then a 
successful one completes, and another fails. If the exception is thrown fast 
enough, the error message could say 
 "Failure: Received 0 response, and 1 failure at CL = 3"

Which 1. doesn't make a lot of sense because the CL doesn't match the previous 
information, but 2. the information is incorrect. We received a successful 
response, only it came after the initial failure.

From that logic, I think it is safe to assume that the information returned in 
the error message cannot be trusted in case of a failure. We can only know 
that at least 1 node has failed, or not if the response is successful.

I am suggesting that for a big improvement in usability, the ReadCallback and 
AbstractWriteResponseHandler wait for all responses to come back before 
unblocking the wait, or let it timeout. This is way, the users will be able to 
have some trust around the numbers returned to them. Also we would be able to 
return more information this way.

Right now, an error that happens first prevents from a timeout to happen 
because it fails immediately, and so potentially it hides problems with other 
replicas. If we were to wait for all responses, we might get a timeout, in that 
case we'd also be able to tell wether failures have happened *before* that 
timeout, and have a more complete view where you can't detect both situations.


> Inconsistent failure messages on distributed queries
> ----------------------------------------------------
>
>                 Key: CASSANDRA-15642
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15642
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Coordination
>            Reporter: Kevin Gallardo
>            Priority: Normal



