[
https://issues.apache.org/jira/browse/CASSANDRA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982023#action_12982023
]
Stu Hood commented on CASSANDRA-1988:
-------------------------------------
In particular, on CASSANDRA-1964, the lag between the connection dying and
gossip noticing the death was up to 20 seconds, meaning we received
TimeoutException twice before getting an UnavailableException.
> Prefer to throw Unavailable rather than Timeout
> -----------------------------------------------
>
> Key: CASSANDRA-1988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1988
> Project: Cassandra
> Issue Type: Improvement
> Components: API
> Reporter: Stu Hood
> Fix For: 0.8
>
>
> When a node is unreachable, but is not yet being reported dead by gossip,
> messages are enqueued in the messaging service to be sent when the node
> becomes available again (on the assumption that the connection dropped
> temporarily).
> Higher up in the client layer, before sending messages to other nodes, we
> check that they are alive according to gossip, and fail fast with
> UnavailableException if they are not (CASSANDRA-1803). If we send messages to
> nodes that are unreachable but not yet being reported dead, the messages sit
> in the queue and time out rather than being sent: this results in the client
> request failing with a TimeoutException.
> If we differentiate between messages that were never sent (i.e., are still
> queued in the MessagingService at the end of the timeout) and messages that
> were sent but didn't get a response, we can properly throw
> UnavailableException in the former case.
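
A rough sketch of the distinction proposed in the description, in Java. This is not Cassandra's actual MessagingService code: the class, the State and Outcome enums, PendingMessage, and classify() are all hypothetical names made up for illustration. It also assumes one particular policy, namely that a request counts as "unavailable" when too few messages ever left the queue for the required number of responses to have been possible.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names are real Cassandra classes.
class UnavailableVsTimeoutSketch {
    /** Lifecycle of one outbound message (assumed for this sketch). */
    enum State { QUEUED, SENT, ACKED }

    /** What the coordinator would report to the client. */
    enum Outcome { OK, UNAVAILABLE, TIMEOUT }

    static class PendingMessage {
        final String target;
        State state = State.QUEUED; // every message starts in the queue

        PendingMessage(String target) { this.target = target; }
    }

    /**
     * Called once the request timeout has expired. Messages still QUEUED
     * were never sent; messages in SENT went out but got no response.
     */
    static Outcome classify(List<PendingMessage> messages, int requiredAcks) {
        int acked = 0;
        int sent = 0;
        for (PendingMessage m : messages) {
            if (m.state == State.ACKED) acked++;
            else if (m.state == State.SENT) sent++;
        }
        if (acked >= requiredAcks) return Outcome.OK;
        // Assumed policy: if the request could not have succeeded even with
        // a response to every message that was sent, the cause is messages
        // that never left the queue, so report unavailability.
        if (acked + sent < requiredAcks) return Outcome.UNAVAILABLE;
        // Enough messages went out, but responses did not arrive in time.
        return Outcome.TIMEOUT;
    }

    public static void main(String[] args) {
        PendingMessage a = new PendingMessage("node-a");
        PendingMessage b = new PendingMessage("node-b");
        a.state = State.ACKED; // node-a answered; node-b's message is still queued
        System.out.println(classify(Arrays.asList(a, b), 2));
    }
}
```

Under this assumed policy, the client would see UnavailableException for the UNAVAILABLE outcome and TimeoutException only when the messages were actually sent.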