[
https://issues.apache.org/jira/browse/CASSANDRA-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424048#comment-13424048
]
Brandon Williams commented on CASSANDRA-3533:
---------------------------------------------
I'll also note that yes, everything can be OK and an UnavailableException (UE)
will still be thrown (the connection just hasn't been established yet, but
will be on OTC's next attempt), but penalizing the client ~100ms to find out
instead of just failing out and letting them try another coordinator seems
like an improvement.
> TimeoutException when there is a firewall issue.
> ------------------------------------------------
>
> Key: CASSANDRA-3533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3533
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Vijay
> Assignee: Brandon Williams
> Priority: Minor
> Fix For: 1.1.4
>
> Attachments: 3533.txt
>
>
> When one node in the cluster is not able to talk to the other DC/RAC due to
> a firewall or network-related issue (StorageProxy calls fail), and the nodes
> are NOT marked down because at least one node in the cluster can talk to the
> other DC/RAC, we get a TimeoutException instead of an UnavailableException.
> The problems with this:
> 1) It is hard to monitor/identify these errors.
> 2) It is hard for the client to differentiate between a bad node and a bad
> query.
> 3) When this issue happens, we have to wait for at least the RPC timeout to
> know that the query won't succeed.
> Possible solution: when marking a node down, we might want to check whether
> the node is actually alive by trying to communicate with it, so we can be
> sure of its actual state.
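
To illustrate the "possible solution" above, here is a minimal sketch of a reachability check that fails fast instead of waiting for the full RPC timeout. It is not the attached patch (3533.txt) and does not use Cassandra's internals: the class name ReachabilityProbe, the nested UnavailableException, and the timeout parameter are all hypothetical stand-ins, and a plain TCP connect is only an assumed proxy for "trying to communicate" with the node.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ReachabilityProbe {

    /** Hypothetical stand-in for the exception named in the ticket; not Cassandra's own class. */
    public static class UnavailableException extends Exception {
        public UnavailableException(String message) {
            super(message);
        }
    }

    /**
     * Attempts a TCP connect to the given address with a bounded wait.
     * Returns true if the connect succeeds, false on refusal, timeout,
     * or any other I/O error.
     */
    public static boolean isReachable(InetSocketAddress address, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(address, timeoutMillis);
            return true;
        } catch (IOException e) {
            // SocketTimeoutException and ConnectException are both IOExceptions.
            return false;
        }
    }

    /**
     * Fails fast with UnavailableException when the peer cannot be reached,
     * so the caller can try another coordinator right away.
     */
    public static void failFastIfUnreachable(InetSocketAddress address, int timeoutMillis)
            throws UnavailableException {
        if (!isReachable(address, timeoutMillis)) {
            throw new UnavailableException("Cannot reach " + address);
        }
    }
}
```

A caller would invoke failFastIfUnreachable before dispatching a request and treat the exception as a signal to retry elsewhere. The trade-off is the one raised in the comment above: a short probe timeout can report a healthy node as unavailable while its connection is still being established.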