[
https://issues.apache.org/jira/browse/CASSANDRA-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157394#comment-13157394
]
Jonathan Ellis commented on CASSANDRA-3533:
-------------------------------------------
I'd be curious if any of the other Dynamo-derived systems (Voldemort, Riak, ?)
attempt to deal with this. It's not clear to me how we should try to handle
incomplete network graphs (A can talk to B and to C, but C can't talk to B).
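
To make the incomplete-graph case concrete, here is a minimal sketch of the three-node situation described above. The node names, the `links` set, and both helper functions are hypothetical; the gossip view is deliberately simplified to a single relay hop and is not Cassandra's actual failure detector.

```python
# Hypothetical model: each entry is a working direct link.
# A can talk to B and to C, but there is no C-B link.
NODES = ["A", "B", "C"]
links = {frozenset(("A", "B")), frozenset(("A", "C"))}

def can_reach(src, dst):
    """True if src can open a direct connection to dst."""
    return src == dst or frozenset((src, dst)) in links

def seen_alive_via_gossip(observer, target, nodes):
    """Simplified gossip view (an assumption): target looks alive to
    observer if any node the observer can reach, itself included,
    can reach target directly."""
    return any(
        can_reach(observer, peer) and can_reach(peer, target)
        for peer in nodes
    )

# C hears through A that B is alive, yet C cannot contact B itself.
print(seen_alive_via_gossip("C", "B", NODES), can_reach("C", "B"))
```

In this model C keeps treating B as up because A vouches for it, so a request that C forwards to B can only fail by timing out.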
> TimeoutException when there is a firewall issue.
> ------------------------------------------------
>
> Key: CASSANDRA-3533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3533
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.0.4
> Reporter: Vijay
> Priority: Minor
>
> When one node in the cluster is not able to talk to the other DC/RAC due to
> a firewall or network-related issue (StorageProxy calls fail), and the nodes
> are NOT marked down because at least one node in the cluster can talk to the
> other DC/RAC, we get a TimeoutException instead of an
> UnavailableException.
> The problems with this:
> 1) It is hard to monitor/identify these errors.
> 2) It is hard to differentiate, from the client, a bad node from a bad
> query.
> 3) When this issue happens we have to wait for at least the RPC timeout
> to know that the query won't succeed.
> Possible Solution: when deciding whether to mark a node down, we might want
> to check that it is actually reachable by trying to communicate with it
> directly, so we can be sure that the node is actually alive.
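
One way to read the possible solution is as a fail-fast check on the coordinator: combine the gossip view with a direct probe before dispatching, and raise an unavailable error immediately when too few replicas answer. The sketch below is only an illustration under that assumption; `UnavailableError`, `assure_sufficient`, `gossip_alive`, and `probe` are hypothetical names, not Cassandra APIs.

```python
class UnavailableError(Exception):
    """Raised when too few replicas are directly reachable."""

def reachable_replicas(replicas, gossip_alive, probe):
    """Keep replicas that gossip reports alive AND that answer a direct probe."""
    return [r for r in replicas if gossip_alive(r) and probe(r)]

def assure_sufficient(replicas, required, gossip_alive, probe):
    """Fail fast instead of waiting out the RPC timeout."""
    live = reachable_replicas(replicas, gossip_alive, probe)
    if len(live) < required:
        raise UnavailableError(
            f"{len(live)} of {required} required replicas reachable"
        )
    return live

# Hypothetical scenario: gossip says every node is up, but a firewall
# blocks this coordinator from reaching anything in dc2.
replicas = ["dc1-a", "dc2-a", "dc2-b"]
gossip_alive = lambda r: True
probe = lambda r: r.startswith("dc1")

try:
    assure_sufficient(replicas, 2, gossip_alive, probe)
except UnavailableError as e:
    print("unavailable:", e)
```

The probe result would need to be cached or driven by recent connection failures in practice, since probing on every request would add latency of its own.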