[ 
https://issues.apache.org/jira/browse/CASSANDRA-3533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425410#comment-13425410
 ] 

Brandon Williams commented on CASSANDRA-3533:
---------------------------------------------

bq. Is there anything forcing a next attempt though, besides gossip (1/N chance 
per round)?

Hmm, actually, no, I was mistaken there.

bq. But you still have things like GC-based "flapping" that can cause FD to 
mark a node down over-pessimistically. So I don't think I buy that this is an 
argument for not making FD more robust – since we already have to deal with "FD 
is too pessimistic" for this case.

I actually don't think, at least for this example, being overly pessimistic is 
an issue.  On a healthy network (0.3ms ping) it takes 18-19s for the FD to mark 
a host down with the default phi.  If the GC flapping is so bad it can't get a 
gossip change out in that time, the node probably _should_ be marked down.
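
As a rough sanity check on the 18-19s figure, here is a minimal sketch of a phi accrual calculation. It assumes an exponential model of heartbeat inter-arrival times, a conviction threshold of 8 (believed to be the default phi_convict_threshold), and a mean gossip interval of about one second. The function names are illustrative and are not taken from the Cassandra source.

```python
import math

def phi(seconds_since_last_heartbeat, mean_interval):
    """Suspicion level under an exponential inter-arrival model (assumed)."""
    # P(heartbeat arrives later than t) = exp(-t / mean); phi = -log10(P)
    return seconds_since_last_heartbeat / (mean_interval * math.log(10))

def seconds_until_convicted(threshold, mean_interval):
    """Length of silence needed before phi reaches the conviction threshold."""
    return threshold * mean_interval * math.log(10)

# Assumed values: threshold 8, heartbeats arriving about once per second.
print(round(seconds_until_convicted(8, 1.0), 1))  # prints 18.4
```

Under these assumptions a node has to be silent for a little over 18 seconds before it is convicted, which is consistent with the timing quoted above.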

bq. (Fundamentally though I don't think we'll get much mileage out of trying to 
second-guess FD, so I'd rather make FD as accurate as we can. And I suspect 
that "StorageProxy uses FD-supplemented-by-X and the rest of the system uses 
normal FD" is going to cause weirdness.)

You're probably right.  Let's take a step back and examine what we're trying to 
solve.  Node X can talk to Y and Y can talk to Z, but X and Z are partitioned 
from each other.  Gossip relayed via Y nevertheless makes X and Z each believe 
the other is up, so they keep attempting to send messages (and thus connect) 
to each other.  In practice though, from a client perspective:

* writes will get ack'd by whichever replicas respond the fastest.  Assuming 
RF=3 and X being the coordinator, the fact that it wrote a local copy and Y 
responded is enough for everything but ALL.

* reads will get attempted against Z from X, and will have to time out.
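
The two cases above can be sketched with a toy model of the partition. This is only an illustration: the topology, the consistency-level table, and the function names are assumptions made for the example, not Cassandra code.

```python
REPLICAS = ["X", "Y", "Z"]                    # RF=3, every node is a replica
LINKS = {frozenset("XY"), frozenset("YZ")}    # the X-Z link is missing

def can_talk(a, b):
    """A node can always reach itself; otherwise a direct link is needed."""
    return a == b or frozenset((a, b)) in LINKS

def required_acks(consistency_level, rf=len(REPLICAS)):
    return {"ONE": 1, "TWO": 2, "QUORUM": rf // 2 + 1, "ALL": rf}[consistency_level]

def write_outcome(coordinator, consistency_level):
    """Count replicas the coordinator can reach, including its local copy."""
    acks = sum(can_talk(coordinator, replica) for replica in REPLICAS)
    return "ok" if acks >= required_acks(consistency_level) else "timeout"

def read_outcome(coordinator, replica):
    """A read sent to an unreachable replica can only time out."""
    return "ok" if can_talk(coordinator, replica) else "timeout"

for cl in ("ONE", "QUORUM", "ALL"):
    print(cl, write_outcome("X", cl))
print("read from Z:", read_outcome("X", "Z"))
```

With X as coordinator, only the ALL write and the read routed to Z end in a timeout; everything else is satisfied by X's local copy plus Y.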

Now let's look at the read scenario in a post-1.2 world.  The dsnitch, after 
CASSANDRA-3722, will penalize Z in X's eyes much faster than pre-1.2 (and thus 
prevent dogpiling requests while waiting for rpc timeout) and quit trying to 
use it, at least until the reset interval, when the process begins again.  
But this is really no different than if Z _does_ suddenly die in such a way 
that the network route is a black hole (like force suspending the JVM, which is 
how the dsnitch change was tested, and it worked well.)
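
A toy model of that behavior is sketched below. It is not the DynamicEndpointSnitch implementation: the class, the scoring rule (the worse of last latency and time since last reply), and the use of seconds as the unit are all assumptions chosen only to show the penalize-then-reset cycle.

```python
class ToySnitch:
    """Hypothetical latency-plus-silence scorer; lower scores are preferred."""

    def __init__(self):
        self.latency = {}      # host -> last observed latency (seconds)
        self.last_reply = {}   # host -> time of last reply (seconds)

    def record_reply(self, host, latency, now):
        self.latency[host] = latency
        self.last_reply[host] = now

    def score(self, host, now):
        # Silence counts against a host as soon as it stops replying,
        # without waiting for a request to hit the rpc timeout.
        silence = now - self.last_reply.get(host, now)
        return max(self.latency.get(host, 0.0), silence)

    def ranked(self, hosts, now):
        return sorted(hosts, key=lambda host: self.score(host, now))

    def reset(self):
        # Models the reset interval: scores are forgotten and relearned.
        self.latency.clear()
        self.last_reply.clear()

snitch = ToySnitch()
snitch.record_reply("Z", 0.002, now=0.0)
snitch.record_reply("Y", 0.003, now=0.0)
print(snitch.ranked(["Z", "Y"], now=0.0))   # Z is preferred while it replies
snitch.record_reply("Y", 0.003, now=2.0)    # Z has gone silent
print(snitch.ranked(["Z", "Y"], now=2.0))   # Y is now preferred
snitch.reset()
print(snitch.ranked(["Z", "Y"], now=2.0))   # after a reset, Z is tried again
```

In this sketch Z drops behind Y after two seconds of silence, and the reset puts it back in rotation until the penalty accumulates again.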

So I suppose my question is, what is the problem here we still need to solve?
                
> TimeoutException when there is a firewall issue.
> ------------------------------------------------
>
>                 Key: CASSANDRA-3533
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3533
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Vijay
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: 3533.txt
>
>
> When one node in the cluster is not able to talk to the other DC/RAC due to 
> firewall or network related issue (StorageProxy calls fail), and the nodes 
> are NOT marked down because at least one node in the cluster can talk to the 
> other DC/RAC, we get a TimeoutException instead of an 
> UnavailableException.
> The problem with this:
> 1) It is hard to monitor/identify these errors.
> 2) It is hard to differentiate from the client between the node being bad 
> and a bad query.
> 3) When this issue happens we have to wait for at least the RPC timeout 
> to know that the query won't succeed.
> Possible Solution: when marking a node down we might want to check if the 
> node is actually alive by trying to communicate to it? So we can be sure that 
> the node is actually alive.


