[ 
https://issues.apache.org/jira/browse/CASSANDRA-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505784#comment-16505784
 ] 

Joseph Lynch commented on CASSANDRA-14459:
------------------------------------------

Ok, I've pushed another version of the patch to that branch which:
 # Adds a guaranteed {{EchoMessage}} to live hosts, in addition to the GossipSyn 
request, during the first step of gossip, so that we eventually get some latency 
measurements even from latent nodes. Increasing gossip messaging concerns me 
slightly, so the alternative is to have the DES send explicit {{EchoMessages}} 
when it notices that a host doesn't have any data in {{reset}}. That is more 
deterministic (we can guarantee that we'll probe the host after 2 reset 
intervals), but it also means the DES actively sends messages...
 # Creates a JMX method on {{DynamicEndpointSnitchMBean}} to allow users to 
force timing resets (if someone wants the old behavior back they can just call 
it on a cron ;)).
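
To make the alternative in the first item concrete, here is a rough sketch of 
the check the DES could run at reset time. It is written in Python for brevity 
rather than as Cassandra code, and the function and variable names are 
hypothetical, not taken from the patch.

```python
# Hypothetical sketch: none of these names come from the Cassandra code base.

def hosts_needing_probe(live_hosts, samples_by_host):
    """Return the live hosts with no latency samples, in their given order."""
    return [host for host in live_hosts if not samples_by_host.get(host)]


live = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
samples = {"10.0.0.1": [1.2], "10.0.0.2": []}

# Hosts with an empty sample list, or no entry at all, would get an explicit probe.
to_probe = hosts_needing_probe(live, samples)
```

Each host returned here would be sent an explicit {{EchoMessage}}, which is 
the active messaging that the first item is wary of.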

I've been testing the reset logic on a local CCM cluster with a small (~5s) 
reset interval, using {{netem}} to delay traffic to a particular localhost 
node, and it appears to work well. The only issue I ran into is that if a node 
is really fast once and then becomes slow, it will get some traffic after 
every reset because we reset to the fast measurement. This is no worse than 
the status quo, but I tried to mitigate it by special-casing a host that has 
only two measurements (a fast one and a subsequent slow one) to use the mean 
instead of the minimum, so that the value eventually converges, either up or 
down, to the new RTT.
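
As a rough illustration of that reset rule (a Python sketch under my own 
assumptions, not the patch itself): the sample list per host, the function 
names, and the "one probe per reset interval" simulation are all hypothetical.

```python
from statistics import mean


def reset_samples(samples):
    """Collapse a host's latency samples to a single seed value on reset.

    Assumed rule: keep the minimum, except when there are exactly two
    measurements, in which case keep their mean.
    """
    if not samples:
        return []
    if len(samples) == 2:
        return [mean(samples)]
    return [min(samples)]


def simulate(seed_ms, probe_rtt_ms, resets):
    """Start from one sample, add one probe per interval, then reset."""
    samples = [seed_ms]
    for _ in range(resets):
        samples.append(probe_rtt_ms)
        samples = reset_samples(samples)
    return samples[0]


# A host that was fast once (1 ms) and is now slow (100 ms): the stored
# value moves halfway toward the new RTT on every reset.
after_one = simulate(1.0, 100.0, 1)
after_twenty = simulate(1.0, 100.0, 20)
```

With the minimum alone, the stored value would stay at the old fast 
measurement after every reset. With the two-measurement mean, it halves its 
distance to the new RTT on each reset in this simplified model.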

> DynamicEndpointSnitch should never prefer latent nodes
> ------------------------------------------------------
>
>                 Key: CASSANDRA-14459
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14459
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Joseph Lynch
>            Assignee: Joseph Lynch
>            Priority: Minor
>
> The DynamicEndpointSnitch has two unfortunate behaviors that allow it to 
> provide latent hosts as replicas:
>  # Loses all latency information when Cassandra restarts
>  # Clears latency information entirely every ten minutes (by default), 
> allowing global queries to be routed to _other datacenters_ (and local 
> queries cross racks/azs)
> This means that the first few queries after restart/reset could be quite slow 
> compared to average latencies. I propose we solve this by resetting to the 
> minimum observed latency instead of completely clearing the samples and 
> extending the {{isLatencyForSnitch}} idea to a three state variable instead 
> of two, in particular {{YES}}, {{NO}}, {{MAYBE}}. This extension allows 
> {{EchoMessages}} and {{PingMessages}} to send {{MAYBE}} indicating that the 
> DS should use those measurements if it only has one or fewer samples for a 
> host. This fixes both problems because on process restart we send out 
> {{PingMessages}} / {{EchoMessages}} as part of startup, and we would reset to 
> effectively the RTT of the hosts (also at that point normal gossip 
> {{EchoMessages}} have an opportunity to add an additional latency 
> measurement).
> This strategy also nicely deals with the "a host got slow but now it's fine" 
> problem that the DS resets were (afaik) designed to stop because the 
> {{EchoMessage}} ping latency will count only after the reset for that host. 
> Ping latency is a more reasonable lower bound on host latency (as opposed to 
> status quo of zero).
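
The three-state idea in the description above could look roughly like the 
following. This is a Python sketch rather than Cassandra code; the 
{{LatencyUse}} and {{LatencyTracker}} names and the per-host sample list are 
hypothetical.

```python
from collections import defaultdict
from enum import Enum


class LatencyUse(Enum):
    """Hypothetical three-state replacement for a boolean flag."""
    YES = "yes"      # always record the measurement
    NO = "no"        # never record the measurement
    MAYBE = "maybe"  # record only if the host has one or fewer samples


class LatencyTracker:
    def __init__(self):
        self.samples = defaultdict(list)

    def receive_timing(self, host, latency_ms, use):
        """Record the latency if the flag allows it; return whether it was kept."""
        if use is LatencyUse.NO:
            return False
        if use is LatencyUse.MAYBE and len(self.samples[host]) > 1:
            return False
        self.samples[host].append(latency_ms)
        return True


tracker = LatencyTracker()
kept = [
    tracker.receive_timing("10.0.0.2", 2.0, LatencyUse.MAYBE),  # no samples yet
    tracker.receive_timing("10.0.0.2", 2.5, LatencyUse.MAYBE),  # one sample
    tracker.receive_timing("10.0.0.2", 3.0, LatencyUse.MAYBE),  # two samples
    tracker.receive_timing("10.0.0.2", 4.0, LatencyUse.YES),
    tracker.receive_timing("10.0.0.2", 9.0, LatencyUse.NO),
]
```

In this sketch, ping and echo timings only seed a host that has almost no 
data, while regular request timings (marked {{YES}}) are always recorded.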



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

