[ https://issues.apache.org/jira/browse/CASSANDRA-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate McCall updated CASSANDRA-6908:
-----------------------------------
    Reproduced In: 2.1.9, 2.1.7  (was: 2.1.9)

> Dynamic endpoint snitch destabilizes cluster under heavy load
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-6908
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6908
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Config, Core
>            Reporter: Bartłomiej Romański
>         Attachments: as-dynamic-snitch-disabled.png
>
>
> We observe that with the dynamic snitch disabled our cluster is much more 
> stable than with the dynamic snitch enabled.
> We've got a 15-node cluster with pretty strong machines (2x E5-2620, 64 GB 
> RAM, 2x 480 GB SSD). We mostly do reads (about 300k/s).
> We use Astyanax on the client side with the TOKEN_AWARE option enabled. It 
> automatically directs read queries to one of the nodes responsible for the 
> given token.
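> For reference, the client-side setup looks roughly like this (a sketch from 
> memory; cluster name, seeds and pool sizes are just placeholders):
>
>     // Rough sketch of our Astyanax setup (names, seeds and sizes are placeholders).
>     import com.netflix.astyanax.AstyanaxContext;
>     import com.netflix.astyanax.Keyspace;
>     import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
>     import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
>     import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
>     import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
>     import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
>     import com.netflix.astyanax.thrift.ThriftFamilyFactory;
>
>     public class ClientSetup
>     {
>         public static Keyspace connect()
>         {
>             AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
>                 .forCluster("cluster")
>                 .forKeyspace("ks")
>                 .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
>                     .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
>                     // TOKEN_AWARE routes each read to a replica owning the token
>                     .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))
>                 .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("pool")
>                     .setPort(9160)
>                     .setMaxConnsPerHost(16)
>                     .setSeeds("10.0.0.1:9160,10.0.0.2:9160"))
>                 .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
>                 .buildKeyspace(ThriftFamilyFactory.getInstance());
>             context.start();
>             return context.getClient();
>         }
>     }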
> In that case, with the dynamic snitch disabled, Cassandra always handles 
> the read locally. With the dynamic snitch enabled, Cassandra very often 
> decides to proxy the read to some other node. This causes much higher CPU 
> usage and produces much more garbage, which results in more frequent GC 
> pauses (the young generation fills up more quickly). By "much higher" and 
> "much more" I mean 1.5-2x.
> I'm aware that a higher dynamic_snitch_badness_threshold value should solve 
> that issue. The default value is 0.1. I've looked at the scores exposed in 
> JMX and the problem is that our values seem to be completely random. They 
> are usually between 0.5 and 2.0, but they change randomly every time I hit 
> refresh.
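> For the record, I'm reading the Scores attribute of the 
> org.apache.cassandra.db:type=DynamicEndpointSnitch MBean. A minimal 
> standalone reader (host and port below are just my local setup) looks 
> roughly like this:
>
>     // Minimal JMX reader for the dynamic snitch scores (host/port are examples).
>     import java.util.Map;
>     import javax.management.MBeanServerConnection;
>     import javax.management.ObjectName;
>     import javax.management.remote.JMXConnector;
>     import javax.management.remote.JMXConnectorFactory;
>     import javax.management.remote.JMXServiceURL;
>
>     public class SnitchScores
>     {
>         public static void main(String[] args) throws Exception
>         {
>             JMXServiceURL url = new JMXServiceURL(
>                 "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
>             try (JMXConnector connector = JMXConnectorFactory.connect(url))
>             {
>                 MBeanServerConnection mbs = connector.getMBeanServerConnection();
>                 ObjectName name = new ObjectName("org.apache.cassandra.db:type=DynamicEndpointSnitch");
>                 Map<?, ?> scores = (Map<?, ?>) mbs.getAttribute(name, "Scores");
>                 for (Map.Entry<?, ?> e : scores.entrySet())
>                     System.out.println(e.getKey() + " -> " + e.getValue());
>             }
>         }
>     }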
> Of course, I can set dynamic_snitch_badness_threshold to 5.0 or something 
> like that, but the result will be similar to simply disabling the dynamic 
> snitch altogether (which is what we did).
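> For reference, these are the snitch-related knobs in cassandra.yaml that 
> I've been experimenting with (the values below are the defaults as far as I 
> can tell; dynamic_snitch itself is not listed in the shipped file and has 
> to be added by hand):
>
>     dynamic_snitch: true                        # set to false to disable the dynamic snitch
>     dynamic_snitch_update_interval_in_ms: 100   # how often the scores are recalculated
>     dynamic_snitch_reset_interval_in_ms: 600000
>     dynamic_snitch_badness_threshold: 0.1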
> I've tried to understand the logic behind these scores and I'm not sure I 
> get the idea...
> The score is a sum (without any multipliers) of two components:
> - the ratio of the given node's recent latency to the recent average node 
> latency
> - something called 'severity', which, if I've analyzed the code correctly, 
> is the result of BackgroundActivityMonitor.getIOWait() - the ratio of 
> "iowait" CPU time to total CPU time as reported in /proc/stat (the ratio is 
> multiplied by 100)
> In our case the second value is somewhere around 0-2%, but it varies quite 
> heavily from second to second.
> What's the idea behind simply adding these two values without any 
> multipliers (especially since the second one is a percentage while the 
> first one is not)? Are we sure this is the best possible way of calculating 
> the final score?
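> To make the unit mismatch concrete, here is a simplified sketch of the 
> calculation as I understand it (the code and names below are mine, not the 
> actual implementation):
>
>     // Simplified sketch of the per-endpoint score as I understand it
>     // (names and structure are mine, not Cassandra's).
>     public class ScoreSketch
>     {
>         /**
>          * @param nodeLatency    recent mean latency of the given node (ms)
>          * @param averageLatency recent average latency across nodes (ms)
>          * @param iowaitPercent  'severity': iowait / total CPU time, times 100
>          */
>         static double score(double nodeLatency, double averageLatency, double iowaitPercent)
>         {
>             double latencyRatio = nodeLatency / averageLatency; // dimensionless, around 1.0
>             double severity = iowaitPercent;                    // a percentage, 0-100
>             // The two terms are simply added, although one is a plain ratio and
>             // the other a percentage, so even 1-2% of iowait dwarfs the latency part.
>             return latencyRatio + severity;
>         }
>
>         public static void main(String[] args)
>         {
>             System.out.println(score(2.0, 2.0, 0.0)); // 1.0 - latency ratio alone
>             System.out.println(score(2.0, 2.0, 1.5)); // 2.5 - severity dominates
>         }
>     }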
> Is there a way to force Cassandra to use (much) longer samples? In our case 
> we probably need that to get stable values. The 'severity' is recalculated 
> every second. The mean latency is calculated based on some magic, hardcoded 
> values (ALPHA = 0.75, WINDOW_SIZE = 100).
> Am I right that there's no way to tune that without hacking the code?
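> To illustrate why I think the effective sample is so short, here is a toy 
> exponentially weighted average over a bounded window (again my own 
> simplification, not Cassandra's code); with ALPHA = 0.75 the most recent 
> sample almost decides the mean on its own:
>
>     import java.util.ArrayDeque;
>     import java.util.Deque;
>     import java.util.Iterator;
>
>     // Toy model of a bounded, exponentially weighted latency average.
>     public class EwmaSketch
>     {
>         static final double ALPHA = 0.75;   // weight given to the newest samples
>         static final int WINDOW_SIZE = 100; // number of samples kept
>
>         private final Deque<Double> window = new ArrayDeque<Double>();
>
>         void add(double latencyMs)
>         {
>             if (window.size() == WINDOW_SIZE)
>                 window.removeFirst();       // drop the oldest sample
>             window.addLast(latencyMs);
>         }
>
>         // Newer samples get exponentially more weight than older ones.
>         double mean()
>         {
>             double weightedSum = 0, weightTotal = 0, weight = 1.0;
>             for (Iterator<Double> it = window.descendingIterator(); it.hasNext(); )
>             {
>                 weightedSum += weight * it.next();
>                 weightTotal += weight;
>                 weight *= (1 - ALPHA);      // older samples fade out very quickly
>             }
>             return weightTotal == 0 ? 0 : weightedSum / weightTotal;
>         }
>
>         public static void main(String[] args)
>         {
>             EwmaSketch s = new EwmaSketch();
>             for (int i = 0; i < 99; i++)
>                 s.add(1.0);                 // a long run of 1 ms reads...
>             s.add(10.0);                    // ...and one 10 ms outlier
>             System.out.println(s.mean());   // ~7.75: one sample dominates
>         }
>     }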
> I'm aware that there's a dynamic_snitch_update_interval_in_ms property in 
> the config file, but that only determines how often the scores are 
> recalculated, not how long the samples are. Is that correct?
> To sum up, it would be really nice to have more control over the dynamic 
> snitch behavior, or at least to have the official option to disable it 
> described in the default config file (it took me some time to discover 
> that we can just disable it instead of hacking around with 
> dynamic_snitch_badness_threshold=1000).
> Currently, for some scenarios (like ours - optimized cluster, token-aware 
> client, heavy load) it causes more harm than good.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
