[
https://issues.apache.org/jira/browse/CASSANDRA-11738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387873#comment-15387873
]
Jeremiah Jordan commented on CASSANDRA-11738:
---------------------------------------------
One nit: the code in getSeverity was just relocated from
BackgroundActivityMonitor, but my preference would be to keep the assignment
("=") out of the if check in DES, i.e. change
{code}
VersionedValue event;
if (state != null && (event = state.getApplicationState(ApplicationState.SEVERITY)) != null)
{code}
to
{code}
if (state != null)
{
    VersionedValue event = state.getApplicationState(ApplicationState.SEVERITY);
    if (event != null)
    {
{code}
I don't feel strongly about it, but I think pulling the assignment out of the
condition makes the code more readable.
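For reference, here is a minimal, self-contained sketch of how the suggested form might look once completed. The types below are hypothetical stand-ins for the gossip classes (not the real Cassandra ones), and the 0.0 default and the parsing of the value as a double are assumptions made for illustration only.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-ins for the gossip types; only the shape of the
// null checks is the point of this sketch.
final class SeverityLookupSketch
{
    enum ApplicationState { SEVERITY }

    static final class VersionedValue
    {
        final String value;

        VersionedValue(String value)
        {
            this.value = value;
        }
    }

    static final class EndpointState
    {
        private final Map<ApplicationState, VersionedValue> states =
            new EnumMap<>(ApplicationState.class);

        void addApplicationState(ApplicationState key, VersionedValue value)
        {
            states.put(key, value);
        }

        VersionedValue getApplicationState(ApplicationState key)
        {
            return states.get(key);
        }
    }

    // Returns the parsed severity, or 0.0 (assumed default) when either the
    // endpoint state or the SEVERITY value is missing.
    static double getSeverity(EndpointState state)
    {
        if (state != null)
        {
            // The assignment lives on its own line instead of inside the if check.
            VersionedValue event = state.getApplicationState(ApplicationState.SEVERITY);
            if (event != null)
                return Double.parseDouble(event.value);
        }
        return 0.0;
    }
}
```

The behaviour is the same as with the combined check; the only difference is that the variable is declared and assigned in one place.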
I started cassci runs here:
http://cassci.datastax.com/view/Dev/view/zanson/job/JeremiahDJordan-11738-dtest/
http://cassci.datastax.com/view/Dev/view/zanson/job/JeremiahDJordan-11738-testall/
> Re-think the use of Severity in the DynamicEndpointSnitch calculation
> ---------------------------------------------------------------------
>
> Key: CASSANDRA-11738
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11738
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Jeremiah Jordan
> Assignee: Jonathan Ellis
> Priority: Minor
> Fix For: 3.x
>
> Attachments: 11738.txt
>
>
> CASSANDRA-11737 was opened to allow completely disabling the use of severity
> in the DynamicEndpointSnitch calculation, but that is a pretty big hammer.
> There is probably something we can do to better use the score.
> The issue seems to be that severity is given equal weight with latency in the
> current code, and also that severity is based only on disk IO. If you have a
> node that is CPU bound on something (say, catching up on LCS compactions
> because of bootstrap/repair/replace), the IO wait can be low while the latency
> to the node is high.
> Some ideas I had are:
> 1. Allowing a yaml parameter to tune how much impact the severity score has
> in the calculation.
> 2. Taking CPU load into account as well as IO Wait (this would probably help
> in the cases I have seen things go sideways)
> 3. Make the -D flag from CASSANDRA-11737 a yaml-level setting
> 4. Go back to just relying on latency and get rid of severity altogether.
> Now that we have rapid read protection, maybe just using latency is enough,
> as it can help where the predictive nature of IO wait would have been useful.
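As a rough illustration of idea 1 in the description above, a tunable weight could be applied to severity before it is added to the latency-based score. This is only a sketch under stated assumptions: the class, the method, and the severityWeight parameter are hypothetical and do not correspond to an existing yaml option or to the actual snitch code.

```java
// Hypothetical sketch: severityWeight stands in for a yaml-tunable knob.
final class WeightedSeveritySketch
{
    // latency and maxLatency are in the same unit; the latency part of the
    // score is normalised to [0, 1] relative to the slowest endpoint.
    static double score(double latency, double maxLatency, double severity, double severityWeight)
    {
        if (maxLatency <= 0)
            throw new IllegalArgumentException("maxLatency must be positive");

        double latencyScore = latency / maxLatency;

        // severityWeight = 1.0 gives severity equal weight with latency;
        // severityWeight = 0.0 ignores severity (idea 4 in the list).
        return latencyScore + severityWeight * severity;
    }
}
```

With a weight of 1.0 this adds severity at equal weight with latency, as the description says the current code does, and a weight of 0.0 falls back to latency only, so a single setting would cover ideas 1 and 4.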
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)