[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535402#comment-14535402
 ] 

Benedict commented on CASSANDRA-9318:
-------------------------------------

bq. We see lots of production deployments that log occasional messages about 
load shedding being triggered

Regrettably, this log message doesn't indicate that load shedding was 
sufficiently useful. There could be no load shedding for several minutes, or 
for an arbitrary interval, and then it could happen en masse. Seeing this 
message only indicates that load shedding was needed and fortunately happened 
in time to prevent the node failing, not that it is generally capable of 
preventing the node failing.

bq. Again – and I apologize if this was already clear

Apologies; I did not fully digest this aspect of your most recent response 
before responding myself. I'm not at all opposed to that, but I'm not sure why 
it isn't already preventing these problems from occurring in our simple tests. 
It currently bounds the number of in-flight requests low enough that we should 
be seeing these overloaded exceptions during a lengthy flush on another node, 
but we don't (again, referring to Ariel's recent test as an example).

I don't really see them as in conflict with one another, but it seems like this 
is something that could be lowered by operators? Especially with Jake's patch 
providing histograms of request size. What changes are you proposing? 

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
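
The high/low watermark scheme in the description above could be sketched 
roughly as follows. This is only an illustration under assumptions: the class 
name InFlightLimiter, its method names, and the watermark values are 
hypothetical and are not taken from Cassandra's code or from any patch on this 
ticket. It tracks bytes only; a request count could be tracked the same way. 
Actually disabling read on the client connection is left to the caller, which 
would act on the returned flag.

```java
/**
 * Hypothetical sketch of high/low watermark tracking for in-flight request
 * bytes. Not Cassandra's implementation; all names are illustrative.
 */
final class InFlightLimiter {
    private final long lowWatermarkBytes;
    private final long highWatermarkBytes;
    private long inFlightBytes = 0;
    private boolean readsPaused = false;

    InFlightLimiter(long lowWatermarkBytes, long highWatermarkBytes) {
        if (lowWatermarkBytes < 0 || lowWatermarkBytes >= highWatermarkBytes) {
            throw new IllegalArgumentException("need 0 <= low < high");
        }
        this.lowWatermarkBytes = lowWatermarkBytes;
        this.highWatermarkBytes = highWatermarkBytes;
    }

    /**
     * Record a newly accepted request. Returns true if client reads should
     * be paused, i.e. the outstanding bytes have reached the high watermark
     * (or were already paused and have not yet drained).
     */
    synchronized boolean onRequestStart(long requestBytes) {
        inFlightBytes += requestBytes;
        if (inFlightBytes >= highWatermarkBytes) {
            readsPaused = true;
        }
        return readsPaused;
    }

    /**
     * Record a completed request. Reads stay paused until the outstanding
     * bytes drop below the low watermark, which gives hysteresis and avoids
     * flapping around a single threshold.
     */
    synchronized boolean onRequestComplete(long requestBytes) {
        inFlightBytes -= requestBytes;
        if (inFlightBytes < lowWatermarkBytes) {
            readsPaused = false;
        }
        return readsPaused;
    }

    synchronized long inFlightBytes() {
        return inFlightBytes;
    }
}
```

Using two thresholds rather than one means a connection that has been paused 
is not re-enabled the moment a single request completes. Whether pausing reads 
this way introduces other issues, as the description asks, is not something 
this sketch addresses.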



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
