[
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535402#comment-14535402
]
Benedict commented on CASSANDRA-9318:
-------------------------------------
bq. We see lots of production deployments that log occasional messages about
load shedding being triggered
Regrettably, this log message doesn't tell us that the load shedding was
sufficient. There could be no load shedding for several minutes, or for an
arbitrary interval, and then it could happen en masse. Seeing this message
only indicates that load shedding was needed and fortunately happened in time
to prevent the node failing, not that it is generally capable of preventing
the node failing.
bq. Again – and I apologize if this was already clear
Apologies; I did not fully digest this aspect of your most recent response
before responding myself. I'm not at all opposed to that, but I'm not sure why
it isn't already preventing these problems from occurring in our simple tests.
It currently bounds the number of in-flight requests low enough that we should
be seeing these overloaded exceptions during a lengthy flush on another node,
but we don't (again, referring to Ariel's recent test as an example).
I don't really see them as in conflict with one another, but it seems like this
limit is something that operators could lower? Especially with Jake's patch
providing histograms of request size. What changes are you proposing?
> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
> Key: CASSANDRA-9318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Ariel Weisberg
> Assignee: Ariel Weisberg
> Fix For: 2.1.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding
> bytes and requests and if it reaches a high watermark disable read on client
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't
> introduce other issues.
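
For illustration only, the high/low watermark scheme in the description above
could be sketched roughly as follows. The class name InFlightWatermark, its
method names, and the thresholds are hypothetical; none of this is taken from
Cassandra or from a patch on this ticket. The sketch only tracks state and
says whether client reads should be paused; actually disabling read on a
connection is left to the caller.

```java
/**
 * Hypothetical sketch, not Cassandra code: tracks outstanding requests and
 * bytes and flips a "reads paused" flag using high/low watermarks.
 */
final class InFlightWatermark {
    private final long lowRequests, highRequests;
    private final long lowBytes, highBytes;

    private long requests;       // outstanding request count
    private long bytes;          // outstanding request bytes
    private boolean readsPaused; // true once a high watermark has been hit

    InFlightWatermark(long lowRequests, long highRequests,
                      long lowBytes, long highBytes) {
        if (lowRequests > highRequests || lowBytes > highBytes)
            throw new IllegalArgumentException("low watermark exceeds high watermark");
        this.lowRequests = lowRequests;
        this.highRequests = highRequests;
        this.lowBytes = lowBytes;
        this.highBytes = highBytes;
    }

    /**
     * Record a request read off a client connection.
     * Returns true if reads should be paused after accounting for it.
     */
    synchronized boolean onRequestStart(long size) {
        requests++;
        bytes += size;
        // Pause as soon as either dimension reaches its high watermark.
        if (requests >= highRequests || bytes >= highBytes)
            readsPaused = true;
        return readsPaused;
    }

    /**
     * Record a completed (or failed) request.
     * Returns true if reads should remain paused.
     */
    synchronized boolean onRequestComplete(long size) {
        requests--;
        bytes -= size;
        // Resume only when both dimensions are back at or below the low
        // watermarks, so the flag does not flap around a single threshold.
        if (requests <= lowRequests && bytes <= lowBytes)
            readsPaused = false;
        return readsPaused;
    }

    synchronized boolean readsPaused() {
        return readsPaused;
    }
}
```

A caller would invoke onRequestStart when a request arrives and
onRequestComplete when the response is sent, and stop or resume reading from
client connections whenever the returned flag changes. The gap between the
low and high watermarks is the hysteresis that keeps reads from being toggled
on every request.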
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)