[
https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055174#comment-14055174
]
T Jake Luciani commented on CASSANDRA-7402:
-------------------------------------------
[~rbranson] Yeah, that would certainly be ideal. The tricky part for us is that
we also have to deal with RF per keyspace, and non-CL.ONE reads will affect
multiple nodes. In my experience the aggregate traffic was the harder part.
Even when you have good clients, you can run out of heap when traffic spikes
cluster-wide and client + replica requests chew up the memory.
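To make the idea concrete, here is a minimal sketch of how a global on-heap request budget could be enforced at admission time. This is my own illustration, not code from a patch: the `RequestMemoryLimiter` class name, the per-request byte estimates, and the acquire/release protocol are all assumptions; the real accounting would have to estimate response sizes per replica and per consistency level.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a cluster node tracks the estimated heap held by all
// in-flight requests and sheds new work once the configured budget is full.
public class RequestMemoryLimiter
{
    private final long limitBytes;                      // e.g. total_request_memory_space_mb * 1MB
    private final AtomicLong inFlightBytes = new AtomicLong();

    public RequestMemoryLimiter(long limitBytes)
    {
        this.limitBytes = limitBytes;
    }

    /** Try to reserve an estimated byte cost; false means the request should be rejected/queued. */
    public boolean tryAcquire(long estimatedBytes)
    {
        while (true)
        {
            long current = inFlightBytes.get();
            if (current + estimatedBytes > limitBytes)
                return false;                           // budget exhausted, shed load
            if (inFlightBytes.compareAndSet(current, current + estimatedBytes))
                return true;                            // reservation succeeded
        }
    }

    /** Return the reservation once the response has been flushed. */
    public void release(long estimatedBytes)
    {
        inFlightBytes.addAndGet(-estimatedBytes);
    }

    public static void main(String[] args)
    {
        long mb = 1024L * 1024;
        RequestMemoryLimiter limiter = new RequestMemoryLimiter(400 * mb);
        System.out.println(limiter.tryAcquire(100 * mb)); // fits within the 400MB budget
        System.out.println(limiter.tryAcquire(350 * mb)); // would exceed the budget, rejected
        limiter.release(100 * mb);
    }
}
```

The CAS loop keeps the check-and-reserve atomic without a lock, which matters on a hot request path; the hard part in practice is estimating `estimatedBytes` before the data is read.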
> limit the on heap memory available to requests
> ----------------------------------------------
>
> Key: CASSANDRA-7402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
> Project: Cassandra
> Issue Type: Improvement
> Reporter: T Jake Luciani
> Fix For: 3.0
>
>
> When running a production cluster, one common operational issue is quantifying
> GC pauses caused by ongoing requests.
> Since different queries return varying amounts of data, you can easily get
> yourself into a situation where a couple of bad actors in the system trigger a
> stop-the-world pause. Or, more likely, the aggregate garbage generated on a
> single node across all in-flight requests causes a GC.
> We should be able to set a limit on the max heap we can allocate to all
> outstanding requests and track the garbage per request to stop this from
> happening. It should increase a single node's availability substantially.
> In the yaml this would be
> {code}
> total_request_memory_space_mb: 400
> {code}
> It would also be nice to have a log of the queries that generate the most
> garbage, as well as a histogram, so operators can track this.
--
This message was sent by Atlassian JIRA
(v6.2#6252)