[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062162#comment-15062162 ]

Ariel Weisberg commented on CASSANDRA-9318:
-------------------------------------------

To summarize what I have been experimenting with: I have been limiting disk 
throughput with a rate limiter, both with and without limiting the commit log. 
When I exclude the commit log it is because I don't want to bottleneck the 
ingest rate. I have also been experimenting with limiting the rate at which 
remotely delivered messages are consumed, which more or less forces the 
mutations to time out.
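
For concreteness, the throttling here is just a token-bucket limiter in front 
of the write path. A minimal sketch of the shape of it, using Guava's 
RateLimiter (the 25 MB/s figure and the ThrottledWriter name are placeholder 
assumptions, not what I actually ran):

{code:java}
import com.google.common.util.concurrent.RateLimiter;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ThrottledWriter
{
    // Permits are bytes; the refill rate caps sustained disk throughput.
    private final RateLimiter limiter = RateLimiter.create(25 * 1024 * 1024);

    public int write(FileChannel channel, ByteBuffer buf) throws IOException
    {
        // Block until enough byte-permits accumulate, then do the real write.
        if (buf.hasRemaining())
            limiter.acquire(buf.remaining());
        return channel.write(buf);
    }
}
{code}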

I added hint backpressure, and since then I have been unable to make the server 
fail or OOM from a slow disk or slow remote nodes. In the disk-bound case I am 
unable to get the server to start enough concurrent requests to OOM because the 
request stage is bounded to 128 concurrent requests.
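
That bound behaves like a counting semaphore guarding admission to the stage. 
A toy sketch of the effect (only the 128 figure comes from the real stage; the 
class is illustrative, and the real stage would hand the task to an executor 
rather than run it inline):

{code:java}
import java.util.concurrent.Semaphore;

public class BoundedStage
{
    private final Semaphore permits = new Semaphore(128);

    public void execute(Runnable request) throws InterruptedException
    {
        permits.acquire(); // blocks callers once 128 requests are in flight
        try
        {
            request.run(); // memory held by in-progress requests is bounded too
        }
        finally
        {
            permits.release();
        }
    }
}
{code}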

In the remote-mutation-bound case I think I am running out of ingest capacity 
on a single node, and that is preventing me from running up more requests in 
memory faster than they can be evicted by the 2-second timeout. With gigabit 
Ethernet the reason is obvious. With 10-gigabit Ethernet I would still be 
limited to 1.6 gigabytes of data, assuming timeouts are prompt. That's a hefty 
quantity, but something that could fit in a larger heap. I should be able to 
test that soon.
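
To make the arithmetic explicit: worst-case memory is just ingest rate times 
timeout, so the 1.6 gigabyte figure corresponds to an assumed effective ingest 
rate of about 0.8 GB/s (that rate is inferred from the numbers above, not 
measured):

{code:java}
public class InflightBound
{
    public static void main(String[] args)
    {
        // Worst-case bytes held in memory ~= effective ingest rate * timeout.
        double ingestBytesPerSec = 0.8e9; // effective rate implied by 1.6 GB / 2 s
        double timeoutSeconds = 2.0;      // write request timeout
        System.out.printf("max in-flight bytes ~= %.2e%n",
                          ingestBytesPerSec * timeoutSeconds); // ~1.6e9
        // At full 10GbE line rate (1.25e9 B/s) the same math gives ~2.5e9.
    }
}
{code}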

I noticed OutboundTcpConnection is another potential source of OOM since it can 
only drop messages if the thread isn't blocked writing to the socket. I am 
allowing it to wake up regularly in my testing, but if something is unavailable 
we would have to avoid OOMing before failure detection kicks in, which is a 
longer timespan. It might make sense to have another thread periodically check 
whether it should expire messages from the queue. This isn't something I can 
test without access to 10-gig; you would have to coordinate several hundred (or 
thousand?) megabytes of transaction data to OOM. This assumes you can get past 
the ingest limitations of gigabit Ethernet.
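
Roughly what I have in mind, sketched against a stand-in for the real queue 
rather than OutboundTcpConnection's actual internals (QueuedMessage and 
isTimedOut() are placeholders):

{code:java}
import java.util.Queue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExpirationReaper
{
    private final ScheduledExecutorService reaper =
        Executors.newSingleThreadScheduledExecutor();

    public void watch(Queue<QueuedMessage> backlog)
    {
        // Expire timed-out messages periodically instead of relying on the
        // writer thread, which may be blocked on a slow or dead socket.
        reaper.scheduleWithFixedDelay(() ->
            backlog.removeIf(QueuedMessage::isTimedOut),
            1, 1, TimeUnit.SECONDS);
    }

    interface QueuedMessage
    {
        boolean isTimedOut();
    }
}
{code}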

As for code changes, I think that disabling reads based on memory pressure is 
mostly harmless, but I can't demonstrate that it adds a lot of value given how 
things behave with a large enough heap and the network-level bounds on ingest 
rate.
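
For what it's worth, the mechanics of the watermark scheme from the ticket 
description are simple on top of Netty's auto-read toggle, which the native 
protocol transport already sits on. A minimal sketch (the thresholds and the 
InflightBytesGate name are placeholders, and a real version would need a 
per-channel rather than global policy):

{code:java}
import io.netty.channel.Channel;
import java.util.concurrent.atomic.AtomicLong;

public class InflightBytesGate
{
    private static final long HIGH = 512L * 1024 * 1024; // stop reading above
    private static final long LOW  = 256L * 1024 * 1024; // resume below
    private final AtomicLong inflight = new AtomicLong();

    public void onRequestStart(Channel ch, long bytes)
    {
        if (inflight.addAndGet(bytes) > HIGH)
            ch.config().setAutoRead(false); // stop reading new client requests
    }

    public void onRequestDone(Channel ch, long bytes)
    {
        if (inflight.addAndGet(-bytes) < LOW)
            ch.config().setAutoRead(true);  // drained below the low watermark
    }
}
{code}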

I do want to add backpressure to hints and the compressed commit log because 
that is an easy path to OOM. I'll package those changes up today.

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x, 2.2.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests, and if it reaches a high watermark, disable read on 
> client connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.



