[ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091963#comment-15091963 ]

Jonathan Ellis edited comment on CASSANDRA-9318 at 1/11/16 2:27 PM:
--------------------------------------------------------------------

Since most of that discussion is implementation details, I'll quote the 
relevant part:

bq. With consistency level less than ALL, mutation processing can move to the 
background (meaning the client was answered, but there is still work to do on 
behalf of the request). If the background request completion rate is lower than 
the incoming request rate, background requests will accumulate and eventually 
exhaust all memory resources. This patch aims to prevent that situation by 
monitoring how much memory all current background requests take and, when some 
threshold is passed, stopping moving requests to the background (by not 
replying to a client until either memory consumption moves below the threshold 
or the request is fully completed).
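
As a rough sketch of that mechanism (the class and method names here are 
illustrative assumptions, not the actual patch), the coordinator would track 
the memory held by in-flight background mutations and only acknowledge the 
client early while it is under the threshold:

{code:java}
// Hypothetical sketch of the backpressure gate described above.
import java.util.concurrent.atomic.AtomicLong;

public class BackgroundMutationGate
{
    private final long thresholdBytes;                 // e.g. 10% of the heap
    private final AtomicLong inFlightBytes = new AtomicLong();

    public BackgroundMutationGate(long thresholdBytes)
    {
        this.thresholdBytes = thresholdBytes;
    }

    /** Called when a mutation is retained for possible hinting. */
    public void onMutationRetained(long serializedSize)
    {
        inFlightBytes.addAndGet(serializedSize);
    }

    /** Called when all replicas have acked (or the mutation was hinted). */
    public void onMutationReleased(long serializedSize)
    {
        inFlightBytes.addAndGet(-serializedSize);
    }

    /**
     * If true, the coordinator may ack the client as soon as the consistency
     * level is satisfied; otherwise the reply is held back until memory drops
     * below the threshold or the mutation completes on all replicas.
     */
    public boolean mayMoveToBackground()
    {
        return inFlightBytes.get() < thresholdBytes;
    }
}
{code}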

bq. There are two main points where each background mutation consumes memory: 
holding the frozen mutation until the operation is complete (in order to hint 
if it does not complete), and sitting on the RPC queue to each replica until 
it is sent out on the wire. The patch accounts for both of those separately, 
limiting the former to 10% of total memory and the latter to 6M. Why 6M? The 
best answer I can give is why not :) But on a more serious note, the number 
should be small enough that all the data can be sent out in a reasonable 
amount of time, and a single shard cannot get even close to full bandwidth, so 
empirical evidence shows 6M to be a good number.



> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x, 2.2.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and, if it reaches a high watermark, disable read on 
> client connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
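
A minimal sketch of that watermark scheme, assuming a Netty-based native 
protocol layer; the class and method names are hypothetical:

{code:java}
// Hypothetical high/low watermark limiter: stop reading from client sockets
// when outstanding work passes the high watermark, resume once it drops
// below the low watermark.
import io.netty.channel.Channel;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class InflightRequestLimiter
{
    private final long highWatermarkBytes;
    private final long lowWatermarkBytes;
    private final AtomicLong outstandingBytes = new AtomicLong();
    private final Set<Channel> clientChannels = ConcurrentHashMap.newKeySet();

    public InflightRequestLimiter(long highWatermarkBytes, long lowWatermarkBytes)
    {
        this.highWatermarkBytes = highWatermarkBytes;
        this.lowWatermarkBytes = lowWatermarkBytes;
    }

    public void register(Channel channel)
    {
        clientChannels.add(channel);
    }

    public void onRequestStarted(long requestBytes)
    {
        if (outstandingBytes.addAndGet(requestBytes) >= highWatermarkBytes)
            setAutoRead(false);   // stop accepting new requests
    }

    public void onRequestCompleted(long requestBytes)
    {
        if (outstandingBytes.addAndGet(-requestBytes) <= lowWatermarkBytes)
            setAutoRead(true);    // resume reading from clients
    }

    private void setAutoRead(boolean enabled)
    {
        for (Channel channel : clientChannels)
            channel.config().setAutoRead(enabled);
    }
}
{code}

Keeping the high and low watermarks apart gives some hysteresis, so reads are 
not toggled on and off for every request once the node is near the limit.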



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
