[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371010#comment-15371010
 ] 

Jonathan Ellis commented on CASSANDRA-9318:
-------------------------------------------

If I understand this approach correctly, you're looking at the ratio of sent to 
acknowledged writes per replica, and throwing Unavailable if that gets too low 
for a given replica.  Very clever.

One thing that worries me: how do you distinguish between “node X is slow 
because we are writing too fast and we need to throttle clients down” and “node 
X is slow because it is dying, so we need to ignore it and accept writes based 
on other replicas”?

I.e. this seems to implicitly push everyone to a kind of CL.ALL model once your 
threshold triggers, where if one replica is slow then we can't make progress.

If we take a simpler approach of just bounding total outstanding requests to 
all replicas per coordinator, then we can avoid overload meltdown while 
allowing CL to continue to work as designed and tolerate as many slow replicas 
as the requested CL permits.
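
As a rough sketch of that simpler approach, a coordinator could keep one 
global budget of outstanding replica requests and reject (or stall) new client 
requests when the budget is exhausted. The class and method names below are 
hypothetical and are not Cassandra's actual classes; the limit value and the 
choice of a semaphore are assumptions made for illustration only.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical coordinator-wide limiter; not part of Cassandra's API. */
public class InFlightLimiter {
    private final int maxInFlight;
    private final Semaphore permits;

    public InFlightLimiter(int maxInFlight) {
        this.maxInFlight = maxInFlight;
        this.permits = new Semaphore(maxInFlight);
    }

    /**
     * Reserve one slot per replica message for a single client request.
     * Returns false, without blocking, when the coordinator is at its bound.
     */
    public boolean tryAcquire(int replicaCount) {
        return permits.tryAcquire(replicaCount);
    }

    /** Free one slot when a replica responds or its request times out. */
    public void release() {
        permits.release();
    }

    /** Number of replica requests currently outstanding. */
    public int inFlight() {
        return maxInFlight - permits.availablePermits();
    }
}
```

Because the budget is shared across all replicas rather than tracked per 
replica, a single slow node only holds its own slots until timeout; the write 
itself can still complete as soon as the requested CL is satisfied.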

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Sergio Bossa
>         Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, 
> limit.btm, no_backpressure.png
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
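
The high/low watermark idea in the description above can be sketched as a 
small hysteresis gate. This is only an illustration under stated assumptions: 
the class name, the byte-only accounting (the description also mentions 
request counts), and the threshold values are all hypothetical, and nothing 
here is taken from an actual patch.

```java
/** Hypothetical watermark gate; tracks outstanding request bytes only. */
public class WatermarkGate {
    private final long lowWatermark;
    private final long highWatermark;
    private long outstandingBytes = 0;
    private boolean readsEnabled = true;

    public WatermarkGate(long lowWatermark, long highWatermark) {
        if (lowWatermark > highWatermark) {
            throw new IllegalArgumentException("low watermark exceeds high watermark");
        }
        this.lowWatermark = lowWatermark;
        this.highWatermark = highWatermark;
    }

    /** Account for a newly accepted request; returns whether reads stay enabled. */
    public synchronized boolean onRequestStart(long bytes) {
        outstandingBytes += bytes;
        if (outstandingBytes >= highWatermark) {
            readsEnabled = false; // reached the high watermark: stop reading
        }
        return readsEnabled;
    }

    /** Account for a completed request; returns whether reads are enabled again. */
    public synchronized boolean onRequestDone(long bytes) {
        outstandingBytes -= bytes;
        if (outstandingBytes < lowWatermark) {
            readsEnabled = true; // dropped below the low watermark: resume reading
        }
        return readsEnabled;
    }
}
```

The caller would apply the returned flag to the client connections; in a 
Netty-based server, for example, that could mean toggling auto-read on each 
channel. The gap between the two watermarks keeps the gate from flapping when 
load hovers near a single threshold.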



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
