[jira] [Comment Edited] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

Jonathan Ellis (JIRA) Wed, 13 Jul 2016 14:54:40 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375752#comment-15375752
 ]


Jonathan Ellis edited comment on CASSANDRA-9318 at 7/13/16 9:54 PM:
--------------------------------------------------------------------

bq. if we make the strategy a bit more generic as mentioned above so the 
decision is made from all replicas involved (maybe the strategy should also keep 
track of the replica state completely internally so we can implement a basic 
strategy like a simple high watermark very easily), and we make sure 
not to throttle too quickly (typically, if a single replica is slow and we don't 
really need it, start by just hinting it), then I'd be happy moving to the 
"actually test this" phase and see how it goes.

I suppose that's reasonable in principle, with some caveats:

# Throwing exceptions shouldn't be part of the API.  OverloadedException dates 
from the Thrift days, where our flow control options were very limited and this 
was the best we could do to tell clients, "back off."  Now that we have our own 
protocol and full control over Netty, we should simply stop reading requests 
until we shed some load.  (Since shedding load is a gradual process -- requests 
time out, we write hints, our load goes down -- clients will just perceive this 
as the server slowing down, which is what we want.)
# The API should provide for reporting load to clients so they can do real load 
balancing across coordinators and not just round-robin.
# Throttling requests to the speed of the slowest replica is not something we 
should ship, even as an option.
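
To make the second caveat concrete, here is a minimal sketch of what a client 
could do with reported load.  It is written in Python for brevity and is not 
driver or Cassandra code; the node names, the load figures, and the idea that 
load is reported as a single number per coordinator are all assumptions made 
for illustration.

```python
import itertools


def round_robin(coordinators):
    """Baseline: cycle through coordinators, ignoring their load."""
    return itertools.cycle(coordinators)


def pick_least_loaded(reported_load):
    """Pick the coordinator with the lowest reported load.

    reported_load maps a coordinator name to a hypothetical load figure
    (for example, in-flight requests) that the server would report.
    Ties are broken by name so the choice is deterministic.
    """
    return min(reported_load, key=lambda name: (reported_load[name], name))


# Hypothetical load reports from three coordinators.
load = {"node-a": 120, "node-b": 15, "node-c": 48}
choice = pick_least_loaded(load)
```

With round-robin, node-a would keep receiving a third of the requests no matter 
how far behind it is; with load reports, the client can steer new requests 
toward node-b until the figures even out.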





> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Sergio Bossa
>         Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, 
> limit.btm, no_backpressure.png
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and if it reaches a high watermark disable read on client 
> connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
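
The high/low watermark scheme in the description can be sketched as follows.  
This is a language-neutral model written in Python, not the patch under review: 
the class name, method names, and thresholds are hypothetical.  In a Netty-based 
server the reads_enabled flag would presumably map to toggling auto-read on the 
client channels (ChannelConfig.setAutoRead), but that wiring is not shown here.

```python
class InflightLimiter:
    """Tracks outstanding requests and bytes against high/low watermarks."""

    def __init__(self, high_requests, low_requests, high_bytes, low_bytes):
        if low_requests > high_requests or low_bytes > high_bytes:
            raise ValueError("low watermark must not exceed high watermark")
        self.high_requests = high_requests
        self.low_requests = low_requests
        self.high_bytes = high_bytes
        self.low_bytes = low_bytes
        self.requests = 0
        self.bytes = 0
        self.reads_enabled = True

    def on_request_start(self, size):
        """Account for a new request; stop reading at a high watermark."""
        self.requests += 1
        self.bytes += size
        if self.requests >= self.high_requests or self.bytes >= self.high_bytes:
            self.reads_enabled = False

    def on_request_done(self, size):
        """Release a finished request; resume below both low watermarks."""
        self.requests -= 1
        self.bytes -= size
        if (not self.reads_enabled
                and self.requests < self.low_requests
                and self.bytes < self.low_bytes):
            self.reads_enabled = True
```

Using separate high and low thresholds gives hysteresis: reads are not switched 
on and off for every request that completes near the limit.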



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
