[ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg resolved CASSANDRA-9318.
---------------------------------------
    Resolution: Won't Fix

This ticket was specifically scoped to an implementation strategy that isn't 
going to solve the underlying issue: clients can submit more work than a 
cluster can handle, which results in timeouts and nodes appearing unresponsive 
because they can't do the work in time. We can stop the server from running out 
of memory and crashing, but we can't stop the client from submitting more 
requests than the server can handle, because we need nodes to effectively 
operate as write buffers for slow nodes to maintain availability.

At this point I am kind of with [Jonathan 
Shook|https://issues.apache.org/jira/browse/CASSANDRA-9318?focusedCommentId=14536846&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14536846]
 that shedding load (and writing hints) inside the DB is less useful for 
dealing with overload. I think it is useful for dealing with temporarily slow 
ranges on the hash ring, and it's part of the overall nodes-as-write-buffers 
strategy C* uses to maintain availability.

I found some ways to OOM the server (CASSANDRA-10971 and CASSANDRA-10972) and 
have patches out for those.

The number of in-flight requests already has bounds, depending on the 
bottleneck, that prevent the server from crashing, so adding an explicit one 
isn't useful right now. When TPC (thread per core) is implemented we will have 
to implement a bound, since there is no thread pool to exhaust, but that is 
later work.

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x, 2.2.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster 
> by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding 
> bytes and requests and, if either reaches a high watermark, disable reads on 
> client connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't 
> introduce other issues.
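
For reference, the high/low watermark idea in the description above can be 
sketched roughly as follows. This is only an illustration under assumptions, 
not Cassandra code and not the approach from any patch: the class name 
InFlightLimiter, its method names, and the thresholds are all hypothetical, 
and the actual "disable read on the client connection" step is left to the 
caller.

```python
class InFlightLimiter:
    """Hypothetical tracker for outstanding requests and bytes.

    Reads are disabled when either count reaches its high watermark and
    re-enabled only once both are back at or below their low watermarks,
    so the on/off state does not flap around a single threshold.
    """

    def __init__(self, high_requests, low_requests, high_bytes, low_bytes):
        if low_requests > high_requests or low_bytes > high_bytes:
            raise ValueError("low watermark must not exceed high watermark")
        self.high_requests = high_requests
        self.low_requests = low_requests
        self.high_bytes = high_bytes
        self.low_bytes = low_bytes
        self.requests = 0          # outstanding request count
        self.bytes = 0             # outstanding request bytes
        self.reads_enabled = True  # caller maps this onto the connection

    def on_request_start(self, size):
        """Account for a new request; return whether reads stay enabled."""
        self.requests += 1
        self.bytes += size
        if self.requests >= self.high_requests or self.bytes >= self.high_bytes:
            self.reads_enabled = False
        return self.reads_enabled

    def on_request_complete(self, size):
        """Account for a finished request; return whether reads are enabled."""
        self.requests -= 1
        self.bytes -= size
        if (not self.reads_enabled
                and self.requests <= self.low_requests
                and self.bytes <= self.low_bytes):
            self.reads_enabled = True
        return self.reads_enabled


if __name__ == "__main__":
    limiter = InFlightLimiter(high_requests=3, low_requests=1,
                              high_bytes=1000, low_bytes=100)
    for _ in range(3):
        enabled = limiter.on_request_start(10)
    print("reads enabled after 3 starts:", enabled)
    limiter.on_request_complete(10)
    print("reads enabled after 2 completions:", limiter.on_request_complete(10))
```

Note that a sketch like this only limits what a single coordinator accepts 
from its clients; as discussed in the resolution above, it does not stop 
clients from submitting more work than the cluster can handle.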



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
