[ 
https://issues.apache.org/jira/browse/CASSANDRA-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803613#action_12803613
 ] 

Jonathan Ellis commented on CASSANDRA-685:
------------------------------------------

We need to either (a) have a separate deserialization queue for "reply" traffic 
(we could use one of the "header" bits that isn't part of the Message proper to 
control this), (b) drop messages for overloaded stages on the floor so the 
deserializer doesn't overload, or (c) give up the command/reply division 
entirely.
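
Option (a) could be sketched roughly as follows. This is a minimal 
illustration, not Cassandra's actual code: the class name 
ReplyAwareDispatcher, the REPLY_BIT value, and the two injected executors are 
all hypothetical, and the sketch assumes a single header byte is available 
before the Message body is deserialized.

```java
import java.util.concurrent.Executor;

// Hypothetical sketch of option (a): route deserialization work to one of two
// executors based on a header bit. None of these names come from Cassandra.
class ReplyAwareDispatcher {
    // Assumed flag: one spare bit in the connection-level header marks a reply.
    static final int REPLY_BIT = 0x01;

    private final Executor commandDeserializer;
    private final Executor replyDeserializer;

    ReplyAwareDispatcher(Executor commandDeserializer, Executor replyDeserializer) {
        this.commandDeserializer = commandDeserializer;
        this.replyDeserializer = replyDeserializer;
    }

    // True when the header marks the message as a reply.
    static boolean isReply(int headerByte) {
        return (headerByte & REPLY_BIT) != 0;
    }

    // Replies get their own executor, so a backlog of commands cannot starve
    // the replies that other in-flight operations are waiting on.
    void dispatch(int headerByte, Runnable deserializeTask) {
        Executor target = isReply(headerByte) ? replyDeserializer : commandDeserializer;
        target.execute(deserializeTask);
    }
}
```

The point of the split is only that the two kinds of traffic stop sharing one 
queue; how each executor is sized is a separate question.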

Alternatively, option (b) reminds me that instead of "backpressure" we could 
just do "timeoutpressure": instead of overloaded stages backpressuring the 
message deserializer, which in turn backpressures socket reads, the 
deserializer can just discard messages the system is too busy to handle.  The 
downside is that it will take an extra rpc_timeout of latency before the 
clients start to get timeouts.  The upside is that as things unclog, the 
messages that get processed will be fresh ones, so we are less likely to waste 
work processing messages that the client isn't even waiting for anymore.
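
A minimal sketch of the "timeoutpressure" idea, assuming each stage sits 
behind a bounded queue: the deserializer uses a non-blocking offer() and 
discards the message when the queue is full. DroppingStage and its methods 
are hypothetical names for illustration only, and the sketch does not model 
rpc_timeout itself; the client simply never gets a reply for a dropped 
message.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a stage whose submit() drops work instead of blocking.
class DroppingStage {
    private final BlockingQueue<Runnable> queue;
    private final AtomicLong dropped = new AtomicLong();

    DroppingStage(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called by the deserializer. Never blocks: returns false and counts the
    // drop when the stage is too busy to accept the task.
    boolean submit(Runnable task) {
        if (queue.offer(task)) {
            return true;
        }
        dropped.incrementAndGet();
        return false;
    }

    // Called by the stage's worker; returns null when there is nothing to do.
    Runnable poll() {
        return queue.poll();
    }

    long droppedCount() {
        return dropped.get();
    }
}
```

Because the deserializer never waits on a full stage, socket reads keep 
draining, and whatever is accepted once the stage unclogs is recent traffic.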

Also, I'd like to dynamically adjust stage capacity based on the amount of work 
that gets processed, rather than have a fixed value that has to be manually 
tuned.  Not sure what that would look like -- none of the Java BlockingQueue 
classes have adjustable capacity post-construction.  But, stage enqueueing is 
only done in one place (by the deserializer executor) so we can one-off 
something if we have to.
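
One possible shape for such a one-off, as a hedged sketch: an unbounded queue 
guarded by a semaphore whose permit count can be raised or lowered at 
runtime. ResizableBoundedQueue and setCapacity are hypothetical names; the 
only JDK facts relied on are Semaphore.release(int) and the protected 
Semaphore.reducePermits(int). How the new capacity would be chosen from 
observed throughput is left open here.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Hypothetical sketch: a non-blocking bounded queue with adjustable capacity.
class ResizableBoundedQueue<T> {
    // Subclass only to expose the protected reducePermits method.
    private static final class AdjustableSemaphore extends Semaphore {
        private static final long serialVersionUID = 1L;
        AdjustableSemaphore(int permits) { super(permits); }
        void shrink(int n) { reducePermits(n); }
    }

    private final ConcurrentLinkedQueue<T> items = new ConcurrentLinkedQueue<>();
    private final AdjustableSemaphore slots;
    private int capacity;

    ResizableBoundedQueue(int capacity) {
        this.capacity = capacity;
        this.slots = new AdjustableSemaphore(capacity);
    }

    // Returns false instead of blocking when no free slot is available.
    boolean offer(T item) {
        if (!slots.tryAcquire()) {
            return false;
        }
        items.add(item);
        return true;
    }

    // Removes one item and frees its slot; returns null when empty.
    T poll() {
        T item = items.poll();
        if (item != null) {
            slots.release();
        }
        return item;
    }

    // Shrinking below the current size drives the permit count negative, so
    // offers fail until enough items have been drained.
    synchronized void setCapacity(int newCapacity) {
        int delta = newCapacity - capacity;
        if (delta > 0) {
            slots.release(delta);
        } else if (delta < 0) {
            slots.shrink(-delta);
        }
        capacity = newCapacity;
    }
}
```

Shrinking never evicts queued items; it only refuses new ones until the queue 
has drained below the new limit.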

> add backpressure to StorageProxy
> --------------------------------
>
>                 Key: CASSANDRA-685
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-685
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: 
> 0001-impose-stage-queue-limit-of-2048-operations-which-shou.txt, 
> 0002-make-TcpConnection.write-throw-WriteEnqueueException-i.txt
>
>
> Now that we have CASSANDRA-401 and CASSANDRA-488 there is one last piece: we 
> need to stop the target node from pulling mutations out of MessagingService 
> as fast as it can only to take up space in the mutation queue and eventually 
> fill up memory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
