[ https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765781#comment-16765781 ]
Benedict commented on CASSANDRA-15013:
--------------------------------------

There is a potential semantic change here that could have negative consequences for clients, so we need to be careful. Presently, clients can expect a response to every message they send to the server while they are connected; the server handles timeouts, failures, etc. and makes sure to respond to the client in some way (unless the connection is lost). If we change this, we would need to corroborate client behaviour: do they all gracefully time out outstanding requests, for instance, or do they kill them on a per-connection basis?

In either case, though, it is a shame to discard work that we've gone to the effort of performing. From a cluster stability perspective, this is more likely to put the cluster under steadily more pressure, rather than less, as the client will no doubt want to retry this work. It is better either to discard work that has yet to be initiated, or to try not to discard work at all and provide back pressure signals to the client.

I think it would also be more relevant as an initial step to remove the blocking behaviour on incoming, so that the eventLoop can always service the outgoing queue and prevent this build-up. There's a strong chance the build-up of outgoing messages you see is down to the eventLoop that must process it being blocked on offering work to the {{requestExecutor}}; by removing this block, the outgoing queue will not accumulate so readily. There are two options if we do this: stop reading from incoming channels when the {{requestExecutor}} is full, or throw {{OverloadedException}}. (Rough sketches of both options are appended below the quoted description.)

In my opinion, this is exactly what TCP back pressure is for, but we also have a world where clients have been depending on the server trying its best to never push back, so they have inadequate queueing models internally, with no support for noticing or handling this back pressure. This to me is a design flaw that should be addressed in clients, but we could mitigate it for now by increasing the size of our {{requestExecutor}} queue (which is actually unnecessarily small), or even making it unbounded and simply tracking the total number of bytes we have read off the wire but not yet answered.

Perhaps we could even make the choice of {{OverloadedException}} vs back pressure a connection-configurable option, so that clients with poor flow control can utilise {{OverloadedException}} to handle this, and those with better control can use normal TCP flow control mechanisms.

What do you think?

> Message Flusher queue can grow unbounded, potentially running JVM out of memory
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15013
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15013
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Sumanth Pasupuleti
>            Assignee: Sumanth Pasupuleti
>            Priority: Major
>             Fix For: 4.0, 3.0.x, 3.11.x
>
>         Attachments: heap dump showing each ImmediateFlusher taking upto 600MB.png
>
>
> This is a follow-up ticket out of CASSANDRA-14855, to make the Flusher queue bounded since, in its current state, items get added to the queue without any check on queue size, nor any check of the Netty outbound buffer's isWritable state.
> We are seeing this issue hit our production 3.0 clusters quite often.
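To make the first option concrete, here is a rough sketch (mine, not from any patch) of a Netty handler that dispatches without blocking the eventLoop and, on overload, either answers with {{OverloadedException}} or falls back to plain TCP back pressure via {{autoRead}}. All names here ({{BackPressureHandler}}, {{process}}, {{overloadedResponse}}, the {{throwOnOverload}} flag) are hypothetical, with the flag standing in for the connection-configurable option suggested above:

{code:java}
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import java.util.ArrayDeque;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;

public class BackPressureHandler extends ChannelInboundHandlerAdapter
{
    private final ThreadPoolExecutor requestExecutor; // bounded queue, AbortPolicy
    private final boolean throwOnOverload;            // hypothetical per-connection option
    private final ArrayDeque<Object> pending = new ArrayDeque<>(); // touched only on the eventLoop

    public BackPressureHandler(ThreadPoolExecutor requestExecutor, boolean throwOnOverload)
    {
        this.requestExecutor = requestExecutor;
        this.throwOnOverload = throwOnOverload;
    }

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object request)
    {
        try
        {
            // never block the eventLoop: a full queue raises RejectedExecutionException
            requestExecutor.execute(() -> process(ctx, request));
        }
        catch (RejectedExecutionException overloaded)
        {
            if (throwOnOverload)
            {
                // client opted in to fail-fast behaviour
                ctx.writeAndFlush(overloadedResponse(request));
            }
            else
            {
                // plain TCP back pressure: park the request and stop reading;
                // resumeReading() must then run on this eventLoop once the
                // executor signals spare capacity (e.g. via an afterExecute hook)
                pending.add(request);
                ctx.channel().config().setAutoRead(false);
            }
        }
    }

    // called on this channel's eventLoop once the executor has drained a little
    void resumeReading(ChannelHandlerContext ctx)
    {
        while (!pending.isEmpty())
        {
            try
            {
                Object request = pending.peek();
                requestExecutor.execute(() -> process(ctx, request));
                pending.poll();
            }
            catch (RejectedExecutionException stillFull)
            {
                return; // stay paused; retry on the next capacity signal
            }
        }
        ctx.channel().config().setAutoRead(true);
    }

    private void process(ChannelHandlerContext ctx, Object request)
    {
        // hypothetical: execute the request and write/flush the response
    }

    private Object overloadedResponse(Object request)
    {
        // hypothetical: an error response wrapping OverloadedException,
        // encoded by the native protocol pipeline
        return "Overloaded";
    }
}
{code}

The important property is that {{channelRead}} never blocks, so the eventLoop stays free to service the outgoing queue; overload surfaces either as an error to the client or as TCP back pressure, chosen per connection.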
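And for the unbounded-queue variant, a minimal sketch of the byte accounting (the class name and water marks are invented for illustration): track bytes read off the wire but not yet answered, pause reading past a high-water mark, and resume once we fall back under a low-water mark.

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical accounting for an unbounded requestExecutor queue: bound the
// bytes in flight rather than the number of queued tasks.
public class InflightByteTracker
{
    private final AtomicLong inflightBytes = new AtomicLong();
    private final long highWaterMark; // e.g. a fraction of heap; illustrative only
    private final long lowWaterMark;

    public InflightByteTracker(long highWaterMark, long lowWaterMark)
    {
        this.highWaterMark = highWaterMark;
        this.lowWaterMark = lowWaterMark;
    }

    /** Called as a request's bytes are read off the wire; true means pause reads. */
    public boolean onRead(long bytes)
    {
        return inflightBytes.addAndGet(bytes) >= highWaterMark;
    }

    /** Called once the response has been flushed; true means reads may resume. */
    public boolean onAnswered(long bytes)
    {
        return inflightBytes.addAndGet(-bytes) <= lowWaterMark;
    }
}
{code}

A handler would consult {{onRead}} in {{channelRead}} and {{onAnswered}} after flushing each response, toggling {{autoRead}} on the two transitions; this gives the memory bound the ticket asks for without rejecting work we have already read.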