[ https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769331#comment-16769331 ]
Jason Brown edited comment on CASSANDRA-15013 at 2/15/19 1:54 PM:
------------------------------------------------------------------
Yup, I agree the harder part, programming-wise, is the {{requestExecutor}}
stuff, so let's plow through that first. The {{OptionsMessage}}/client protocol
work is significantly easier, as I think we agree, but would that qualify as a
change to the native protocol, for which we need to wait for a major rev (as
in, 4.0)? Or are additive changes acceptable for previous native protocol
versions? We might have a policy or general advice around this, but I don't
know.
Either way, [~sumanth.pasupuleti] has enough to move forward for now, and we
can figure out the native-protocol-impacting pieces in parallel.
> Message Flusher queue can grow unbounded, potentially running JVM out of
> memory
> -------------------------------------------------------------------------------
>
> Key: CASSANDRA-15013
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15013
> Project: Cassandra
> Issue Type: Bug
> Components: Messaging/Client
> Reporter: Sumanth Pasupuleti
> Assignee: Sumanth Pasupuleti
> Priority: Major
> Fix For: 4.0, 3.0.x, 3.11.x
>
> Attachments: BlockedEpollEventLoopFromHeapDump.png,
> BlockedEpollEventLoopFromThreadDump.png, RequestExecutorQueueFull.png, heap
> dump showing each ImmediateFlusher taking upto 600MB.png
>
>
> This is a follow-up ticket to CASSANDRA-14855, to make the Flusher queue
> bounded. In the current state, items are added to the queue without any
> check on the queue size and without checking the netty outbound buffer's
> isWritable state.
> We are seeing this issue hit our production 3.0 clusters quite often.
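
For illustration only, here is a minimal sketch of the kind of bounding the
description asks for: refuse new items when the queue is full or when the
channel reports it is not writable. The class name BoundedFlushQueue and its
methods are hypothetical and are not Cassandra's Flusher; the BooleanSupplier
is an assumed stand-in for a check such as netty's Channel.isWritable().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.function.BooleanSupplier;

// Hypothetical illustration; not the actual Cassandra Flusher.
final class BoundedFlushQueue<T> {
    private final ArrayBlockingQueue<T> queue;
    // Assumed stand-in for a writability check such as Channel.isWritable().
    private final BooleanSupplier channelWritable;

    BoundedFlushQueue(int capacity, BooleanSupplier channelWritable) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.channelWritable = channelWritable;
    }

    /**
     * Tries to enqueue an item. Returns false when the channel is not
     * writable or the queue is already at capacity, so the caller can
     * apply backpressure instead of letting the queue grow.
     */
    boolean offer(T item) {
        if (!channelWritable.getAsBoolean()) {
            return false;
        }
        return queue.offer(item); // non-blocking; false when full
    }

    /** Removes and returns everything currently queued, for one flush. */
    List<T> drain() {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }

    int size() {
        return queue.size();
    }
}
```

What the caller does on a false return (pause reads, shed the request, or
report an overload error to the client) is the policy question discussed in
the comment above, and this sketch deliberately leaves it open.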