[jira] [Comment Edited] (CASSANDRA-15013) Message Flusher queue can grow unbounded, potentially running JVM out of memory

Sumanth Pasupuleti (JIRA) Thu, 16 May 2019 17:55:23 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841835#comment-16841835
 ]


Sumanth Pasupuleti edited comment on CASSANDRA-15013 at 5/17/19 12:54 AM:
--------------------------------------------------------------------------

I incorporated the feedback from your branch (naming and TODOs) and from the 
JIRA comments.
Here is the updated change: 
https://github.com/sumanth-pasupuleti/cassandra/commit/45e31829e839d7e74b08566d7e501a46ed818330.

Major changes:
* Dispatcher never queries the map to get the EndpointPayloadTracker; it uses 
the reference it already holds.
* FlushItem gets a reference to the corresponding Dispatcher, so it calls 
releaseItem on the right Dispatcher.
* I implemented tryRef and release, which manage the refCount on 
EndpointPayloadTracker and "should" be thread safe.
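
For readers following along, here is a minimal sketch of what a thread-safe 
tryRef/release pair could look like. It is not the code from the commit above: 
the class name RefCountedTracker, the initial count of 1, and the refCount() 
accessor are assumptions made for illustration only.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for EndpointPayloadTracker; NOT the code from the commit.
final class RefCountedTracker
{
    // Assumption: the count starts at 1, representing the reference held by the owning map.
    private final AtomicInteger refCount = new AtomicInteger(1);

    // Takes a reference unless the count has already dropped to zero.
    boolean tryRef()
    {
        while (true)
        {
            int current = refCount.get();
            if (current <= 0)
                return false; // already released; the caller must obtain a fresh tracker
            if (refCount.compareAndSet(current, current + 1))
                return true;  // CAS succeeded, so no release() slipped in between
        }
    }

    // Drops a reference; returns true only for the caller that released the last one.
    boolean release()
    {
        int remaining = refCount.decrementAndGet();
        if (remaining < 0)
            throw new IllegalStateException("released more times than referenced");
        return remaining == 0;
    }

    // Illustrative accessor, useful for tests.
    int refCount()
    {
        return refCount.get();
    }
}
```

The compare-and-set loop is what keeps tryRef from resurrecting a tracker whose 
count has already reached zero; a plain incrementAndGet could not rule that out.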


All UTs and DTests pass.
https://circleci.com/workflow-run/bb6b2eb6-daa6-41c1-9a3d-44b53bc7fb50



> Message Flusher queue can grow unbounded, potentially running JVM out of 
> memory
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15013
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15013
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Sumanth Pasupuleti
>            Assignee: Sumanth Pasupuleti
>            Priority: Normal
>              Labels: pull-request-available
>             Fix For: 4.0, 3.0.x, 3.11.x
>
>         Attachments: BlockedEpollEventLoopFromHeapDump.png, 
> BlockedEpollEventLoopFromThreadDump.png, RequestExecutorQueueFull.png, heap 
> dump showing each ImmediateFlusher taking upto 600MB.png
>
>
> This is a follow-up to CASSANDRA-14855 to make the Flusher queue bounded. 
> In its current state, items are added to the queue without any check on the 
> queue size or on the isWritable state of the Netty outbound buffer.
> We hit this issue quite often on our production 3.0 clusters.



