[ 
https://issues.apache.org/jira/browse/RATIS-726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956244#comment-16956244
 ] 

Tsz-wo Sze commented on RATIS-726:
----------------------------------

Good catch on the bug.  TimeoutScheduler is for scheduling timeout retries and 
has no knowledge about the requests.  We should fix the code using it.  Thanks 
a lot!
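
To illustrate the kind of caller-side fix meant here, below is a minimal
sketch, not the actual Ratis code. It assumes a plain
java.util.concurrent.ScheduledThreadPoolExecutor as a stand-in for
TimeoutScheduler; the Request and Pending classes and the queuedAfterReply()
method are hypothetical names made up for this example. The idea is that the
timeout task captures only a small holder, and the caller clears the holder
(and, where the scheduler allows it, cancels the task) as soon as the reply
arrives, so the request data is not pinned until the timeout fires.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class TimeoutLeakSketch {
  /** Hypothetical stand-in for a client request carrying chunk data. */
  static final class Request {
    final byte[] data;
    Request(int size) { this.data = new byte[size]; }
  }

  /** Small holder captured by the timeout task instead of the request. */
  static final class Pending {
    private final AtomicReference<Request> ref;
    Pending(Request r) { this.ref = new AtomicReference<>(r); }
    Request get() { return ref.get(); }
    /** Called by the request owner once the reply has arrived. */
    void release() { ref.set(null); }
  }

  /** Schedules a timeout, simulates a successful reply, returns queue size. */
  static int queuedAfterReply() {
    // Assumption: a JDK scheduler stands in for TimeoutScheduler.
    ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
    // Remove cancelled tasks from the queue at once, not when they expire.
    scheduler.setRemoveOnCancelPolicy(true);
    try {
      Pending pending = new Pending(new Request(1024));

      // The task references only the holder, never the request directly.
      ScheduledFuture<?> timeout = scheduler.schedule(() -> {
        Request r = pending.get();
        if (r != null) {
          System.out.println("timed out, retry " + r.data.length + " bytes");
        }
      }, 3, TimeUnit.SECONDS);

      // Reply arrived: drop the request reference and cancel the timeout.
      pending.release();
      timeout.cancel(false);

      return scheduler.getQueue().size();
    } finally {
      scheduler.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("queued timeout tasks: " + queuedAfterReply());
  }
}
```

Even if the scheduler in use offers no way to cancel a task, clearing the
holder on reply means the task that stays queued until the timeout keeps only
an empty Pending object alive rather than the request data.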

> TimeoutScheduler holds on to the raftClientRequest till it times out even 
> though request succeeds
> -------------------------------------------------------------------------------------------------
>
>                 Key: RATIS-726
>                 URL: https://issues.apache.org/jira/browse/RATIS-726
>             Project: Ratis
>          Issue Type: Bug
>          Components: client
>            Reporter: Shashikant Banerjee
>            Assignee: Tsz-wo Sze
>            Priority: Major
>
> While running freon with a 1-node Ratis setup, it was observed that the 
> TimeoutScheduler holds on to the raftClientObject for at least 3s (the 
> default requestTimeoutDuration) even though the request has been processed 
> successfully and acknowledged. This creates memory pressure and causes the 
> Ozone client to go OOM.
> Heap dump analysis for HDDS-2331 suggests that the timeout scheduler is 
> holding on to a total of 176 requests (88 writeChunk requests containing 
> actual data and 88 putBlock requests), although data is written 
> sequentially, key by key, in Ozone.
> Thanks [~adoroszlai] for helping to discover this.
> cc ~ [~ljain] [~msingh] [~szetszwo] [~jnpandey]
> A similar fix may be required in GrpcLogAppender as well, since it uses the 
> same TimeoutScheduler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
