[ 
https://issues.apache.org/jira/browse/HDDS-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-4404:
------------------------------------
    Description: 
From [~nanda619]'s analysis.

The ContainerReportPublisher thread runs periodically in the Datanode (default 
interval 60s) and adds a ContainerReport to the StateContext queue.
The heartbeat thread runs periodically (default interval 30s) and picks up the 
ContainerReport (if any) from StateContext.
Normally the ContainerReport is held in the Datanode StateContext only for a 
short time.
For Recon, a change was made in the Datanode so that the ContainerReport is 
cached in StateContext separately for each endpoint (i.e. SCM and Recon). As 
far as I can see, if Recon is configured in the Datanode and is slow to process 
reports, all the reports that are to be sent to Recon remain pending in the 
StateContext queue (a LinkedList).
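
To make the failure mode concrete, here is a minimal Java sketch of the 
per-endpoint caching described above. It is not the actual Ozone StateContext 
code: the class name ReportQueueSketch, its methods, and the endpoint names are 
hypothetical and exist only for illustration. It assumes each report is queued 
once per endpoint and removed only when that endpoint's heartbeat drains it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical stand-in for per-endpoint report caching; not Ozone's StateContext. */
final class ReportQueueSketch {
  // One unbounded queue per endpoint (assumption: mirrors the LinkedList mentioned above).
  private final Map<String, Queue<String>> pendingByEndpoint = new HashMap<>();

  ReportQueueSketch(List<String> endpoints) {
    for (String endpoint : endpoints) {
      pendingByEndpoint.put(endpoint, new LinkedList<>());
    }
  }

  /** Publisher side: every new report is queued once per endpoint. */
  synchronized void addReport(String report) {
    for (Queue<String> queue : pendingByEndpoint.values()) {
      queue.add(report);
    }
  }

  /** Heartbeat side: take everything that is pending for one endpoint. */
  synchronized List<String> drain(String endpoint) {
    Queue<String> queue = pendingByEndpoint.get(endpoint);
    List<String> drained = new ArrayList<>(queue);
    queue.clear();
    return drained;
  }

  /** Number of reports still waiting to be sent to the given endpoint. */
  synchronized int pending(String endpoint) {
    return pendingByEndpoint.get(endpoint).size();
  }

  public static void main(String[] args) {
    ReportQueueSketch ctx = new ReportQueueSketch(Arrays.asList("SCM", "Recon"));
    for (int i = 0; i < 5; i++) {
      ctx.addReport("container-report-" + i);
      ctx.drain("SCM"); // the SCM heartbeat keeps up
      // The Recon heartbeat is stalled here, so nothing drains its queue.
    }
    System.out.println("SCM pending: " + ctx.pending("SCM"));
    System.out.println("Recon pending: " + ctx.pending("Recon"));
  }
}
```

In this sketch the SCM queue stays empty while the Recon queue gains one entry 
per publisher cycle. Because nothing bounds the queue, a stalled endpoint makes 
its backlog grow for as long as the stall lasts.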

> Datanode can go OOM when a Recon or SCM Server is very slow in processing 
> reports.
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-4404
>                 URL: https://issues.apache.org/jira/browse/HDDS-4404
>             Project: Hadoop Distributed Data Store
>          Issue Type: Task
>          Components: Ozone Datanode
>    Affects Versions: 1.0.0
>            Reporter: Aravindan Vijayan
>            Priority: Critical
>         Attachments: Screen Shot 2020-10-26 at 11.24.09 PM.png
>


