[ 
https://issues.apache.org/jira/browse/HDDS-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874206#comment-17874206
 ] 

JiangHua Zhu commented on HDDS-11331:
-------------------------------------

I see two possible ways to address this problem:
1. Weaken the synchronization around StateContext#pipelineActions.
2. Improve the heartbeat reporting mechanism.
Do you have any suggestions? [~sammichen] [~szetszwo] [~adoroszlai]
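
To make option 1 more concrete, here is a rough sketch of what weakening the synchronization could look like. It is not taken from the Ozone code base: the class PendingActions, its methods, and the use of String in place of a real pipeline action are all hypothetical stand-ins. The idea is only that producers and the heartbeat thread hold the lock for a short, bounded time, and the report is built and sent after the lock is released.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the pending pipeline actions; not the real StateContext.
final class PendingActions {
  private final Object lock = new Object();
  private final Queue<String> queue = new ArrayDeque<>();

  // Producers hold the lock only long enough to enqueue one action.
  void add(String action) {
    synchronized (lock) {
      queue.add(action);
    }
  }

  // The heartbeat thread copies out at most `limit` actions under the lock.
  // Building and sending the report happens on the returned list, outside the lock.
  List<String> drain(int limit) {
    List<String> batch = new ArrayList<>();
    synchronized (lock) {
      while (batch.size() < limit && !queue.isEmpty()) {
        batch.add(queue.poll());
      }
    }
    return batch;
  }

  // Number of actions still waiting to be reported.
  int size() {
    synchronized (lock) {
      return queue.size();
    }
  }
}
```

With this shape, a slow report to SCM/Recon would not block threads that only need to add an action, because the critical section never covers network I/O. Whether this matches the contention seen in the jstack output would need to be checked against the attachment.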

> Datanode cannot report for a long time
> --------------------------------------
>
>                 Key: HDDS-11331
>                 URL: https://issues.apache.org/jira/browse/HDDS-11331
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: DN
>    Affects Versions: 1.4.0
>            Reporter: JiangHua Zhu
>            Assignee: JiangHua Zhu
>            Priority: Major
>         Attachments: 1505js.1
>
>
> This was observed on a live cluster.
> SCM shows that some Datanodes have not reported for a long time, and their status 
> is DEAD or STALE.
> A jstack dump shows the Datanode stuck in StateContext#pipelineActions, 
> so it cannot report to SCM/Recon.
> The jstack output has been uploaded as an attachment.



