[
https://issues.apache.org/jira/browse/HDDS-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874206#comment-17874206
]
JiangHua Zhu commented on HDDS-11331:
-------------------------------------
There are two possible ways to solve this problem:
1. Relax the synchronization on StateContext#pipelineActions.
2. Improve the heartbeat reporting mechanism.
Do you have any suggestions? [~sammichen] [~szetszwo] [~adoroszlai]
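To make the first option concrete, here is a rough sketch of what relaxing the lock could look like. It is not the actual StateContext code: the class name PipelineActionBuffer, the String endpoint and action types, and the drain semantics are all assumptions made for illustration. The idea is to lock one endpoint's queue for a short check-and-add or drain instead of holding a single monitor over the whole structure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the pipeline-action bookkeeping.
// This is NOT the real StateContext; names and types are assumed.
class PipelineActionBuffer {
  // One queue per endpoint. The map itself is concurrent, so looking up
  // an endpoint never waits on work being done for another endpoint.
  private final Map<String, Queue<String>> actions = new ConcurrentHashMap<>();

  // Adds the action unless an equal one is already pending for the endpoint.
  // Returns true if the action was added.
  boolean addIfAbsent(String endpoint, String action) {
    Queue<String> queue =
        actions.computeIfAbsent(endpoint, k -> new ArrayDeque<>());
    // Lock only this endpoint's queue, and only for the check-then-add.
    synchronized (queue) {
      if (queue.contains(action)) {
        return false;
      }
      return queue.add(action);
    }
  }

  // Removes and returns up to max pending actions for the endpoint.
  // The lock is released before the caller builds or sends any report.
  List<String> drain(String endpoint, int max) {
    List<String> out = new ArrayList<>();
    Queue<String> queue = actions.get(endpoint);
    if (queue == null) {
      return out;
    }
    synchronized (queue) {
      String action;
      while (out.size() < max && (action = queue.poll()) != null) {
        out.add(action);
      }
    }
    return out;
  }
}
```

With this shape, a heartbeat thread for one endpoint would only contend with writers for the same endpoint, and only for the duration of the drain. Whether that matches the contention seen in the attached jstack would still need to be checked.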
> Datanode cannot report for a long time
> --------------------------------------
>
> Key: HDDS-11331
> URL: https://issues.apache.org/jira/browse/HDDS-11331
> Project: Apache Ozone
> Issue Type: Improvement
> Components: DN
> Affects Versions: 1.4.0
> Reporter: JiangHua Zhu
> Assignee: JiangHua Zhu
> Priority: Major
> Attachments: 1505js.1
>
>
> This was observed on a production cluster.
> SCM shows that some Datanodes have not reported for a long time, and their status
> is DEAD or STALE.
> I captured a jstack dump, which shows that the Datanode is stuck on
> StateContext#pipelineActions and cannot report to SCM/Recon.
> The jstack output has been uploaded as an attachment.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)