[
https://issues.apache.org/jira/browse/HDFS-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090748#comment-17090748
]
Toshihiko Uchida edited comment on HDFS-14353 at 4/23/20, 5:09 PM:
-------------------------------------------------------------------
[~ayushtkn] Thanks!
> what I remember the xmitsInProgress shouldn't go negative
Right.
> The <= change might not verify the said issue
Let me explain my suggestion in more detail. I believe it is also what
[~maobaolong] intended.
The waitFor line makes sure that the DataNodes have completed all EC
reconstruction tasks:
- Each DataNode runs the DataXceiverServer thread in the dataXceiverServer
threadGroup;
- The thread runs DataXceiver threads in the same threadGroup when it
receives/sends data;
- After all EC reconstruction tasks finish, the DataXceiverServer thread should
be the only thread belonging to the threadGroup (i.e., curDn.getXceiverCount()
== 1).
The reason for <= 1, rather than == 1, is that the DataXceiverServer thread is
not running on the dead DataNode, which was shut down to trigger EC
reconstruction.
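To illustrate the reasoning above, here is a minimal, self-contained sketch. It
is not the actual test code. It assumes that the xceiver count is simply the
number of active threads in the node's thread group, and every name in it
(XceiverCountSketch, startNode, xceiverCount, and the local waitFor helper) is
hypothetical and stands in for the real DataNode and test utilities.

```java
import java.util.concurrent.CountDownLatch;
import java.util.function.BooleanSupplier;

public class XceiverCountSketch {

  /** Polls the condition until it holds or the timeout elapses. */
  static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!check.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        return false;
      }
      Thread.sleep(intervalMs);
    }
    return true;
  }

  /** Assumption: the count is the number of active threads in the group. */
  static int xceiverCount(ThreadGroup group) {
    return group.activeCount();
  }

  /**
   * Simulates a node. A live node runs one long-lived "server" thread plus
   * short-lived "worker" threads in the same group; a dead node runs nothing.
   */
  static ThreadGroup startNode(String name, boolean alive, int workers,
      CountDownLatch stop) {
    ThreadGroup group = new ThreadGroup(name);
    if (!alive) {
      return group; // dead node: no server thread, so the count stays below 1
    }
    Thread server = new Thread(group, () -> {
      try {
        stop.await(); // stays alive until the simulation is stopped
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, name + "-server");
    server.setDaemon(true);
    server.start();
    for (int i = 0; i < workers; i++) {
      Thread worker = new Thread(group, () -> {
        try {
          Thread.sleep(200); // stands in for one data transfer
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }, name + "-worker-" + i);
      worker.setDaemon(true);
      worker.start();
    }
    return group;
  }

  public static void main(String[] args) throws Exception {
    CountDownLatch stop = new CountDownLatch(1);
    ThreadGroup live = startNode("live", true, 3, stop);
    ThreadGroup dead = startNode("dead", false, 0, stop);

    // Waiting for "count <= 1" covers both the live and the dead node.
    boolean liveIdle = waitFor(() -> xceiverCount(live) <= 1, 50, 5000);
    boolean deadIdle = waitFor(() -> xceiverCount(dead) <= 1, 50, 5000);

    System.out.println("live idle: " + liveIdle + ", count=" + xceiverCount(live));
    System.out.println("dead idle: " + deadIdle + ", count=" + xceiverCount(dead));
    stop.countDown();
  }
}
```

In this sketch, a condition of == 1 would never become true for the dead node,
which is why the wait uses <= 1.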
I have attached a patch that negates the condition curDn.getXceiverCount() > 1.
Please kindly review.
> Erasure Coding: metrics xmitsInProgress become to negative.
> -----------------------------------------------------------
>
> Key: HDFS-14353
> URL: https://issues.apache.org/jira/browse/HDFS-14353
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, erasure-coding
> Affects Versions: 3.3.0
> Reporter: maobaolong
> Assignee: maobaolong
> Priority: Major
> Attachments: HDFS-14353.001.patch, HDFS-14353.002.patch,
> HDFS-14353.003.patch, HDFS-14353.004.patch, HDFS-14353.005.patch,
> HDFS-14353.006.patch, HDFS-14353.007.patch, HDFS-14353.008.patch,
> HDFS-14353.009.patch, HDFS-14353.010.patch, HDFS-14353.010.patch,
> screenshot-1.png
>
>