[
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969862#comment-16969862
]
Chen Zhang commented on HDFS-12288:
-----------------------------------
Thanks [~elgoiri] for the comments. I've updated the patch to v6 to address
comments 1 and 2.
{quote} * What's the deal with TestNamenodeCapacityReport?{quote}
Before the patch, every live DataNode reported at least 1 xceiver to the
NameNode. That was not a real xceiver, though; it was just the
{{DataXceiverServer}} thread, which is added to {{threadGroup}} during
DataNode initialization. After the patch, the xceiver count is the
*real* number of data-transfer threads, so an idle DataNode reports an
xceiver count of 0 to the NameNode.
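
To illustrate the difference, here is a minimal standalone sketch (not the patch itself). The class {{XceiverCountSketch}}, the {{activeXceivers}} counter, and the "acceptor" thread are hypothetical stand-ins for the DataNode internals; the only point is that a thread-group count includes an idle server thread, while an explicit counter stays at 0 until a transfer thread actually runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class XceiverCountSketch {
    // Hypothetical stand-in for a counter of running data-transfer threads.
    static final AtomicInteger activeXceivers = new AtomicInteger();

    // Starts a long-lived acceptor thread in the group, the way a server
    // thread sits in a thread group even when no transfer is in progress.
    static Thread startAcceptor(ThreadGroup group, CountDownLatch stop)
            throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        Thread acceptor = new Thread(group, () -> {
            started.countDown();
            try {
                stop.await(); // idle until told to shut down
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
        started.await(); // make sure the thread is running before we count
        return acceptor;
    }

    public static void main(String[] args) throws Exception {
        ThreadGroup group = new ThreadGroup("xceiverGroup");
        CountDownLatch stop = new CountDownLatch(1);
        Thread acceptor = startAcceptor(group, stop);

        // The thread-group estimate includes the idle acceptor thread.
        System.out.println("ThreadGroup.activeCount(): " + group.activeCount());
        // The explicit counter only moves when a transfer thread starts or ends.
        System.out.println("explicit counter: " + activeXceivers.get());

        stop.countDown();
        acceptor.join();
    }
}
```

With nothing but the acceptor running, the thread-group count is non-zero while the explicit counter is still 0, which is the same gap the idle DataNode shows before and after the patch.
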
{{TestNamenodeCapacityReport}} assumes there is at least 1 xceiver for
each DataNode and relies on this to check the number of live nodes in the
cluster, so we need to update the test along with this patch.
> Fix DataNode's xceiver count calculation
> ----------------------------------------
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, hdfs
> Reporter: Lukas Majercak
> Assignee: Chen Zhang
> Priority: Major
> Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch,
> HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch,
> HDFS-12288.006.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that the method is
> only a very rough estimate, and in reality returns the total number of
> threads in the thread group as opposed to the threads actually running.
> On some DNs, we saw it return ~50 for a long time, even though the
> actual number of DataXceiver threads was next to none.
> This is a big issue as we use the xceiverCount to make decisions on the NN
> for choosing replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value,
> which only accounts for the actual number of DataXceiver threads currently
> running and thus represents the load on the DN much better.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)