[ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969862#comment-16969862
 ] 

Chen Zhang commented on HDFS-12288:
-----------------------------------

Thanks [~elgoiri] for the comments. I updated the patch to v6 to address 
comments 1 and 2.
{quote} * What's the deal with TestNamenodeCapacityReport?{quote}
Before the patch, every live DataNode reports at least 1 xceiver to the 
NameNode, but that is not a real xceiver; it is just the 
{{DataXceiverServer}} thread, which is added to {{threadGroup}} during 
DataNode initialization. After the patch, the xceiver count is the 
*real* number of data-transfer threads, so an idle DataNode reports an 
xceiver count of 0 to the NameNode.

{{TestNamenodeCapacityReport}} assumes that each DataNode has at least 1 
xceiver and uses that assumption to check the number of live nodes in the 
cluster, so the test needs to be updated along with this patch.
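
To illustrate the difference, here is a minimal standalone sketch. It is not the HDFS code: the class {{XceiverCountSketch}}, the method {{sampleCounts}}, and the {{activeXceivers}} counter are hypothetical names made up for this example. It puts one long-lived "server" thread and some worker threads in the same {{ThreadGroup}}, then compares {{ThreadGroup.activeCount()}} with an explicit counter that only the workers update.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class XceiverCountSketch {

    /**
     * Starts one idle "server" thread plus the given number of simulated
     * transfer threads in a single ThreadGroup, then returns
     * {ThreadGroup.activeCount(), explicit counter value}.
     */
    static int[] sampleCounts(int transfers) {
        ThreadGroup threadGroup = new ThreadGroup("dataXceiverServer");
        AtomicInteger activeXceivers = new AtomicInteger();   // hypothetical counter
        CountDownLatch started = new CountDownLatch(1 + transfers);
        CountDownLatch release = new CountDownLatch(1);

        // Stand-in for the accept-loop thread: it lives in the group
        // but never touches the explicit counter.
        Thread server = new Thread(threadGroup, () -> {
            started.countDown();
            awaitQuietly(release);
        }, "DataXceiverServer");
        server.start();

        // Stand-ins for data-transfer threads: each one counts itself
        // for exactly as long as it is doing "work".
        Thread[] workers = new Thread[transfers];
        for (int i = 0; i < transfers; i++) {
            workers[i] = new Thread(threadGroup, () -> {
                activeXceivers.incrementAndGet();
                started.countDown();
                try {
                    awaitQuietly(release);
                } finally {
                    activeXceivers.decrementAndGet();
                }
            }, "DataXceiver-" + i);
            workers[i].start();
        }

        try {
            started.await();                                   // all threads are running
            int groupCount = threadGroup.activeCount();        // includes the server thread
            int counterValue = activeXceivers.get();           // transfers only
            release.countDown();
            server.join();
            for (Thread w : workers) {
                w.join();
            }
            return new int[] {groupCount, counterValue};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] idle = sampleCounts(0);
        System.out.println("idle: activeCount=" + idle[0] + ", counter=" + idle[1]);
        int[] busy = sampleCounts(2);
        System.out.println("busy: activeCount=" + busy[0] + ", counter=" + busy[1]);
    }
}
```

In the idle case the thread group still counts the server thread while the explicit counter stays at 0, which is the same off-by-one that the test was relying on.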

> Fix DataNode's xceiver count calculation
> ----------------------------------------
>
>                 Key: HDFS-12288
>                 URL: https://issues.apache.org/jira/browse/HDFS-12288
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs
>            Reporter: Lukas Majercak
>            Assignee: Chen Zhang
>            Priority: Major
>         Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch, 
> HDFS-12288.003.patch, HDFS-12288.004.patch, HDFS-12288.005.patch, 
> HDFS-12288.006.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that the method is 
> only a very rough estimate, and in reality returns the total number of 
> threads in the thread group as opposed to the threads actually running.
> In some DNs, we saw this return ~50 for a long time, even though the 
> actual number of DataXceiver threads was next to none.
> This is a big issue as we use the xceiverCount to make decisions on the NN 
> for choosing replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value 
> which only accounts for the actual number of DataXceiver threads currently 
> running and thus represents the load on the DN much better.



