[ 
https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123794#comment-16123794
 ] 

Hanisha Koneru commented on HDFS-12288:
---------------------------------------

Thanks [~shahrs87] for the review.
[~lukmajercak], {{DataNode#threadGroup}} does not contain only {{DataXceiver}} 
threads. It also holds the following daemons:
- {{DataXceiverServer}}
- {{BlockRecoveryWorker#recoverBlocks()}}
- {{BlockReceiver#PacketResponder}}

So if we change {{DataNodeMetrics#getDataNodeActiveXceiversCount}} to 
reflect only the DataXceiver thread count, we lose the count of the other 
active threads in the thread group. 
I would suggest three metrics to capture all the thread counts:
- {{dataNodeActiveXceiversCount}} for active DataXceiver threads
- {{dataNodePacketResponderCount}} for active PacketResponder threads
- {{dataNodeActiveThreadCount}} for all the active threads in the datanode.
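
To illustrate the per-type counts, here is a minimal sketch and not the actual patch. It assumes each thread type increments its own counter on entry to run() and decrements it on exit. The class {{ThreadCountSketch}}, the two counters and the helper {{counted}} are hypothetical names that do not exist in Hadoop. The third, group-wide count would still have to come from the thread group itself.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one counter per thread type instead of one
// group-wide estimate. None of these names are real Hadoop APIs.
final class ThreadCountSketch {
    // Stand-ins for the proposed per-type metrics.
    static final AtomicInteger activeXceivers = new AtomicInteger();
    static final AtomicInteger activePacketResponders = new AtomicInteger();

    // Wraps a task so the counter covers exactly the time the task runs.
    static Runnable counted(AtomicInteger counter, Runnable body) {
        return () -> {
            counter.incrementAndGet();
            try {
                body.run();
            } finally {
                counter.decrementAndGet();
            }
        };
    }

    // Starts two "xceiver" threads and one "responder" thread in a single
    // thread group. Returns {xceivers while running, responders while
    // running, xceivers after exit, responders after exit}.
    static int[] runDemo() {
        ThreadGroup group = new ThreadGroup("sketch-group");
        CountDownLatch started = new CountDownLatch(3);
        CountDownLatch release = new CountDownLatch(1);
        Runnable body = () -> {
            started.countDown();
            try {
                release.await(); // stay alive until the counts are read
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread[] threads = {
            new Thread(group, counted(activeXceivers, body)),
            new Thread(group, counted(activeXceivers, body)),
            new Thread(group, counted(activePacketResponders, body)),
        };
        try {
            for (Thread t : threads) {
                t.start();
            }
            started.await(); // every thread has incremented its counter
            int xceivers = activeXceivers.get();
            int responders = activePacketResponders.get();
            release.countDown();
            for (Thread t : threads) {
                t.join();
            }
            return new int[] {xceivers, responders,
                activeXceivers.get(), activePacketResponders.get()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = runDemo();
        System.out.println("xceivers=" + r[0] + " responders=" + r[1]);
    }
}
```

All three threads share one thread group here, yet each counter reports only its own thread type, which is the separation the three metrics are meant to give.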

 [~shahrs87], please correct me if I am wrong.

> Fix DataNode's xceiver count calculation
> ----------------------------------------
>
>                 Key: HDFS-12288
>                 URL: https://issues.apache.org/jira/browse/HDFS-12288
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs
>            Reporter: Lukas Majercak
>            Assignee: Lukas Majercak
>         Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that the method is 
> only a very rough estimate, and in reality returns the total number of 
> threads in the thread group as opposed to the threads actually running.
> In some DNs, we saw this return ~50 for a long time, even though the 
> actual number of DataXceiver threads was next to none.
> This is a big issue as we use the xceiverCount to make decisions on the NN 
> for choosing replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, 
> which only accounts for the actual number of DataXceiver threads currently 
> running and thus represents the load on the DN much better.



