[ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123483#comment-16123483 ]
Rushabh S Shah commented on HDFS-12288:
---------------------------------------
{noformat}
// the load for writers is 2 because both the write xceiver & packet
// responder threads are counted in the load
expectedTotalLoad += fileRepl;
expectedInServiceLoad += fileRepl;
{noformat}
This comment is there for a reason.
When we receive a block, two threads are created: one is the DataXceiver
thread and the other is the Packet Responder thread.
If we use {{DataNodeMetrics#getDataNodeActiveXceiversCount}} as a
replacement for {{activeThreadCount}}, then we also need to add
{{PacketResponderThread}} to {{DataNodeMetrics#dataNodeActiveXceiversCount}}.
Otherwise each writer is counted as a load of 1 instead of 2, and we will
create twice the number of threads compared to today.
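To make the accounting concern concrete, here is a minimal sketch. It is a hypothetical model and not DataNode code: the class {{XceiverLoadSketch}}, its methods, and both counters are invented for illustration. It assumes, as described above, that each block write starts one DataXceiver thread and one Packet Responder thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of per-writer load accounting; not actual DataNode code.
final class XceiverLoadSketch {
    // Counts DataXceiver threads only (what the metric is described as tracking).
    private final AtomicInteger xceiversOnly = new AtomicInteger();
    // Counts DataXceiver and Packet Responder threads (what the thread group sees).
    private final AtomicInteger xceiversAndResponders = new AtomicInteger();

    // Assumption: one block write starts one xceiver and one packet responder.
    void startBlockWrite() {
        xceiversOnly.incrementAndGet();
        xceiversAndResponders.addAndGet(2);
    }

    // Both threads of a finished block write go away together in this model.
    void finishBlockWrite() {
        xceiversOnly.decrementAndGet();
        xceiversAndResponders.addAndGet(-2);
    }

    int loadFromXceiversOnly() {
        return xceiversOnly.get();
    }

    int loadFromAllThreads() {
        return xceiversAndResponders.get();
    }

    public static void main(String[] args) {
        XceiverLoadSketch dn = new XceiverLoadSketch();
        int writers = 3; // arbitrary example value
        for (int i = 0; i < writers; i++) {
            dn.startBlockWrite();
        }
        System.out.println("xceivers only: " + dn.loadFromXceiversOnly());
        System.out.println("xceivers + responders: " + dn.loadFromAllThreads());
    }
}
```

With three writers, the first counter reads 3 and the second reads 6. A consumer that treats either number as "load" would see half the load per writer after switching to the xceiver-only count, unless the responder threads are added to it.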
{noformat}
- return threadGroup == null ? 0 : threadGroup.activeCount();
+ return metrics == null ? 0 : metrics.getDataNodeActiveXceiversCount();
{noformat}
We need to check once more whether there is any possibility that the datanode
can start without initializing the metrics.
Looking at the code, I think it's not possible, but we just need to make sure.
> Fix DataNode's xceiver count calculation
> ----------------------------------------
>
> Key: HDFS-12288
> URL: https://issues.apache.org/jira/browse/HDFS-12288
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, hdfs
> Reporter: Lukas Majercak
> Assignee: Lukas Majercak
> Attachments: HDFS-12288.001.patch
>
>
> The problem with the ThreadGroup.activeCount() method is that it gives
> only a very rough estimate, and in reality returns the total number of
> threads in the thread group as opposed to the threads actually running.
> In some DNs, we saw it return ~50 for a long time, even though the
> actual number of DataXceiver threads was next to none.
> This is a big issue, as we use the xceiverCount to make decisions on the NN
> when choosing a replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value,
> which only accounts for the actual number of DataXceiver threads currently
> running and thus represents the load on the DN much better.
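
One way the discrepancy described in the issue can show up is sketched below. This is a standalone illustration, not DataNode code: {{ActiveCountSketch}} and {{measure}} are hypothetical names, and the "idle" threads here simply block on a latch. It compares {{ThreadGroup.activeCount()}} with an explicit counter that is only incremented while a thread is doing work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not actual DataNode code.
final class ActiveCountSketch {

    // Starts idleThreads threads that block before doing any work, then returns
    // { ThreadGroup.activeCount() estimate, explicit "working" counter }.
    static int[] measure(int idleThreads) {
        ThreadGroup group = new ThreadGroup("sketch-group");
        CountDownLatch started = new CountDownLatch(idleThreads);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger working = new AtomicInteger();
        Thread[] threads = new Thread[idleThreads];

        for (int i = 0; i < idleThreads; i++) {
            threads[i] = new Thread(group, () -> {
                started.countDown();
                try {
                    release.await(); // alive, but not doing any work yet
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                working.incrementAndGet(); // explicit accounting around the work
                try {
                    // real work would happen here
                } finally {
                    working.decrementAndGet();
                }
            });
            threads[i].setDaemon(true);
            threads[i].start();
        }

        try {
            started.await(); // every thread is now live and blocked
            int estimate = group.activeCount();
            int explicit = working.get();
            release.countDown();
            for (Thread t : threads) {
                t.join();
            }
            return new int[] { estimate, explicit };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
    }

    public static void main(String[] args) {
        int[] result = measure(4);
        System.out.println("activeCount estimate: " + result[0]);
        System.out.println("explicit working counter: " + result[1]);
    }
}
```

The thread-group number includes every live thread in the group, whether or not it is doing work, while the explicit counter reflects only threads inside the work section. An explicitly maintained metric is therefore the more direct measure of load, provided every thread type that should count is included in it.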