[
https://issues.apache.org/jira/browse/HDFS-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554199#comment-13554199
]
Todd Lipcon commented on HDFS-4359:
-----------------------------------
Hi Liang Xie. I noticed that this node is not deadlocked on its own; the
thread holding the lock is stuck in a 'versionRequest()' RPC. Any idea why this
RPC is taking so long to hear back from the NN? See the thread "DataNode:
[file:/home/work/data1/hdfs/lgprc-xiaomi/datanode,file:/home/work/data2/hdfs/lgprc-xiaomi/datanode,file:/home/work/data3/hdfs/lgprc-xiaomi/datanode,file:/home/work/data4/hdfs/lgprc-xiaomi/datanode,file:/home/work/data5/hdfs/lgprc-xiaomi/datanode,file:/home/work/data6/hdfs/lgprc-xiaomi/datanode,file:/home/work/data7/hdfs/lgprc-xiaomi/datanode,file:/home/work/data8/hdfs/lgprc-xiaomi/datanode,file:/home/work/data9/hdfs/lgprc-xiaomi/datanode,file:/home/work/data10/hdfs/lgprc-xiaomi/datanode,file:/home/work/data11/hdfs/lgprc-xiaomi/datanode,file:/home/work/data12/hdfs/lgprc-xiaomi/datanode]
heartbeating to /10.2.201.14:11200" daemon prio=10 tid=0x00007fd34c8e4800
nid=0xa2d in Object.wait() [0x00007fd2db3e0000]
> remove an unnecessary synchronized keyword in BPOfferService.java
> -----------------------------------------------------------------
>
> Key: HDFS-4359
> URL: https://issues.apache.org/jira/browse/HDFS-4359
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 3.0.0, 2.0.2-alpha
> Reporter: liang xie
> Assignee: liang xie
> Attachments: dn.jstack, HDFS-4359.txt
>
>
> We encountered a hang on both the NN and the DN; the DN hang was caused by the
> NN not responding to heartbeats. Based on the DN thread dump, I think we can
> make a small improvement to this code:
>   synchronized List<BPServiceActor> getBPServiceActors() {
>     return Lists.newArrayList(bpServices);
>   }
> bpServices is declared as:
>   private List<BPServiceActor> bpServices =
>     new CopyOnWriteArrayList<BPServiceActor>();
> It is indeed a thread-safe List variant, so IMHO we can safely remove the
> synchronized keyword above.
> Here is a quick count from the thread dump:
> xieliang@xieliang:/tmp$ grep 0x00000007b00289f0 dn.jstack |wc -l
> 252
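
The proposal in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: ActorRegistry is a hypothetical stand-in for BPOfferService, String stands in for BPServiceActor, slowCall() stands in for any synchronized method that blocks on an RPC, and the JDK's ArrayList copy constructor is used in place of Guava's Lists.newArrayList so the example has no dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for BPOfferService; String replaces BPServiceActor.
class ActorRegistry {
    // Same declaration style as in the issue: a thread-safe list.
    private final List<String> actors = new CopyOnWriteArrayList<String>();

    // Assumption for illustration: a synchronized method that holds the
    // object's monitor for a long time, e.g. while waiting on a slow RPC.
    synchronized void slowCall(long millis) throws InterruptedException {
        Thread.sleep(millis);
    }

    void add(String actor) {
        actors.add(actor);
    }

    // No 'synchronized' here: the copy constructor calls toArray() on the
    // CopyOnWriteArrayList, which copies its current internal snapshot,
    // so callers do not have to wait for the monitor held by slowCall().
    List<String> getActors() {
        return new ArrayList<String>(actors);
    }
}
```

With this shape, a caller of getActors() gets a private copy of the list as it was at the time of the call, and later add() calls do not affect that copy. Whether the synchronized keyword in the real method also guards other state is something the sketch does not address.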