[jira] [Commented] (HDFS-9661) Deadlock in DN.FsDatasetImpl between moveBlockAcrossStorage and createRbw

Kai Zheng (JIRA) Tue, 19 Jan 2016 01:45:08 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106502#comment-15106502
 ]


Kai Zheng commented on HDFS-9661:
---------------------------------

Good catch and nice report!

The patch resolves this deadlock. I'm not sure whether other similar cases 
exist, or how to prevent such deadlocks cleanly.
I wonder if we could consider a unified locking model here. Operations like 
{{FsDatasetImpl#moveBlockAcrossStorage}} and {{FsDatasetImpl#createRbw}} need 
to choose a volume, which takes the lock on 
{{RoundRobinVolumeChoosingPolicy}}, and then lock {{FsDatasetImpl}} in 
{{volume.getAvailable}}. So to avoid this kind of deadlock, each thread could 
avoid holding any lock on the {{FsDatasetImpl}} object before the operation, 
and acquire the lock on {{VolumeChoosingPolicy}} first during the operation.
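
To make the suggested ordering concrete, here is a minimal sketch. It is not 
the actual Hadoop code: {{LockOrderingSketch}}, {{policyLock}}, 
{{datasetLock}} and {{withOrderedLocks}} are hypothetical stand-ins, and only 
the lock order is modeled. The idea is that every operation takes the 
volume-choosing-policy lock first and the dataset lock second, so no thread 
ever waits for the policy lock while holding the dataset lock.

```java
// Hypothetical stand-in classes; only the lock acquisition order is modeled,
// not the real FsDatasetImpl or RoundRobinVolumeChoosingPolicy logic.
class LockOrderingSketch {
    // Stand-in for the monitor of the volume choosing policy.
    private final Object policyLock = new Object();
    // Stand-in for the monitor of the dataset object.
    private final Object datasetLock = new Object();
    // Shared state guarded by datasetLock.
    private int completed = 0;

    // Every operation goes through here: policy lock first, dataset lock second.
    private void withOrderedLocks(Runnable body) {
        synchronized (policyLock) {        // 1. choose a volume
            synchronized (datasetLock) {   // 2. touch dataset state
                body.run();
            }
        }
    }

    // Simplified stand-ins for the two operations named in the issue.
    void createRbw() {
        withOrderedLocks(() -> completed++);
    }

    void moveBlockAcrossStorage() {
        withOrderedLocks(() -> completed++);
    }

    int completed() {
        synchronized (datasetLock) {
            return completed;
        }
    }

    // Runs both operations concurrently and returns how many calls finished.
    static int runConcurrently(int iterations) {
        LockOrderingSketch dataset = new LockOrderingSketch();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                dataset.createRbw();
            }
        });
        Thread mover = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                dataset.moveBlockAcrossStorage();
            }
        });
        writer.start();
        mover.start();
        try {
            writer.join();
            mover.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return dataset.completed();
    }

    public static void main(String[] args) {
        System.out.println("completed operations: " + runConcurrently(10000));
    }
}
```

Because both threads acquire the two monitors in the same order, neither can 
hold one lock while waiting for the other in the reverse order, which is the 
cycle a deadlock needs. Whether the real code paths can be restructured this 
way is exactly the open question above.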

> Deadlock in DN.FsDatasetImpl between moveBlockAcrossStorage and createRbw
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9661
>                 URL: https://issues.apache.org/jira/browse/HDFS-9661
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.7.0, 2.8.0, 2.7.1, 2.7.2
>            Reporter: ade
>            Assignee: Vinayakumar B
>             Fix For: 2.7.2
>
>         Attachments: HDFS-9661.0.patch, hdfs-9661-jstack.gif.png
>
>
> We found a deadlock in dn.FsDatasetImpl between moveBlockAcrossStorage and 
> createRbw from rpc call: replaceBlock/writeBlock. The dn's jstack result is
> !hdfs-9661-jstack.gif.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
