[
https://issues.apache.org/jira/browse/HDFS-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868884#comment-15868884
]
Yongjun Zhang commented on HDFS-11352:
--------------------------------------
Hi [~xkrogen] and [~ajisakaa],
Thanks for your work here.
It looks to me like trunk doesn't need this patch because it already has
HDFS-7501. Because HDFS-7501 was not backported to 2.7.x and 2.6.x, we needed
HDFS-11352 here. Does that sound correct to you?
Thanks.
> Potential deadlock in NN when failing over
> ------------------------------------------
>
> Key: HDFS-11352
> URL: https://issues.apache.org/jira/browse/HDFS-11352
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.7.4, 2.6.6
> Reporter: Erik Krogen
> Assignee: Erik Krogen
> Priority: Critical
> Labels: high-availability
> Fix For: 2.7.4, 2.6.6
>
> Attachments: HDFS-11352-branch-2.7.000.patch
>
>
> HDFS-11180 fixed a general class of deadlock between MetricsSystemImpl and
> FSEditLog that can occur when failing over (see comments on that JIRA for
> more details). In trunk and branch-2/branch-2.8 the fix succeeded by
> making the metrics calls not synchronize on FSEditLog.
> In branch-2.6 and branch-2.7 there is one more method,
> {{FSNamesystem#getTransactionsSinceLastCheckpoint}}, which still requires the
> lock on FSEditLog and thus can result in the same deadlock scenario. This can
> be seen by running {{TestFSNamesystemMBean#testWithFSEditLogLock}} _with the
> patch in HDFS-11290_ on either of these branches (it fails currently).
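
For illustration only, here is a minimal Java sketch of the pattern described above. It is not the Hadoop code: EditLog is a hypothetical stand-in for FSEditLog, and getLastTxIdLocked / getLastTxIdLockFree are made-up names. It shows only one half of the lock cycle, namely that a metrics-style getter which needs the edit log monitor stalls while another thread holds that monitor, whereas a getter that reads an atomic counter does not. A real deadlock would additionally need the lock holder to wait on something the metrics thread owns.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;

public class EditLogLockSketch {

    /** Hypothetical stand-in for FSEditLog; not the real Hadoop class. */
    static class EditLog {
        private final AtomicLong lastTxId = new AtomicLong();

        synchronized void logEdit() {
            lastTxId.incrementAndGet();
        }

        // Needs the EditLog monitor, so it blocks while another thread holds it.
        synchronized long getLastTxIdLocked() {
            return lastTxId.get();
        }

        // Reads the atomic counter without taking the monitor.
        long getLastTxIdLockFree() {
            return lastTxId.get();
        }
    }

    /**
     * Returns true if a metrics-style read finishes within timeoutMs while
     * another thread is holding the EditLog monitor.
     */
    static boolean readCompletesWhileLocked(boolean lockFree, long timeoutMs)
            throws Exception {
        EditLog log = new EditLog();
        log.logEdit();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Simulates a thread that holds the edit log lock for a long time.
            pool.submit(() -> {
                synchronized (log) {
                    lockHeld.countDown();
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            lockHeld.await();

            // Simulates the metrics thread sampling a value from the edit log.
            Future<Long> read = pool.submit(
                () -> lockFree ? log.getLastTxIdLockFree() : log.getLastTxIdLocked());
            try {
                read.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            }
        } finally {
            release.countDown();  // let the lock holder exit so the pool can drain
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("lock-free read completes: "
            + readCompletesWhileLocked(true, 2000));
        System.out.println("locked read completes:    "
            + readCompletesWhileLocked(false, 200));
    }
}
```

The timeouts are arbitrary values chosen for the demo. The point of the sketch is the design choice named in the description: keeping the metrics read path off the FSEditLog lock removes that edge from the lock cycle.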