[
https://issues.apache.org/jira/browse/HDFS-13136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoyu Yao updated HDFS-13136:
------------------------------
Status: Patch Available (was: Open)
> Avoid taking FSN lock while doing group member lookup for FSD permission check
> ------------------------------------------------------------------------------
>
> Key: HDFS-13136
> URL: https://issues.apache.org/jira/browse/HDFS-13136
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Xiaoyu Yao
> Assignee: Xiaoyu Yao
> Priority: Major
> Attachments: HDFS-13136.001.patch
>
>
> The namenode has an FSN lock and an FSD lock. Most namenode operations need
> to take the FSN lock first and then the FSD lock. The permission check is
> done via FSPermissionChecker at the FSD layer, assuming the FSN lock is taken.
> The FSPermissionChecker constructor invokes callerUgi.getGroups(), which can
> sometimes take seconds. There are external caching schemes such as SSSD and
> an internal caching scheme for group lookup. However, the delay can still
> occur during a cache refresh, which causes severe FSN lock contention and an
> unresponsive namenode.
> Checking the current code, we found that getBlockLocations(..) does the group
> lookup outside the FSN lock, but some methods such as getFileInfo(..) and
> getContentSummary(..) do it while holding the lock.
> This ticket is opened to ensure that the group lookup for the permission
> checker happens outside the FSN lock.
>
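The lock-ordering issue described above can be illustrated with a minimal, self-contained sketch. This is not the HDFS code or the attached patch: the class GroupLookupOutsideLock, the PermissionChecker stand-in, the fsnLock field, and the groupLookup function are all hypothetical names chosen for illustration. The only assumption taken from the description is that the checker's constructor performs the potentially slow group lookup.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

public class GroupLookupOutsideLock {
    // Stand-in for the namesystem (FSN) lock; not the real Hadoop class.
    static final ReentrantReadWriteLock fsnLock = new ReentrantReadWriteLock();

    // Hypothetical checker that resolves the caller's groups in its
    // constructor, mirroring the behaviour described in the ticket.
    static final class PermissionChecker {
        final String user;
        final List<String> groups;
        // Records whether the current thread held the read lock at construction.
        final boolean builtUnderLock;

        PermissionChecker(String user, Function<String, List<String>> groupLookup) {
            this.user = user;
            this.builtUnderLock = fsnLock.getReadHoldCount() > 0;
            this.groups = groupLookup.apply(user); // may block during a cache refresh
        }
    }

    // Problematic order: the slow lookup runs while the lock is held.
    static PermissionChecker lookupInsideLock(
            String user, Function<String, List<String>> groupLookup) {
        fsnLock.readLock().lock();
        try {
            PermissionChecker pc = new PermissionChecker(user, groupLookup);
            // ... permission check and namespace access would follow here ...
            return pc;
        } finally {
            fsnLock.readLock().unlock();
        }
    }

    // Preferred order: build the checker first, then take the lock.
    static PermissionChecker lookupOutsideLock(
            String user, Function<String, List<String>> groupLookup) {
        PermissionChecker pc = new PermissionChecker(user, groupLookup);
        fsnLock.readLock().lock();
        try {
            // ... permission check and namespace access would follow here ...
            return pc;
        } finally {
            fsnLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        Function<String, List<String>> lookup = u -> List.of("staff");
        System.out.println(lookupInsideLock("alice", lookup).builtUnderLock);  // true
        System.out.println(lookupOutsideLock("alice", lookup).builtUnderLock); // false
    }
}
```

In the second variant a slow group lookup delays only the calling thread, so other operations waiting on the lock are not held up by it.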