[
https://issues.apache.org/jira/browse/HDFS-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741632#comment-14741632
]
Joep Rottinghuis commented on HDFS-8898:
----------------------------------------
So it sounds like we're discussing two things here:
1) Getting the quota itself for a directory that a user has access to. There
seem to be few security concerns with this.
2) Getting the quota, and the "ContentSummary" / count / usage for a directory
that a user has access to, even if they might not have access to all the
sub-directories. This is where [~jlowe] pointed out that there could be a
potential security implication.
Even with yielding the NN lock, it seems the NN can still hold the lock for ~1
sec per 10M files in a sub-directory while it checks the entire sub-directory
tree for permissions.
To address the potential security implications of 2), we could either make this
a cluster-wide (final) config value, or we could use an extended attribute on
the directory itself to allow or disallow traversal of that particular
directory.
1) would give a huge performance boost for the cases when people just want to
know what the quota is.
2) would give a huge performance boost for the cases when people want to know a
quota plus what's left for large directories relatively high in the directory
structure (let alone / on a huge namespace of many tens of millions of files).
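To make the difference between 1) and 2) concrete, here is a rough sketch of
a toy in-memory directory tree. It is not NameNode code: the QuotaSketch and
Dir classes, the "readable" flag, and the quota and file-count values are all
made up for illustration. The point is only that reading the quota touches
one node, while a recursive count has to visit every sub-directory and fails
as soon as it hits one the caller cannot read.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory model of a directory tree; hypothetical, not HDFS code. */
final class QuotaSketch {

    /** Hypothetical directory node: a quota setting, a file count, children. */
    static final class Dir {
        final String name;
        final long quota;        // quota set on this directory, -1 if none
        final int files;         // files directly inside this directory
        final boolean readable;  // whether the caller may read this directory
        final List<Dir> children = new ArrayList<>();

        Dir(String name, long quota, int files, boolean readable) {
            this.name = name;
            this.quota = quota;
            this.files = files;
            this.readable = readable;
        }

        /** Adds a child and returns this node so calls can be chained. */
        Dir add(Dir child) {
            children.add(child);
            return this;
        }
    }

    /** Case 1: read only the quota setting; touches a single node. */
    static long quotaOf(Dir dir) {
        if (!dir.readable) {
            throw new SecurityException("Permission denied: " + dir.name);
        }
        return dir.quota;
    }

    /** Case 2: recursive count; visits every sub-directory in the tree. */
    static long countFiles(Dir dir) {
        if (!dir.readable) {
            throw new SecurityException("Permission denied: " + dir.name);
        }
        long total = dir.files;
        for (Dir child : dir.children) {
            total += countFiles(child);
        }
        return total;
    }

    public static void main(String[] args) {
        // A home directory hosting another user's unreadable sub-directory.
        Dir home = new Dir("/user/joep", 1000, 3, true)
            .add(new Dir("/user/joep/shared", -1, 5, true)
                .add(new Dir("/user/joep/shared/private", -1, 2, false)));

        System.out.println("quota = " + quotaOf(home));
        try {
            System.out.println("files = " + countFiles(home));
        } catch (SecurityException e) {
            System.out.println("count failed: " + e.getMessage());
        }
    }
}
```

In this toy model the quota lookup on /user/joep still succeeds after the
unreadable sub-directory appears, while the recursive count is rejected.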
> Create API and command-line argument to get quota without need to get file
> and directory counts
> -----------------------------------------------------------------------------------------------
>
> Key: HDFS-8898
> URL: https://issues.apache.org/jira/browse/HDFS-8898
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: fs
> Reporter: Joep Rottinghuis
>
> On large directory structures it takes significant time to iterate through
> the file and directory counts recursively to get a complete ContentSummary.
> When you want to just check for the quota on a higher level directory it
> would be good to have an option to skip the file and directory counts.
> Moreover, currently you can only check the quota if you have access to all
> the directories underneath. For example, if I have a large home directory
> under /user/joep and I host some files for another user in a sub-directory,
> the moment they create an unreadable sub-directory under my home I can no
> longer check what my quota is. Understood that I cannot check the current
> file counts unless I can iterate through all the usage, but for
> administrative purposes it is nice to be able to get the current quota
> setting on a directory without the need to iterate through and run into
> permission issues on sub-directories.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)