[ 
https://issues.apache.org/jira/browse/HDFS-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745839#comment-14745839
 ] 

Ming Ma commented on HDFS-8898:
-------------------------------

Can we provide optimizations for specific use cases without sacrificing the 
overall security requirement Jason mentioned above?

* Allow super users to get a simplified version of ContentSummary quickly. To 
add to Joep’s scenarios, we have scenarios that require only quota and 
high-level usage (“quota + namespace file/dir count + disk consumption”), which 
is a subset of what ContentSummary provides (file length and the distinction 
between files and dirs aren’t important). If we provide a ContentSummaryV2 with 
only these fields, the NN can just return the usage data cached on directory 
objects that have a quota set. For regular users, or for directories without a 
quota set, traversal is still required.
* Support other users besides super users. The permission check is already 
skipped if the caller is a super user. If we define something like a power 
user, who has less power than a super user but has read access to all 
directories, we can apply the same optimization to these power users. At a 
minimum, proxy users that can represent everyone should be treated like super 
users for the getContentSummary scenario.
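
To make the two points above concrete, here is a rough sketch of the fast 
path. It is a toy model, not NameNode code: QuotaUsage (standing in for the 
proposed ContentSummaryV2), Node, callerCanReadAll and getQuotaUsage are all 
hypothetical names, and the permission checks that the slow path would need 
are left out.

```java
import java.util.ArrayList;
import java.util.List;

final class QuotaUsageSketch {

    /** Hypothetical "ContentSummaryV2": quota plus high-level usage only. */
    static final class QuotaUsage {
        final long nsQuota, dsQuota, nsUsed, dsUsed;

        QuotaUsage(long nsQuota, long dsQuota, long nsUsed, long dsUsed) {
            this.nsQuota = nsQuota;
            this.dsQuota = dsQuota;
            this.nsUsed = nsUsed;
            this.dsUsed = dsUsed;
        }
    }

    /** Toy inode; an assumption for illustration, not the real HDFS class. */
    static final class Node {
        final long diskBytes;                 // 0 for directories
        final List<Node> children = new ArrayList<>();
        boolean quotaSet;                     // true if a quota is configured
        long nsQuota = -1, dsQuota = -1;      // -1 means "no quota"
        long cachedNsUsed, cachedDsUsed;      // kept current only if quotaSet

        Node(long diskBytes) {
            this.diskBytes = diskBytes;
        }
    }

    /** Counts visited nodes so the cost of each path is visible. */
    static int nodesVisited = 0;

    /** Slow path: walk the subtree; returns {namespace count, disk bytes}. */
    static long[] traverse(Node n) {
        nodesVisited++;
        long ns = 1;
        long ds = n.diskBytes;
        for (Node child : n.children) {
            long[] sub = traverse(child);
            ns += sub[0];
            ds += sub[1];
        }
        return new long[] {ns, ds};
    }

    /**
     * callerCanReadAll stands for "super user, power user, or a proxy user
     * that can represent everyone". Only such callers, and only directories
     * with a quota set, take the cached fast path.
     */
    static QuotaUsage getQuotaUsage(Node dir, boolean callerCanReadAll) {
        if (callerCanReadAll && dir.quotaSet) {
            return new QuotaUsage(dir.nsQuota, dir.dsQuota,
                    dir.cachedNsUsed, dir.cachedDsUsed);
        }
        // Regular users or no quota: traversal is still required.
        // (Per-directory permission checks are omitted in this sketch.)
        long[] usage = traverse(dir);
        return new QuotaUsage(dir.nsQuota, dir.dsQuota, usage[0], usage[1]);
    }

    public static void main(String[] args) {
        Node root = new Node(0);
        Node sub = new Node(0);
        sub.children.add(new Node(100));
        sub.children.add(new Node(200));
        root.children.add(sub);
        root.quotaSet = true;
        root.nsQuota = 1000;
        root.dsQuota = 1L << 30;
        root.cachedNsUsed = 4;    // root + sub + 2 files
        root.cachedDsUsed = 300;

        QuotaUsage fast = getQuotaUsage(root, true);
        System.out.println("fast path visited " + nodesVisited + " nodes, "
                + "nsUsed=" + fast.nsUsed + ", dsUsed=" + fast.dsUsed);

        QuotaUsage slow = getQuotaUsage(root, false);
        System.out.println("slow path visited " + nodesVisited + " nodes, "
                + "nsUsed=" + slow.nsUsed + ", dsUsed=" + slow.dsUsed);
    }
}
```

The point of the sketch is only that the fast path touches no children at 
all, so an unreadable sub-directory cannot get in the way of reading the 
quota and usage of the parent.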

> Create API and command-line argument to get quota without need to get file 
> and directory counts
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8898
>                 URL: https://issues.apache.org/jira/browse/HDFS-8898
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fs
>            Reporter: Joep Rottinghuis
>
> On large directory structures it takes significant time to iterate through 
> the file and directory counts recursively to get a complete ContentSummary.
> When you want to just check for the quota on a higher level directory it 
> would be good to have an option to skip the file and directory counts.
> Moreover, currently one can only check the quota if you have access to all 
> the directories underneath. For example, if I have a large home directory 
> under /user/joep and I host some files for another user in a sub-directory, 
> the moment they create an unreadable sub-directory under my home I can no 
> longer check what my quota is. Understood that I cannot check the current 
> file counts unless I can iterate through all the usage, but for 
> administrative purposes it is nice to be able to get the current quota 
> setting on a directory without the need to iterate through and run into 
> permission issues on sub-directories.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
