[ 
https://issues.apache.org/jira/browse/HDFS-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212548#comment-14212548
 ] 

Chris Nauroth commented on HDFS-7384:
-------------------------------------

Hi, [~vinayrpet].  The current behavior of {{getAclStatus}} is an intentional 
design choice, but the history behind that choice is a bit convoluted.  Let me 
see if I can reconstruct it here.

It starts with HADOOP-10220, which added an ACL indicator bit to 
{{FsPermission}}.  This was provided as an optimization so that clients could 
quickly identify if a file has an ACL, without needing an additional RPC.

Later, objections were raised against the ACL bit in HDFS-5923 and HDFS-5932.  
We made a decision to roll back the HADOOP-10220 changes, and instead require 
callers to use {{getAclStatus}} to identify the presence of an ACL.  Prior to 
this, early implementations of {{getAclStatus}} would always return a non-empty 
list.  For an inode with no ACL, it would return the "minimal ACL" containing 
the 3 entries that correspond to basic POSIX permissions.  However, at this 
point, it became helpful to change {{getAclStatus}} so that it would return an 
empty list if there is no ACL.  This was seen as easier for clients than 
inspecting the entries to detect a minimal ACL.  It was also seen as a cleaner 
logical separation, since the client likely already has the {{FsPermission}} 
prior to calling {{getAclStatus}}, and therefore it would not be helpful to 
return redundant ACL entries.

Finally, HDFS-6326 identified that our implementation choice was 
backwards-incompatible for webhdfs, and generally a performance bottleneck for 
shell users.  To solve this, we reinstated the ACL bit, in a slightly different 
implementation, but the behavior of {{getAclStatus}} remained the same.

You've definitely identified a weakness in the current API design, and I raised 
similar objections at the time.  It's a trade-off.  I think there is good 
logical separation right now, but as a side effect, callers may need some 
extra client-side logic to piece all of the information together.  One example 
would be a custom GUI that consumes WebHDFS to display ACL information.

At this point, we can't change the behavior of {{getAclStatus}} on the 2.x line 
for compatibility reasons.  Suppose a 2.6.0 deployment of the shell called 
{{getAclStatus}} on a 2.7.0 NameNode, and that NameNode had been changed to 
return the complete ACL.  The 2.6.0 logic of {{GetfaclCommand}} and 
{{AclUtil#getAclFromPermAndEntries}} combines the output of {{getAclStatus}} 
with the {{FsPermission}}, so {{getfacl}} would display 3 duplicate entries: 
the ones the shell derives from the permission bits would already be present 
in the server's response.
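
To make the duplication concrete, here is a minimal, self-contained sketch.  It 
is not the actual Hadoop code.  ACL entries are modeled as plain strings, and 
the class {{AclMergeSketch}} with its {{merge}} and {{countDuplicates}} methods 
are hypothetical names.  The merge order (owner entry from the permission bits, 
then the server's entries, then a mask or group entry from the group bits, then 
the other entry) is my simplified assumption about how an older client combines 
the two sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class AclMergeSketch {

    // Hypothetical stand-in for the client-side merge: the client assumes the
    // server returned only extended entries and adds the three entries implied
    // by the permission bits itself.
    static List<String> merge(String userBits, String groupBits, String otherBits,
                              List<String> serverEntries) {
        List<String> acl = new ArrayList<>();
        // Owner entry implied by the owner permission bits.
        acl.add("user::" + userBits);
        // Whatever the server returned from its ACL lookup.
        acl.addAll(serverEntries);
        // With an ACL present, the group bits are treated as the mask;
        // without one, they are the plain group entry.
        acl.add((serverEntries.isEmpty() ? "group::" : "mask::") + groupBits);
        // Other entry implied by the other permission bits.
        acl.add("other::" + otherBits);
        return acl;
    }

    // Number of entries that appear more than once in the merged list.
    static int countDuplicates(List<String> acl) {
        return acl.size() - new HashSet<>(acl).size();
    }

    public static void main(String[] args) {
        // Current behavior: the server returns only the extended entries.
        List<String> extendedOnly = Arrays.asList("user:bruce:rwx", "group::r-x");
        // Assumed changed behavior: the server returns the complete ACL.
        List<String> complete = Arrays.asList(
            "user::rwx", "user:bruce:rwx", "group::r-x", "mask::rwx", "other::r--");

        System.out.println(countDuplicates(merge("rwx", "rwx", "r--", extendedOnly)));
        System.out.println(countDuplicates(merge("rwx", "rwx", "r--", complete)));
    }
}
```

Under these assumptions, the first call yields no duplicates, while the second 
yields 3: {{user::rwx}}, {{mask::rwx}} and {{other::r--}} each appear twice.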

Where does that leave us for this jira?  I can see the following options:
# Resolve as won't fix, based on the above rationale.
# Target 3.0 for a backwards-incompatible change.
# Add a new RPC, named {{getFullAcl}} or similar, with the behavior that you 
proposed.  However, I'd prefer not to increase the API footprint unless there 
is a really strong use case.

Hope this helps.  Let me know your thoughts.  Thanks!

> 'getfacl' command and 'getAclStatus' output should be in sync
> -------------------------------------------------------------
>
>                 Key: HDFS-7384
>                 URL: https://issues.apache.org/jira/browse/HDFS-7384
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>
> The *getfacl* command prints all entries, including basic and extended 
> entries, the mask entry, and effective permissions.
> However, the *getAclStatus* FileSystem API returns only the extended ACL 
> entries set by the user.  It does not include the mask entry or effective 
> permissions.
> To benefit clients using the API, it would be better to include the 'mask' 
> entry and effective permissions in the returned list of entries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
