[
https://issues.apache.org/jira/browse/HDFS-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212548#comment-14212548
]
Chris Nauroth commented on HDFS-7384:
-------------------------------------
Hi, [~vinayrpet]. The current behavior of {{getAclStatus}} is an intentional
design choice, but the history behind that choice is a bit convoluted. Let me
see if I can reconstruct it here.
It starts with HADOOP-10220, which added an ACL indicator bit to
{{FsPermission}}. This was provided as an optimization so that clients could
quickly identify if a file has an ACL, without needing an additional RPC.
Later, objections were raised against the ACL bit in HDFS-5923 and HDFS-5932.
We made a decision to roll back the HADOOP-10220 changes, and instead require
callers to use {{getAclStatus}} to identify the presence of an ACL. Prior to
this, early implementations of {{getAclStatus}} would always return a non-empty
list. For an inode with no ACL, it would return the "minimal ACL" containing
the 3 entries that correspond to basic POSIX permissions. However, at this
point, it became helpful to change {{getAclStatus}} so that it would return an
empty list if there is no ACL. This was seen as easier for clients than
checking whether the entries amount to just the minimal ACL. It was also seen as a cleaner
logical separation, since the client likely already has the {{FsPermission}}
prior to calling {{getAclStatus}}, and therefore it would not be helpful to
return redundant ACL entries.
Finally, HDFS-6326 identified that our implementation choice was
backwards-incompatible for webhdfs, and generally a performance bottleneck for
shell users. To solve this, we reinstated the ACL bit, in a slightly different
implementation, but the behavior of {{getAclStatus}} remained the same.
You've definitely identified a weakness in the current API design, and I raised
similar objections at the time. It's a trade-off. I think there is good
logical separation right now, but as a side effect, it does mean that callers
may need some extra client-side logic to piece all of the information together,
such as if someone wanted to write a custom GUI consuming WebHDFS to display
ACL information.
At this point, we can't change the behavior of {{getAclStatus}} on the 2.x line
for compatibility reasons. Suppose a 2.6.0 deployment of the shell called
{{getAclStatus}} on a 2.7.0 NameNode that had been changed to return the
complete ACL. This would cause {{getfacl}} to display duplicate entries,
because the 2.6.0 logic of {{GetfaclCommand}} and
{{AclUtil#getAclFromPermAndEntries}} will combine the output of
{{getAclStatus}} with the {{FsPermission}}, resulting in 3 duplicate entries.
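To make the duplicate-entry problem concrete, here is a rough, self-contained sketch of the kind of client-side merge described above. It is not the Hadoop code: the class {{AclMergeSketch}}, its methods, and the string representation of entries are all hypothetical stand-ins, and default-scope ACL entries are ignored. It only assumes that the permission bits supply the owner, mask (or group) and other entries, while {{getAclStatus}} supplies everything else.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical, simplified stand-in for the client-side merge; not Hadoop code.
final class AclMergeSketch {

    // Combine the three permission triplets with the entries from getAclStatus.
    // Assumption: when extended entries exist, the group bits carry the mask.
    static List<String> merge(String userBits, String groupBits,
                              String otherBits, List<String> entries) {
        List<String> acl = new ArrayList<>();
        acl.add("user::" + userBits);          // owner entry from permission bits
        acl.addAll(entries);                   // entries returned by the server
        acl.add((entries.isEmpty() ? "group::" : "mask::") + groupBits);
        acl.add("other::" + otherBits);        // other entry from permission bits
        return acl;
    }

    // Number of entries that repeat an entry already present in the list.
    static int countDuplicates(List<String> acl) {
        return acl.size() - new HashSet<>(acl).size();
    }

    public static void main(String[] args) {
        // Current behavior: the server returns only the extended entries.
        List<String> current = Arrays.asList("user:bruce:rw-", "group::r-x");
        List<String> merged = merge("rwx", "rw-", "r--", current);
        System.out.println(merged);

        // Hypothetical changed server: it already returns the complete ACL,
        // but an old client still merges in the permission bits.
        List<String> doubled = merge("rwx", "rw-", "r--", merged);
        System.out.println(doubled + " duplicates=" + countDuplicates(doubled));
    }
}
```

In this sketch, merging the current-style response yields five distinct entries, while merging an already complete response repeats the owner, mask and other entries, which corresponds to the 3 duplicates mentioned above.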
Where does that leave us for this jira? I can see the following options:
# Resolve as won't fix, based on the above rationale.
# Target 3.0 for a backwards-incompatible change.
# Add a new RPC, named {{getFullAcl}} or similar, with the behavior that you
proposed. However, I'd prefer not to increase the API footprint unless there
is a really strong use case.
Hope this helps. Let me know your thoughts. Thanks!
> 'getfacl' command and 'getAclStatus' output should be in sync
> -------------------------------------------------------------
>
> Key: HDFS-7384
> URL: https://issues.apache.org/jira/browse/HDFS-7384
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Vinayakumar B
> Assignee: Vinayakumar B
>
> The *getfacl* command prints all entries: basic and extended entries, the
> mask entry, and effective permissions.
> The *getAclStatus* FileSystem API, however, returns only the extended ACL
> entries set by the user. It includes neither the mask entry nor the effective
> permissions.
> To benefit clients using the API, it would be better to include the mask
> entry and effective permissions in the returned list of entries.