[
https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127661#comment-17127661
]
hemanthboyina commented on HDFS-15372:
--------------------------------------
Thanks for the very clear explanation, [~sodonnell].
> Files in snapshots no longer see attribute provider permissions
> ---------------------------------------------------------------
>
> Key: HDFS-15372
> URL: https://issues.apache.org/jira/browse/HDFS-15372
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: HDFS-15372.001.patch
>
>
> Given a cluster with an authorization provider configured (e.g. Sentry) where
> the paths covered by the provider are snapshottable, the way the provider
> permissions and ACLs are applied to files in snapshots changed between the
> 2.x branch and Hadoop 3.0.
> For example, suppose the snapshottable path /data is Sentry managed. The ACLs
> below are provided by Sentry:
> {code}
> hadoop fs -getfacl -R /data
> # file: /data
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::---
> group:flume:rwx
> user:hive:rwx
> group:hive:rwx
> group:testgroup:rwx
> mask::rwx
> other::--x
> /data/tab1
> {code}
> After taking a snapshot, the files in the snapshot do not see the provider
> permissions:
> {code}
> hadoop fs -getfacl -R /data/.snapshot
> # file: /data/.snapshot
> # owner:
> # group:
> user::rwx
> group::rwx
> other::rwx
> # file: /data/.snapshot/snap1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/.snapshot/snap1/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> {code}
> However, before Hadoop 3.0 (when the attribute provider code was extensively
> refactored), snapshots did get the provider permissions.
> The reason is this code in FSDirectory.java which ultimately calls the
> attribute provider and passes the path we want permissions for:
> {code}
> INodeAttributes getAttributes(INodesInPath iip)
>     throws IOException {
>   INode node = FSDirectory.resolveLastINode(iip);
>   int snapshot = iip.getPathSnapshotId();
>   INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
>   UserGroupInformation ugi = NameNode.getRemoteUser();
>   INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi);
>   if (ap != null) {
>     // permission checking sends the full components array including the
>     // first empty component for the root. however file status
>     // related calls are expected to strip out the root component according
>     // to TestINodeAttributeProvider.
>     byte[][] components = iip.getPathComponents();
>     components = Arrays.copyOfRange(components, 1, components.length);
>     nodeAttrs = ap.getAttributes(components, nodeAttrs);
>   }
>   return nodeAttrs;
> }
> {code}
> The line:
> {code}
> INode node = FSDirectory.resolveLastINode(iip);
> {code}
> picks the last resolved INode. If you then call node.getPathComponents
> for a path like '/data/.snapshot/snap1/tab1', it returns /data/tab1. It
> resolves the snapshot path to its original location, but it is still the
> snapshot inode.
> However, the logic passes 'iip.getPathComponents', which returns
> "/data/.snapshot/snap1/tab1", to the provider.
> The pre-Hadoop 3.0 code passes the inode directly to the provider, and hence
> the provider only ever sees the path as "/data/tab1".
> It is debatable which path should be passed to the provider in the case of
> snapshots: /data/.snapshot/snap1/tab1 or /data/tab1. However, as the
> behaviour has changed, I feel we should ensure the old behaviour is
> retained.
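
To make the difference between the two paths concrete, here is a minimal standalone sketch of the mapping from a snapshot path to its original location. It is an illustration only, not the FSDirectory code or the attached patch; the class and method names (SnapshotPathSketch, toResolvedPath) are hypothetical, and it assumes nothing beyond ".snapshot" being the reserved snapshot directory name.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: not the HDFS implementation or the attached patch. */
final class SnapshotPathSketch {
  // HDFS exposes snapshots under this reserved directory name.
  static final String DOT_SNAPSHOT = ".snapshot";

  /**
   * Drops the ".snapshot" component and the snapshot name that follows it,
   * so "/data/.snapshot/snap1/tab1" becomes "/data/tab1".
   * Paths without a ".snapshot" component are returned unchanged.
   */
  static String toResolvedPath(String path) {
    String[] parts = path.split("/");
    List<String> kept = new ArrayList<>();
    for (int i = 0; i < parts.length; i++) {
      if (parts[i].isEmpty()) {
        continue; // leading empty component produced by the root "/"
      }
      if (DOT_SNAPSHOT.equals(parts[i])) {
        i++; // also skip the snapshot name, if there is one
        continue;
      }
      kept.add(parts[i]);
    }
    return "/" + String.join("/", kept);
  }

  public static void main(String[] args) {
    // The path iip.getPathComponents describes for a file in a snapshot.
    System.out.println(toResolvedPath("/data/.snapshot/snap1/tab1")); // /data/tab1
    // A live path is left as it is.
    System.out.println(toResolvedPath("/data/tab1")); // /data/tab1
  }
}
```

A provider that only has rules for /data/tab1 would match the second form but not the first, which is consistent with the missing ACLs in the getfacl output above.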
> It would also be fairly easy to provide a config switch so the provider gets
> the full snapshot path or the resolved path.
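
A rough sketch of what such a switch could look like, assuming the choice is reduced to a single boolean. The flag name passSnapshotPath and the class ProviderPathSwitchSketch are hypothetical; no existing Hadoop configuration key is implied, and String components stand in for the byte[][] components used in FSDirectory.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: a hypothetical switch, not an existing Hadoop setting. */
final class ProviderPathSwitchSketch {
  static final String DOT_SNAPSHOT = ".snapshot";

  /**
   * Returns the path components to hand to the attribute provider.
   * With passSnapshotPath set, the snapshot path is passed through as is;
   * otherwise ".snapshot" and the snapshot name are removed so the provider
   * sees the original location.
   */
  static String[] componentsForProvider(String[] components,
                                        boolean passSnapshotPath) {
    if (passSnapshotPath) {
      return components.clone();
    }
    List<String> resolved = new ArrayList<>();
    for (int i = 0; i < components.length; i++) {
      if (DOT_SNAPSHOT.equals(components[i])) {
        i++; // skip the snapshot name that follows ".snapshot"
        continue;
      }
      resolved.add(components[i]);
    }
    return resolved.toArray(new String[0]);
  }

  public static void main(String[] args) {
    // Root component already stripped, as in getAttributes above.
    String[] snapshotPath = {"data", ".snapshot", "snap1", "tab1"};
    System.out.println(String.join("/", componentsForProvider(snapshotPath, true)));
    System.out.println(String.join("/", componentsForProvider(snapshotPath, false)));
  }
}
```

Defaulting the flag to the resolved path would keep the pre-3.0 behaviour, while providers that want to treat snapshots differently could opt in to the full path.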