[
https://issues.apache.org/jira/browse/HDFS-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127412#comment-17127412
]
hemanthboyina commented on HDFS-15372:
--------------------------------------
Thanks for the good analysis, [~sodonnell].
{quote}With the 001 patch in place, if you try to list
/data/.snapshot/snapshot_1, the path seen by the attribute provider is:
/user/snapshot_1
Before, it was:
/user/.snapshot/snapshot1
When checking a path like /data/.snapshot/snap1 the provider will see
/data/snap1, but on the branch-2, it would have seen /data/.snapshot/snap1.
{quote}
Was the path seen by the attribute provider the same on branch-2 and on trunk?
This was a bit confusing. Can you put it all in one comment, with an example
for a snapshot path?
If we list a path, the path is resolved to inodes by INodesInPath, and the
components of those same inodes are used by the provider, right? INodesInPath
also handles the .snapshot part of a path.
While creating a snapshot, we add the inode directory as the root of the snapshot:
{code:java}
// DirectorySnapshottableFeature#addSnapshot (excerpt; remaining parameters and body elided)
public Snapshot addSnapshot(INodeDirectory snapshotRoot, int id, String name,
    ...
  final Snapshot s = new Snapshot(id, name, snapshotRoot);
{code}
While getting the INodesInPath for a file in a snapshot, we use the root of the
snapshot to get the file. IMO that means that if the file has an ACL, the file
under the snapshot root should have the ACL too:
{code:java}
if (isDotSnapshotDir(childName) && dir.isSnapshottable()) {
  // ...
  final Snapshot s = dir.getSnapshot(components[count + 1]);
  if (s == null) {
    curNode = null; // snapshot not found
  } else {
    curNode = s.getRoot();
    snapshotId = s.getId();
  }
  // ...
}
{code}
Please correct me if I am missing something here.
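To make the question concrete, here is a toy model of the resolution step described above. It is only a sketch: the class SnapshotRootSketch, its Node type, and the methods resolve and buildExample are hypothetical and are not HDFS code. It also assumes the snapshot root is simply the directory node itself, which glosses over how snapshot diffs work. It shows that a node reached through the snapshot root carries whatever ACL is stored on the inode. Attributes supplied by an external provider are a separate matter, since they depend on the path components handed to the provider.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; these classes are hypothetical and are not HDFS code. */
class SnapshotRootSketch {
  static class Node {
    final String name;
    final String acl; // ACL stored on the node itself, not provider-supplied
    final Map<String, Node> children = new HashMap<>();
    final Map<String, Node> snapshots = new HashMap<>(); // snapshot name -> root

    Node(String name, String acl) {
      this.name = name;
      this.acl = acl;
    }
  }

  /** Walks the path; on ".snapshot" it jumps to the named snapshot's root. */
  static Node resolve(Node root, String path) {
    Node cur = root;
    String[] parts = path.split("/");
    for (int i = 0; i < parts.length && cur != null; i++) {
      if (parts[i].isEmpty()) {
        continue; // the leading "/" yields an empty first component
      }
      if (".snapshot".equals(parts[i]) && i + 1 < parts.length) {
        cur = cur.snapshots.get(parts[++i]); // null when no such snapshot
      } else {
        cur = cur.children.get(parts[i]);
      }
    }
    return cur;
  }

  /** Builds /data/tab1 plus a snapshot "snap1" of /data. */
  static Node buildExample() {
    Node root = new Node("", "user::rwx");
    Node data = new Node("data", "user::rwx");
    Node tab1 = new Node("tab1", "group:flume:rwx");
    root.children.put("data", data);
    data.children.put("tab1", tab1);
    // Simplification: the snapshot root is the directory node itself.
    data.snapshots.put("snap1", data);
    return root;
  }

  public static void main(String[] args) {
    Node root = buildExample();
    Node live = resolve(root, "/data/tab1");
    Node snap = resolve(root, "/data/.snapshot/snap1/tab1");
    System.out.println("live ACL:     " + live.acl);
    System.out.println("snapshot ACL: " + snap.acl);
  }
}
```

In this toy both lookups land on the same node, so the stored ACL is identical for the live path and the snapshot path.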
> Files in snapshots no longer see attribute provider permissions
> ---------------------------------------------------------------
>
> Key: HDFS-15372
> URL: https://issues.apache.org/jira/browse/HDFS-15372
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: HDFS-15372.001.patch
>
>
> Given a cluster with an authorization provider configured (e.g. Sentry) where
> the paths covered by the provider are snapshottable, there was a change in
> behaviour in how the provider permissions and ACLs are applied to files in
> snapshots between the 2.x branch and Hadoop 3.0.
> E.g., suppose we have the snapshottable path /data, which is Sentry managed.
> The ACLs below are provided by Sentry:
> {code}
> hadoop fs -getfacl -R /data
> # file: /data
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::---
> group:flume:rwx
> user:hive:rwx
> group:hive:rwx
> group:testgroup:rwx
> mask::rwx
> other::--x
> /data/tab1
> {code}
> After taking a snapshot, the files in the snapshot do not see the provider
> permissions:
> {code}
> hadoop fs -getfacl -R /data/.snapshot
> # file: /data/.snapshot
> # owner:
> # group:
> user::rwx
> group::rwx
> other::rwx
> # file: /data/.snapshot/snap1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> # file: /data/.snapshot/snap1/tab1
> # owner: hive
> # group: hive
> user::rwx
> group::rwx
> other::--x
> {code}
> However, before Hadoop 3.0 (in which the attribute provider code was
> extensively refactored), snapshots did get the provider permissions.
> The reason is this code in FSDirectory.java which ultimately calls the
> attribute provider and passes the path we want permissions for:
> {code}
> INodeAttributes getAttributes(INodesInPath iip)
> throws IOException {
> INode node = FSDirectory.resolveLastINode(iip);
> int snapshot = iip.getPathSnapshotId();
> INodeAttributes nodeAttrs = node.getSnapshotINode(snapshot);
> UserGroupInformation ugi = NameNode.getRemoteUser();
> INodeAttributeProvider ap = this.getUserFilteredAttributeProvider(ugi);
> if (ap != null) {
> // permission checking sends the full components array including the
> // first empty component for the root. however file status
> // related calls are expected to strip out the root component according
> // to TestINodeAttributeProvider.
> byte[][] components = iip.getPathComponents();
> components = Arrays.copyOfRange(components, 1, components.length);
> nodeAttrs = ap.getAttributes(components, nodeAttrs);
> }
> return nodeAttrs;
> }
> {code}
> The line:
> {code}
> INode node = FSDirectory.resolveLastINode(iip);
> {code}
> Picks the last resolved inode. If you then call node.getPathComponents
> for a path like '/data/.snapshot/snap1/tab1', it returns /data/tab1. It
> resolves the snapshot path to its original location, but it's still the
> snapshot inode.
> However, the logic passes 'iip.getPathComponents', which returns
> "/data/.snapshot/snap1/tab1", to the provider.
> The pre-Hadoop 3.0 code passes the inode directly to the provider, and hence
> it only ever sees the path as "/data/tab1".
> It is debatable which path should be passed to the provider in the case of
> snapshots: /data/.snapshot/snap1/tab1 or /data/tab1. However, as
> the behaviour has changed, I feel we should ensure the old behaviour is
> retained.
> It would also be fairly easy to provide a config switch so the provider gets
> the full snapshot path or the resolved path.
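As a rough sketch of the config switch mentioned at the end of the description, the following shows how the components handed to the provider could be chosen. Everything here is an assumption for illustration: the class ProviderPathSwitchSketch, the method componentsForProvider, and the passSnapshotPath flag are hypothetical names, plain strings stand in for the byte[][] components, and the mapping to the resolved path is just one possible choice, not what the 001 patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; none of these names exist in HDFS. */
class ProviderPathSwitchSketch {
  static final String DOT_SNAPSHOT = ".snapshot";

  /**
   * Chooses the components to hand to an attribute provider.
   *
   * @param full components including the leading empty root component
   * @param passSnapshotPath hypothetical switch: true keeps ".snapshot/<name>",
   *        false maps the snapshot path back to its original location
   */
  static String[] componentsForProvider(String[] full, boolean passSnapshotPath) {
    // Strip the leading root component, as the quoted getAttributes does.
    String[] stripped = Arrays.copyOfRange(full, 1, full.length);
    if (passSnapshotPath) {
      return stripped;
    }
    List<String> out = new ArrayList<>();
    for (int i = 0; i < stripped.length; i++) {
      if (DOT_SNAPSHOT.equals(stripped[i])) {
        i++; // also skip the snapshot name that follows ".snapshot"
        continue;
      }
      out.add(stripped[i]);
    }
    return out.toArray(new String[0]);
  }

  public static void main(String[] args) {
    String[] full = {"", "data", ".snapshot", "snap1", "tab1"};
    System.out.println(Arrays.toString(componentsForProvider(full, true)));
    System.out.println(Arrays.toString(componentsForProvider(full, false)));
  }
}
```

With the flag set, the provider would see data/.snapshot/snap1/tab1. With it unset, it would see data/tab1, which corresponds to the pre-3.0 behaviour described above.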
--
This message was sent by Atlassian Jira
(v8.3.4#803005)