[ 
https://issues.apache.org/jira/browse/SENTRY-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16256443#comment-16256443
 ] 

Na Li commented on SENTRY-1964:
-------------------------------

The problem is finding the authObj from a path.

For example, with the code change that stops sending partitions to HDFS, the 
INode hierarchy of the table path "tmp/external/tables/ext2_before" is 
initially as shown below. The authObj of the partition 
"tmp/external/tables/ext2_before/i=1" or 
"tmp/external/tables/ext2_before/i=1/stuff.txt" can use its parent's 
authObj, which is default.ext2, so the Sentry ACL can be found using the authObj.

1) path name=tmp,  type=DIR,  authObj=null
2) path name=external,  type=PREFIX,  authObj=null
3) path name=tables,  type=DIR,  authObj=null
4) path name=ext2_before,  type=AUTHZ_OBJECT,  authObj=default.ext2

After "alter table ext2 set location 
'hdfs:///tmp/external/tables/ext2_after'", the INode hierarchy becomes:
1) path name=tmp,  type=DIR,  authObj=null
2) path name=external,  type=PREFIX,  authObj=null
3) path name=tables,  type=DIR,  authObj=null
4) path name=ext2_after,  type=AUTHZ_OBJECT,  authObj=default.ext2

When finding the authObj for the partition 
"tmp/external/tables/ext2_before/i=1" or 
"tmp/external/tables/ext2_before/i=1/stuff.txt", the nearest entry left in 
the hierarchy is "tables", and its authObj is null. So SentryINodeAttributes 
won't be used, and no Sentry ACL is applied.
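
The lookup behaviour described above can be sketched with a simplified path tree. This is an illustration only: the class PathTreeSketch and its methods are hypothetical and are not Sentry's actual classes. It assumes the lookup returns the authObj of the deepest node that matches the requested path.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the path tree; not Sentry's real implementation.
class PathTreeSketch {
    static final class Node {
        final Map<String, Node> children = new HashMap<>();
        String authObj; // null unless this node represents an AUTHZ_OBJECT entry
    }

    private final Node root = new Node();

    // Register a table location and the authObj it maps to.
    void addAuthzObject(String path, String authObj) {
        Node cur = root;
        for (String part : path.split("/")) {
            cur = cur.children.computeIfAbsent(part, k -> new Node());
        }
        cur.authObj = authObj;
    }

    // Drop the last path element (and everything below it) from the tree.
    void removeAuthzObject(String path) {
        String[] parts = path.split("/");
        Node cur = root;
        for (int i = 0; i < parts.length - 1 && cur != null; i++) {
            cur = cur.children.get(parts[i]);
        }
        if (cur != null) {
            cur.children.remove(parts[parts.length - 1]);
        }
    }

    // Walk the path as far as the tree goes and return the authObj of the
    // deepest matched node, or null if that node has none.
    String findAuthzObject(String path) {
        Node cur = root;
        for (String part : path.split("/")) {
            Node next = cur.children.get(part);
            if (next == null) {
                break; // e.g. a partition directory that was never sent to HDFS
            }
            cur = next;
        }
        return cur.authObj;
    }
}
```

In this sketch, after adding "tmp/external/tables/ext2_before" with authObj default.ext2, a lookup of "tmp/external/tables/ext2_before/i=1" stops at ext2_before and resolves to default.ext2. Once the table location is moved to ext2_after, the same lookup stops at "tables" and returns null.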

In public List<AclEntry> getAclEntries(String[] pathElements) in 
SentryAuthorizationInfo, the authObjs are first looked up from the path at 
"{color:red}Set<String> authzObjs = 
authzPaths.findAuthzObject(pathElements);{color}", and the authObjs are then 
used to find the ACLs. If authzObjs is null, no Sentry ACLs are added and only 
the "group::---" entry is returned.

{code}
  public List<AclEntry> getAclEntries(String[] pathElements) {
    lock.readLock().lock();
    try {
      Set<String> authzObjs = authzPaths.findAuthzObject(pathElements);
      // Apparently setFAcl throws error if 'group::---' is not present
      AclEntry noGroup = AclEntry.parseAclEntry("group::---", true);

      Set<AclEntry> retSet = new HashSet<>();
      retSet.add(noGroup);

      if (authzObjs == null) {
        retSet.addAll(Collections.<AclEntry>emptyList());
        return new ArrayList<>(retSet);
      }

      // No duplicate acls should be added.
      for (String authzObj: authzObjs) {
        retSet.addAll(authzPermissions.getAcls(authzObj));
      }

      return new ArrayList<>(retSet);
    } finally {
      lock.readLock().unlock();
    }
  }
{code}



> HDFS sync does not need partition locations (usually)
> -----------------------------------------------------
>
>                 Key: SENTRY-1964
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1964
>             Project: Sentry
>          Issue Type: Improvement
>          Components: Sentry
>    Affects Versions: 2.0.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Critical
>         Attachments: SENTRY-1964.001.patch, SENTRY-1964.001.patch, 
> SENTRY-1964.002.patch
>
>
> Right now, Sentry saves partition info from HMS and sends it to HDFS. HDFS 
> only needs database and table info, and does not need partition info for ACLs 
> unless the partition location does not share the same prefix as its table.
> The amount of partition data is huge and causes performance issues. We can 
> optimize this by not saving and not sending partition info if the partition 
> shares the same path as its table. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
