[ 
https://issues.apache.org/jira/browse/SENTRY-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152048#comment-16152048
 ] 

Alexander Kolbasov commented on SENTRY-1916:
--------------------------------------------

[~hahao] [[email protected]] How does Sentry handle HDFS sync on locations 
outside of the hive prefix specified in 
{{sentry.hdfs.integration.path.prefixes}}?

In older code, {{SentryPlugin}} was initialized with prefixes from 
{{sentry.hdfs.integration.path.prefixes}}. These were passed to 
{{UpdateForwarder}} which ignored all paths outside the prefixes:

{code}
  void addPathsToAuthzObject(String authzObj,
      List<List<String>> authzObjPathElements, boolean createNew) {
    Set<Entry> entries = authzObjToPath.get(authzObj);
    if (entries != null) {
      Set<Entry> newEntries = new HashSet<Entry>(authzObjPathElements.size());
      for (List<String> pathElements : authzObjPathElements) {
        Entry e = root.createAuthzObjPath(pathElements, authzObj);
        if (e != null) {
          newEntries.add(e);
        } else {
          LOG.info("Path outside prefix"); // Here
        }
      }
      entries.addAll(newEntries);
    } else {
      if (createNew) {
        addAuthzObject(authzObj, authzObjPathElements);
      } else {
        LOG.warn("Path was not added to AuthzObject, could not find key in "
                + "authzObjToPath. authzObj = " + authzObj
                + " authzObjPathElements=" + authzObjPathElements);
      }
    }
  }
{code}
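
For illustration, the effect of the {{e != null}} check can be reduced to a standalone sketch. This is not Sentry code: the class {{PrefixFilterSketch}} and its methods are hypothetical, and it assumes that a path counts as inside a prefix when its leading path elements equal the prefix elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of prefix filtering; not the Sentry implementation. */
public class PrefixFilterSketch {

  /** Splits "/user/hive/warehouse" into [user, hive, warehouse]. */
  public static List<String> split(String path) {
    List<String> elements = new ArrayList<>();
    for (String part : path.split("/")) {
      if (!part.isEmpty()) {
        elements.add(part);
      }
    }
    return elements;
  }

  /** True if the leading elements of pathElements equal some prefix. */
  public static boolean isUnderPrefix(List<String> pathElements,
      List<List<String>> prefixes) {
    for (List<String> prefix : prefixes) {
      if (pathElements.size() >= prefix.size()
          && pathElements.subList(0, prefix.size()).equals(prefix)) {
        return true;
      }
    }
    return false;
  }

  /** Keeps only the paths under a configured prefix; the rest are dropped. */
  public static List<String> filterPaths(List<String> paths,
      List<String> prefixes) {
    List<List<String>> prefixElements = new ArrayList<>();
    for (String prefix : prefixes) {
      prefixElements.add(split(prefix));
    }
    List<String> kept = new ArrayList<>();
    for (String path : paths) {
      if (isUnderPrefix(split(path), prefixElements)) {
        kept.add(path);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> paths = Arrays.asList(
        "/user/hive/warehouse/db1.db/t1",
        "/data/external/t2",      // external table outside the prefix
        "/user/hive2/t3");        // shares characters, not path elements
    List<String> prefixes = Arrays.asList("/user/hive/warehouse");
    System.out.println(filterPaths(paths, prefixes));
  }
}
```

Comparing whole path elements rather than raw strings keeps {{/user/hive2/t3}} from matching a {{/user/hive}} prefix. Under this assumption, a partition located outside the prefix would simply never reach the path tree.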

So how did this work for any tables which contained partitions outside the 
prefix but for which some access permissions were specified?

> Sentry should not store paths outside of the prefix
> ---------------------------------------------------
>
>                 Key: SENTRY-1916
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1916
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.0.0
>            Reporter: Alexander Kolbasov
>            Assignee: Alexander Kolbasov
>         Attachments: SENTRY-1916.01.patch
>
>
> Before Sentry 2.0 we only sent paths inside the Hive prefix to 
> HDFS. With Sentry HA we changed that and now store all paths. This significantly 
> increases memory usage when there are many external tables.
> [~vamsee] [~spena] [[email protected]] [~hahao] [[email protected]] 
> [~mcrocker] FYI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
