[
https://issues.apache.org/jira/browse/SENTRY-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152048#comment-16152048
]
Alexander Kolbasov commented on SENTRY-1916:
--------------------------------------------
[~hahao] How does Sentry handle HDFS sync on locations
outside of the Hive prefix specified in
{{sentry.hdfs.integration.path.prefixes}}?
In older code, {{SentryPlugin}} was initialized with prefixes from
{{sentry.hdfs.integration.path.prefixes}}. These were passed to
{{UpdateForwarder}}, which ignored all paths outside the prefixes:
{code}
void addPathsToAuthzObject(String authzObj,
    List<List<String>> authzObjPathElements, boolean createNew) {
  Set<Entry> entries = authzObjToPath.get(authzObj);
  if (entries != null) {
    Set<Entry> newEntries = new HashSet<Entry>(authzObjPathElements.size());
    for (List<String> pathElements : authzObjPathElements) {
      Entry e = root.createAuthzObjPath(pathElements, authzObj);
      if (e != null) {
        newEntries.add(e);
      } else {
        LOG.info("Path outside prefix"); // Here
      }
    }
    entries.addAll(newEntries);
  } else {
    if (createNew) {
      addAuthzObject(authzObj, authzObjPathElements);
    } else {
      LOG.warn("Path was not added to AuthzObject, could not find key in "
          + "authzObjToPath. authzObj = " + authzObj
          + " authzObjPathElements=" + authzObjPathElements);
    }
  }
}
{code}
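The filtering above comes down to a prefix check on path elements: a path is kept only if its leading elements match one of the configured prefixes. A minimal sketch of that idea follows, assuming an element-wise match. The class {{PrefixFilterSketch}} and its methods are hypothetical and are not the actual Sentry code; the sample paths are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Sentry.
public class PrefixFilterSketch {
    // Each configured prefix, stored as a list of path elements.
    private final List<List<String>> prefixes = new ArrayList<>();

    public PrefixFilterSketch(String... prefixPaths) {
        for (String p : prefixPaths) {
            prefixes.add(split(p));
        }
    }

    // Splits "/a/b/c" into [a, b, c], dropping empty elements.
    public static List<String> split(String path) {
        List<String> out = new ArrayList<>();
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    // True if the leading elements of the path equal some configured prefix.
    public boolean isUnderPrefix(List<String> pathElements) {
        for (List<String> prefix : prefixes) {
            if (pathElements.size() >= prefix.size()
                    && pathElements.subList(0, prefix.size()).equals(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Keeps only paths under a prefix; the rest are dropped, which is the
    // case that corresponds to the "Path outside prefix" log line.
    public List<List<String>> keepPathsUnderPrefixes(List<List<String>> paths) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> p : paths) {
            if (isUnderPrefix(p)) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        PrefixFilterSketch filter = new PrefixFilterSketch("/user/hive/warehouse");
        List<List<String>> kept = filter.keepPathsUnderPrefixes(Arrays.asList(
                split("/user/hive/warehouse/db1.db/t1"),
                split("/data/external/t1/part=1")));
        System.out.println(kept); // prints [[user, hive, warehouse, db1.db, t1]]
    }
}
```

Because the match is per element rather than per character, a sibling directory such as {{/user/hive/warehouse2}} would not count as being under {{/user/hive/warehouse}} in this sketch.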
So how did this work for any tables which contained partitions outside the
prefix but for which some access permissions were specified?
> Sentry should not store paths outside of the prefix
> ---------------------------------------------------
>
> Key: SENTRY-1916
> URL: https://issues.apache.org/jira/browse/SENTRY-1916
> Project: Sentry
> Issue Type: Bug
> Components: Sentry
> Affects Versions: 2.0.0
> Reporter: Alexander Kolbasov
> Assignee: Alexander Kolbasov
> Attachments: SENTRY-1916.01.patch
>
>
> Before Sentry 2.0, we sent only paths inside the Hive prefix to HDFS.
> With Sentry HA we changed that and now store all paths. This significantly
> increases memory usage when there are many external tables.
> [~vamsee] [~spena] [~hahao] [~mcrocker] FYI
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)