[
https://issues.apache.org/jira/browse/SENTRY-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Kolbasov updated SENTRY-1779:
---------------------------------------
Affects Version/s: 1.8.0
> HDFS full snapshot should limit to a set of path prefixes
> ----------------------------------------------------------
>
> Key: SENTRY-1779
> URL: https://issues.apache.org/jira/browse/SENTRY-1779
> Project: Sentry
> Issue Type: Improvement
> Components: Hdfs Plugin
> Affects Versions: 1.5.1, 1.8.0, sentry-ha-redesign
> Reporter: Vamsee Yarlagadda
>
> Currently, when the cluster starts up, HDFS requests a full snapshot from
> Sentry, and Sentry returns a complete list of all privileges and permissions
> to the HDFS plugin. Upon receiving the data, the plugin filters the content
> down to the subset that matches its path prefixes. This happens on every
> HDFS service restart and whenever the snapshot expires (every 24 hours).
> Each time, Sentry does the heavy lifting of loading all of the metadata into
> memory to send the full snapshot, even though HDFS might not care about most
> of the data. Sentry's memory requirement spikes while this happens, and
> because the metadata can grow very large over time, Sentry could run out of
> memory (OOM).
> A better option would be for the plugin to request a full snapshot for a
> list of prefixes. Sentry would then query the database for permissions,
> filtering by the supplied paths. This would reduce Sentry's memory usage and
> also the amount of data transferred to HDFS.
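
The selection rule proposed in the description can be sketched as below. This is only an illustration, not Sentry code: the names `under_prefix` and `snapshot_for_prefixes`, the `(path, permissions)` row shape, and the sample paths are all hypothetical. In Sentry itself the filter would presumably be pushed into the database query rather than applied to rows in memory; the sketch only shows which paths a prefix-limited snapshot would keep.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical row shape: (hdfs_path, list_of_permission_strings).
Row = Tuple[str, List[str]]


def normalize(path: str) -> str:
    """Strip trailing slashes so '/a/b/' and '/a/b' compare equal; root stays '/'."""
    stripped = path.rstrip("/")
    return stripped or "/"


def under_prefix(path: str, prefix: str) -> bool:
    """True if path equals prefix or lies beneath it, matching whole components.

    Matching on whole components keeps '/user/hive/warehouse2' from being
    treated as part of '/user/hive/warehouse'.
    """
    path, prefix = normalize(path), normalize(prefix)
    if prefix == "/":
        return True
    return path == prefix or path.startswith(prefix + "/")


def snapshot_for_prefixes(rows: Iterable[Row], prefixes: Iterable[str]) -> Iterator[Row]:
    """Yield only the rows whose path falls under one of the requested prefixes.

    Rows are consumed lazily, so only the matching rows need to be retained
    by the caller instead of the complete permission set.
    """
    prefix_list = list(prefixes)
    for path, perms in rows:
        if any(under_prefix(path, p) for p in prefix_list):
            yield path, perms
```

With a request for the prefix `/user/hive/warehouse`, a row for `/user/hive/warehouse/db1` would be kept, while rows for `/tmp/scratch` or `/user/hive/warehouse2` would be dropped before anything is sent to HDFS.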
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)