[ 
https://issues.apache.org/jira/browse/SENTRY-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated SENTRY-1779:
---------------------------------------
    Affects Version/s: 1.8.0

> HDFS full snapshot should limit to a set of path prefixes 
> ----------------------------------------------------------
>
>                 Key: SENTRY-1779
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1779
>             Project: Sentry
>          Issue Type: Improvement
>          Components: Hdfs Plugin
>    Affects Versions: 1.5.1, 1.8.0, sentry-ha-redesign
>            Reporter: Vamsee Yarlagadda
>
> Currently, when the cluster starts up, HDFS requests a full snapshot from 
> Sentry, and Sentry returns the complete list of all privileges and 
> permissions to the HDFS plugin. Upon receiving the data, the plugin filters 
> the content down to the subset that matches the prefixes. This happens on 
> every HDFS service restart and on every expiry (every 24 hours). During this 
> time, Sentry does the heavy lifting of loading all the metadata into memory 
> to send the full snapshot to HDFS, even though HDFS might not care about 
> most of the data. The memory requirement for Sentry spikes and could hit 
> OOM, given that the metadata can grow huge over time.
> A better option would be for the plugin to request a full snapshot for a 
> list of prefixes. Sentry would then query the database for permissions, 
> filtering by the supplied paths. This would reduce Sentry's memory usage and 
> also the amount of data transferred to HDFS. 
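
The proposal above, filtering by prefix in the database query rather than in the plugin, could be sketched roughly as follows. This is a minimal illustration only, not Sentry's actual code or schema: the table name path_privilege, its columns, and the function names are all hypothetical, and SQLite stands in for whatever backing database Sentry is configured with. The range comparison assumes the default binary string ordering, under which '0' is the character immediately after '/'.

```python
import sqlite3
from typing import Iterable, List, Tuple


def snapshot_for_prefixes(
    conn: sqlite3.Connection, prefixes: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return (path, privilege) rows at or below any of the given prefixes.

    Hypothetical schema: path_privilege(path TEXT, privilege TEXT).
    The filter runs inside the database, so rows outside the prefixes
    are never loaded into the caller's memory.
    """
    clauses = []
    params: List[str] = []
    for prefix in prefixes:
        base = prefix.rstrip("/")
        # Match the prefix itself, or anything in [base + "/", base + "0").
        # Every string that starts with base + "/" falls in that range, so
        # "/a/b" matches "/a/b/c" but not "/a/bc".
        clauses.append("(path = ? OR (path >= ? AND path < ?))")
        params.extend([base or "/", base + "/", base + "0"])
    if not clauses:
        return []
    sql = (
        "SELECT DISTINCT path, privilege FROM path_privilege WHERE "
        + " OR ".join(clauses)
        + " ORDER BY path, privilege"
    )
    return conn.execute(sql, params).fetchall()


def demo() -> List[Tuple[str, str]]:
    # Illustrative data only; the privilege strings are made up.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE path_privilege (path TEXT, privilege TEXT)")
    conn.executemany(
        "INSERT INTO path_privilege VALUES (?, ?)",
        [
            ("/user/hive/warehouse/db1", "group:analysts:r-x"),
            ("/user/hive/warehouse/db1/t1", "group:etl:rwx"),
            ("/user/hive/warehouse2/x", "group:other:r-x"),
            ("/tmp/scratch", "group:all:rwx"),
        ],
    )
    return snapshot_for_prefixes(conn, ["/user/hive/warehouse"])
```

With the sample rows in demo(), only the two paths under /user/hive/warehouse are returned; /user/hive/warehouse2/x is excluded because matching respects path-component boundaries.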



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
