[ 
https://issues.apache.org/jira/browse/IMPALA-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-10272:
------------------------------------
    Description: 
[~thundergun] reported that analyzing a LOAD DATA statement fails when 
checking access to the source file, even though a Ranger HDFS policy exists 
that allows the access. Impala only loads the permissions from HDFS and checks 
access by itself. Related code: 
https://github.com/apache/impala/blob/ee4043e1a0940ae5711c68336d1ad522631d0e35/fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java#L195-L206

When Ranger authorization is enabled, this check can be wrong if the HDFS 
permissions are more restrictive than the Ranger policies. According to the 
Ranger documentation: 
[https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=57901344#RangerUserGuide(workinprogress)-HDFSPolicycreation]
{quote}when the NameNode receives a user request, the Ranger Plugin checks for 
policies set through the Ranger Policy Manager. Then, if there are no policies 
authorizing the request, the Ranger plugin checks for permissions set in HDFS.
{quote}

We currently don't have an embedded ranger-hdfs plugin to check this locally. 
As a quick fix, I think that when Ranger authz is enabled, we can check the 
access using {{FileSystem#access(Path path, FsAction mode)}}, which invokes a 
NameNode RPC and therefore respects Ranger-HDFS policies.
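
To illustrate the mismatch, here is a minimal, self-contained sketch of the evaluation order quoted above (Ranger policies first, then HDFS permissions as a fallback). It is only a model: {{LoadDataAccessSketch}}, {{NameNodeModel}}, and all of their methods are hypothetical names invented for this example, not Impala, Hadoop, or Ranger classes, and real Ranger features such as deny policies are left out. It shows why a check that looks only at HDFS permissions can reject a request that the NameNode would accept.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative model only. Every type and method here is hypothetical.
final class LoadDataAccessSketch {
    enum Action { READ, WRITE }

    // Stand-in for a NameNode that has the Ranger HDFS plugin enabled.
    static final class NameNodeModel {
        // Key is "user:path"; value is the set of actions granted.
        private final Map<String, Set<Action>> rangerPolicies = new HashMap<>();
        private final Map<String, Set<Action>> hdfsPermissions = new HashMap<>();

        private static String key(String user, String path) {
            return user + ":" + path;
        }

        void allowByRanger(String user, String path, Action action) {
            rangerPolicies
                .computeIfAbsent(key(user, path), k -> EnumSet.noneOf(Action.class))
                .add(action);
        }

        void allowByHdfs(String user, String path, Action action) {
            hdfsPermissions
                .computeIfAbsent(key(user, path), k -> EnumSet.noneOf(Action.class))
                .add(action);
        }

        // Models a client-side check that only knows the HDFS permissions.
        boolean hdfsPermits(String user, String path, Action action) {
            return hdfsPermissions
                .getOrDefault(key(user, path), EnumSet.noneOf(Action.class))
                .contains(action);
        }

        // Models the server-side decision: a Ranger policy that authorizes the
        // request wins; otherwise fall back to the HDFS permissions.
        boolean access(String user, String path, Action action) {
            boolean rangerPermits = rangerPolicies
                .getOrDefault(key(user, path), EnumSet.noneOf(Action.class))
                .contains(action);
            return rangerPermits || hdfsPermits(user, path, action);
        }
    }

    public static void main(String[] args) {
        NameNodeModel nameNode = new NameNodeModel();
        // Hypothetical user and path: Ranger grants READ, HDFS permissions do not.
        nameNode.allowByRanger("alice", "/staging/data.parquet", Action.READ);

        boolean local = nameNode.hdfsPermits("alice", "/staging/data.parquet", Action.READ);
        boolean remote = nameNode.access("alice", "/staging/data.parquet", Action.READ);
        System.out.println("HDFS-permissions-only check: " + local);
        System.out.println("NameNode-side check: " + remote);
    }
}
```

In this model the HDFS-permissions-only check denies the read while the NameNode-side check allows it, which is the situation reported for LOAD DATA. Delegating the decision to the NameNode through {{FileSystem#access}} would correspond to calling {{access}} here instead of {{hdfsPermits}}.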

  was:
[~thundergun] reported that analyzing a LOAD DATA statement fails when 
checking access to the source file, even though a Ranger HDFS policy exists 
that allows the access. Impala only loads the permissions from HDFS and checks 
access by itself. Related code: 
https://github.com/apache/impala/blob/ee4043e1a0940ae5711c68336d1ad522631d0e35/fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java#L195-L206

When Ranger authorization is enabled, this check can be wrong if the HDFS 
permissions are more restrictive than the Ranger policies. According to the 
Ranger documentation: 
[https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=57901344#RangerUserGuide(workinprogress)-HDFSPolicycreation]
{quote}when the NameNode receives a user request, the Ranger Plugin checks for 
policies set through the Ranger Policy Manager. Then, if there are no policies 
authorizing the request, the Ranger plugin checks for permissions set in HDFS.
{quote}

We currently don't have an embedded ranger-hdfs plugin to check this locally. I 
think we can check the access using {{FileSystem#access(Path path, FsAction 
mode)}} to invoke a NameNode RPC as a quick fix for this.


> LOAD DATA should respect Ranger-HDFS policies
> ---------------------------------------------
>
>                 Key: IMPALA-10272
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10272
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Priority: Critical
>
> [~thundergun] reported that analyzing a LOAD DATA statement fails when 
> checking access to the source file, even though a Ranger HDFS policy exists 
> that allows the access. Impala only loads the permissions from HDFS and 
> checks access by itself. Related code: 
> https://github.com/apache/impala/blob/ee4043e1a0940ae5711c68336d1ad522631d0e35/fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java#L195-L206
> When Ranger authorization is enabled, this check can be wrong if the HDFS 
> permissions are more restrictive than the Ranger policies. According to the 
> Ranger documentation: 
> [https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=57901344#RangerUserGuide(workinprogress)-HDFSPolicycreation]
> {quote}when the NameNode receives a user request, the Ranger Plugin checks 
> for policies set through the Ranger Policy Manager. Then, if there are no 
> policies authorizing the request, the Ranger plugin checks for permissions 
> set in HDFS.
> {quote}
> We currently don't have an embedded ranger-hdfs plugin to check this locally. 
> As a quick fix, I think that when Ranger authz is enabled, we can check the 
> access using {{FileSystem#access(Path path, FsAction mode)}}, which invokes 
> a NameNode RPC and therefore respects Ranger-HDFS policies.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
