[ 
https://issues.apache.org/jira/browse/IMPALA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830738#comment-17830738
 ] 

Fang-Yu Rao commented on IMPALA-11871:
--------------------------------------

Hi [~MikaelSmith], my current understanding is that this is not a regression 
from earlier releases. It's more like a feature request for usability.

The method that is performing the permissions checking 
({*}analyzeWriteAccess{*}()) was added in IMPALA-7311. The purpose, I guess, 
was to make sure the Impala service has the necessary write permissions as 
early as possible, i.e., during the query analysis phase (v.s. in the query 
execution phase).

After Impala started supporting Ranger as its authorization provider, ideally, 
a cluster administrator should be able to manage the permissions on HDFS via 
either a) Ranger's policy repository for HDFS, or b) the HDFS Access Control 
Lists (HDFS ACLs). But at the moment, Impala's coordinator unconditionally 
performs the permissions-checking without checking Ranger's policy repository 
of HDFS.

IMPALA-10272 resolved a similar issue for the LOAD DATA statement. We could 
resolve this JIRA using the same approach there, where Impala's frontend calls 
*hadoop.fs.FileSystem.access(Path path, FsAction mode)* to check the actual 
access permissions, which could also reflect the permissions manged via 
Ranger's HDFS policy repository.

> INSERT statement does not respect Ranger policies for HDFS
> ----------------------------------------------------------
>
>                 Key: IMPALA-11871
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11871
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Fang-Yu Rao
>            Assignee: Fang-Yu Rao
>            Priority: Major
>
> In a cluster with Ranger auth (and with legacy catalog mode), even if you 
> provide RWX to cm_hdfs -> all-path for the user impala, inserting into a 
> table whose HDFS POSIX permissions happen to exclude impala access will 
> result in an
> {noformat}
> "AnalysisException: Unable to INSERT into target table (default.t1) because 
> Impala does not have WRITE access to HDFS location: 
> hdfs://nightly-71x-vx-2.nightly-71x-vx.root.hwx.site:8020/warehouse/tablespace/external/hive/t1"{noformat}
>  
> {noformat}
> [root@nightly-71x-vx-3 ~]# hdfs dfs -getfacl 
> /warehouse/tablespace/external/hive/t1
> file: /warehouse/tablespace/external/hive/t1 
> owner: hive 
> group: supergroup
> user::rwx
> user:impala:rwx #effective:r-x
> group::rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:impala:rwx
> default:group::rwx
> default:mask::rwx
> default:other::--- {noformat}
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ANALYSIS
> Stack trace from a version of Cloudera's distribution of Impala (impalad 
> version 3.4.0-SNAPSHOT RELEASE (build 
> {*}db20b59a093c17ea4699117155d58fe874f7d68f{*})):
> {noformat}
> at 
> org.apache.impala.catalog.FeFsTable$Utils.checkWriteAccess(FeFsTable.java:585)
> at 
> org.apache.impala.analysis.InsertStmt.analyzeWriteAccess(InsertStmt.java:545)
> at org.apache.impala.analysis.InsertStmt.analyze(InsertStmt.java:391)
> at 
> org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:463)
> at 
> org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:426)
> at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1570)
> at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1536)
> at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1506)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:155){noformat}
> The exception occurs at analysis time, so I tested and succeeded in writing 
> directly into the said directory.
> {noformat}
> [root@nightly-71x-vx-3 ~]# hdfs dfs -touchz 
> /warehouse/tablespace/external/hive/t1/test
> [root@nightly-71x-vx-3 ~]# hdfs dfs -ls 
> /warehouse/tablespace/external/hive/t1/
> Found 8 items
> rw-rw---+ 3 hive supergroup 417 2023-01-27 17:37 
> /warehouse/tablespace/external/hive/t1/000000_0
> rw-rw---+ 3 hive supergroup 417 2023-01-27 17:44 
> /warehouse/tablespace/external/hive/t1/000000_0_copy_1
> rw-rw---+ 3 hive supergroup 417 2023-01-27 17:49 
> /warehouse/tablespace/external/hive/t1/000000_0_copy_2
> rw-rw---+ 3 hive supergroup 417 2023-01-27 17:53 
> /warehouse/tablespace/external/hive/t1/000000_0_copy_3
> rw-rw---+ 3 impala hive 355 2023-01-27 17:17 
> /warehouse/tablespace/external/hive/t1/4c4477c12c51ad96-3126b52d00000000_2029811630_data.0.parq
> rw-rw---+ 3 impala hive 355 2023-01-27 17:39 
> /warehouse/tablespace/external/hive/t1/9945b25bb37d1ff2-473c147800000000_574471191_data.0.parq
> drwxrwx---+ - impala hive 0 2023-01-27 17:39 
> /warehouse/tablespace/external/hive/t1/_impala_insert_staging
> rw-rw---+ 3 impala supergroup 0 2023-01-27 18:01 
> /warehouse/tablespace/external/hive/t1/test{noformat}
> Reviewing the code[1], I traced the {{TAccessLevel}} to the catalogd. And if 
> I add user impala to group supergroup on the catalogd host, this query will 
> succeed past the authorization.
> Additionally, this query does not trip up during analysis when catalog v2 is 
> enabled because the method {{getFirstLocationWithoutWriteAccess()}} is not 
> implemented there yet and always returns null[2].
> [1] 
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L494-L504]
> [2] 
> [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java#L295-L298]
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Ideally, when Ranger authorization is in place, we should:
> 1) Not check access level during analysis
> 2) Incorporate Ranger ACLs during analysis



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to