[
https://issues.apache.org/jira/browse/IMPALA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang updated IMPALA-11871:
------------------------------------
Fix Version/s: Impala 4.5.0
> INSERT statement does not respect Ranger policies for HDFS
> ----------------------------------------------------------
>
> Key: IMPALA-11871
> URL: https://issues.apache.org/jira/browse/IMPALA-11871
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Fang-Yu Rao
> Assignee: Fang-Yu Rao
> Priority: Major
> Fix For: Impala 4.5.0
>
>
> In a cluster with Ranger authorization enabled (and in legacy catalog mode),
> even if the Ranger policy cm_hdfs -> all-path grants RWX to the user impala,
> inserting into a table whose HDFS POSIX permissions deny the user impala
> write access fails with:
> {noformat}
> "AnalysisException: Unable to INSERT into target table (default.t1) because
> Impala does not have WRITE access to HDFS location:
> hdfs://nightly-71x-vx-2.nightly-71x-vx.root.hwx.site:8020/warehouse/tablespace/external/hive/t1"{noformat}
>
> {noformat}
> [root@nightly-71x-vx-3 ~]# hdfs dfs -getfacl /warehouse/tablespace/external/hive/t1
> # file: /warehouse/tablespace/external/hive/t1
> # owner: hive
> # group: supergroup
> user::rwx
> user:impala:rwx #effective:r-x
> group::rwx #effective:r-x
> mask::r-x
> other::---
> default:user::rwx
> default:user:impala:rwx
> default:group::rwx
> default:mask::rwx
> default:other::--- {noformat}
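
For reference, the "#effective:r-x" annotations in the getfacl output above follow from the POSIX ACL rule that named-user and group entries are filtered by the mask entry. A minimal Python sketch of that rule, assuming three-character permission strings; the function name effective_perms is illustrative and is not part of HDFS or Impala:

```python
def effective_perms(entry: str, mask: str) -> str:
    """Return the effective permissions of an ACL entry filtered by the mask.

    Both arguments are three-character strings such as "rwx" or "r-x".
    A permission is effective only if it is set in both the entry and the mask.
    """
    return "".join(e if e == m and e != "-" else "-" for e, m in zip(entry, mask))


# Values taken from the getfacl output above.
impala_entry = "rwx"  # user:impala:rwx
mask = "r-x"          # mask::r-x

print(effective_perms(impala_entry, mask))         # r-x
print("w" in effective_perms(impala_entry, mask))  # False
```

Under this rule the named-user entry for impala carries no effective write permission, which would be consistent with the missing WRITE access reported in the exception.
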
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ANALYSIS
> Stack trace from a version of Cloudera's distribution of Impala (impalad
> version 3.4.0-SNAPSHOT RELEASE (build
> db20b59a093c17ea4699117155d58fe874f7d68f)):
> {noformat}
> at org.apache.impala.catalog.FeFsTable$Utils.checkWriteAccess(FeFsTable.java:585)
> at org.apache.impala.analysis.InsertStmt.analyzeWriteAccess(InsertStmt.java:545)
> at org.apache.impala.analysis.InsertStmt.analyze(InsertStmt.java:391)
> at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:463)
> at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:426)
> at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1570)
> at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1536)
> at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1506)
> at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:155){noformat}
> The exception is thrown at analysis time, before any write is attempted, so
> I tested writing directly into that directory, and the write succeeded.
> {noformat}
> [root@nightly-71x-vx-3 ~]# hdfs dfs -touchz /warehouse/tablespace/external/hive/t1/test
> [root@nightly-71x-vx-3 ~]# hdfs dfs -ls /warehouse/tablespace/external/hive/t1/
> Found 8 items
> rw-rw---+ 3 hive supergroup 417 2023-01-27 17:37 /warehouse/tablespace/external/hive/t1/000000_0
> rw-rw---+ 3 hive supergroup 417 2023-01-27 17:44 /warehouse/tablespace/external/hive/t1/000000_0_copy_1
> rw-rw---+ 3 hive supergroup 417 2023-01-27 17:49 /warehouse/tablespace/external/hive/t1/000000_0_copy_2
> rw-rw---+ 3 hive supergroup 417 2023-01-27 17:53 /warehouse/tablespace/external/hive/t1/000000_0_copy_3
> rw-rw---+ 3 impala hive 355 2023-01-27 17:17 /warehouse/tablespace/external/hive/t1/4c4477c12c51ad96-3126b52d00000000_2029811630_data.0.parq
> rw-rw---+ 3 impala hive 355 2023-01-27 17:39 /warehouse/tablespace/external/hive/t1/9945b25bb37d1ff2-473c147800000000_574471191_data.0.parq
> drwxrwx---+ - impala hive 0 2023-01-27 17:39 /warehouse/tablespace/external/hive/t1/_impala_insert_staging
> rw-rw---+ 3 impala supergroup 0 2023-01-27 18:01 /warehouse/tablespace/external/hive/t1/test{noformat}
> Reviewing the code [1], I traced the {{TAccessLevel}} back to catalogd. If I
> add the user impala to the group supergroup on the catalogd host, the query
> gets past this access check.
> Additionally, the query does not fail during analysis when catalog v2 is
> enabled, because the method {{getFirstLocationWithoutWriteAccess()}} is not
> implemented there yet and always returns null [2].
> [1] [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L494-L504]
> [2] [https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java#L295-L298]
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Ideally, when Ranger authorization is in place, we should:
> 1) Not check access level during analysis
> 2) Incorporate Ranger ACLs during analysis
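
The two options listed above can be sketched in Python as follows. This is only an illustration of the proposed decision logic, not Impala's implementation: the function first_unwritable_location and the callbacks posix_allows_write and ranger_allows_write are hypothetical names, and the sketch assumes a write is permitted when either Ranger or the POSIX permissions allow it.

```python
from typing import Callable, Iterable, Optional


def first_unwritable_location(
    locations: Iterable[str],
    posix_allows_write: Callable[[str], bool],
    ranger_enabled: bool = False,
    ranger_allows_write: Optional[Callable[[str], bool]] = None,
) -> Optional[str]:
    """Return the first location considered unwritable, or None if there is none.

    All names here are hypothetical and exist only for this illustration.
    """
    for loc in locations:
        if posix_allows_write(loc):
            continue
        if ranger_enabled:
            # Option 1: no Ranger callback supplied, so skip the check entirely.
            if ranger_allows_write is None:
                continue
            # Option 2: let a Ranger policy override the POSIX denial.
            if ranger_allows_write(loc):
                continue
        return loc
    return None


t1 = "/warehouse/tablespace/external/hive/t1"
deny = lambda path: False

# Without Ranger, the POSIX denial is reported at analysis time.
print(first_unwritable_location([t1], deny))
# With Ranger enabled and no policy lookup (option 1), nothing is reported.
print(first_unwritable_location([t1], deny, ranger_enabled=True))
```
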