Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/20221 )
Change subject: IMPALA-12291 impala checks hdfs ranger policy ...................................................................... Patch Set 9: (2 comments) Thanks for the reply Halim! Other than changing how the catalog server initializes the access level of an HdfsTable during table loading, it may also make sense to disable analyzeWriteAccess() at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java#L406-L408 depending on the value of the newly introduced startup flag. I do not have a strong objection against adding a new startup flag in this patch. But maybe Quanlong and Aman could chime in to see if there is a better alternative since we have already got a lot of flags. http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@852 PS5, Line 852: location = location.getParent(); > Your test code will be very helpful. Thank you Fang-Yu. Thanks Halim! Not a problem. Adding a startup flag like 'hdfs_permission_check' or 'skip_fs_permissions_check_in_analysis' to give the Impala administrator a choice is reasonable and it does not have to depend on whether Ranger is enabled. It would be good that we mention the purpose of not performing the file system permissions check somewhere (maybe in the commit message). This prevents users from encountering an AnalysisException during query analysis when the target table or any partition inside is not writable to the Impala service according to Impala's FsPermissionChecker() even though the Impala service user is granted the READ and WRITE privileges on the respective file system paths via the Ranger policy repository of the corresponding storage. If we add such a startup flag, e.g., 'skip_fs_permissions_check_in_analysis', then we could still add end-to-end tests to https://github.com/apache/impala/blob/master/tests/authorization/test_ranger.py like the following to make sure file system permissions check is still performed if the Impala administrator does not want to skip it. @pytest.mark.execute_serially @SkipIfFS.hdfs_acls @CustomClusterTestSuite.with_args( impalad_args="{0} {1}".format(CATALOGD_ARGS, "--skip_fs_permissions_check_in_analysis=false"), catalogd_args="{0} {1}".format(CATALOGD_ARGS, "--skip_fs_permissions_check_in_analysis=false")) def test_insert_with_catalog_v1_not_skip_fs_permissions_check(self, unique_name): self._test_insert_with_catalog_v1(unique_name, False) def _test_insert_with_catalog_v1(self, unique_name, skip_fs_permissions_check=True): """ Test that when Ranger is the authorization provider in the legacy catalog mode, Impala skips or performs the file system permissions checking in query analysis depending on the startup flag 'skip_fs_permissions_check_in_analysis'. """ user = getuser() admin_client = self.create_impala_client() unique_database = unique_name + "_db" unique_table = unique_name + "_tbl" table_path = "test-warehouse/{0}.db/{1}".format(unique_database, unique_table) try: admin_client.execute("drop database if exists {0} cascade" .format(unique_database), user=ADMIN) admin_client.execute("create database {0}".format(unique_database), user=ADMIN) admin_client.execute("create table {0}.{1} (x int)" .format(unique_database, unique_table), user=ADMIN) admin_client.execute("grant insert on table {0}.{1} to user {2}" .format(unique_database, unique_table, user)) # Change the owner user and group of the HDFS path corresponding to the table # so that according to Impala's FsPermissionChecker, the table could not be # writable to the user that loads the table. This user usually is the one # representing the Impala service. self.hdfs_client.chown(table_path, "another_user", "another_group") # Invalidate the table metadata to force the catalog server to reload the HDFS # table. admin_client.execute("invalidate metadata {0}.{1}" .format(unique_database, unique_table), user=ADMIN) # Verify that Impala skips or performs the permissions checking in # HdfsTable#getAvailableAccessLevel() depending on the startup flag # 'skip_fs_permissions_check_in_analysis'. query = "insert into {0}.{1} values (1)".format(unique_database, unique_table) if skip_fs_permissions_check: self._run_query_as_user(query, user, True) else: result = self._run_query_as_user(query, user, False) err = "Unable to INSERT into target table" assert err in str(result) finally: admin_client.execute("revoke insert on table {0}.{1} from user {2}" .format(unique_database, unique_table, user)) admin_client.execute("drop database if exists {0} cascade" .format(unique_database), user=ADMIN) http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/service/Frontend.java@2109 PS5, Line 2109: LOG.warn("Analysis Exception query {}: {}", : queryCtx.client_request.stmt, errorMsg); > In retrospect, I think I found it difficult to know which query is in troub Ack -- To view, visit http://gerrit.cloudera.org:8080/20221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id33c400fbe0c918b6b65d713b09009512835a4c9 Gerrit-Change-Number: 20221 Gerrit-PatchSet: 9 Gerrit-Owner: Halim Kim <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Halim Kim <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Wed, 03 Apr 2024 23:31:53 +0000 Gerrit-HasComments: Yes
