Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20221 )

Change subject: IMPALA-12291 impala checks hdfs ranger policy
......................................................................


Patch Set 9:

(2 comments)

Thanks for the reply Halim!

Other than changing how the catalog server initializes the access level of an 
HdfsTable during table loading, it may also make sense to disable 
analyzeWriteAccess() at
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java#L406-L408
 depending on the value of the newly introduced startup flag.

I do not have a strong objection against adding a new startup flag in this 
patch. But maybe Quanlong and Aman could chime in to see if there is a better 
alternative since we have already got a lot of flags.

http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@852
PS5, Line 852:         location = location.getParent();
> Your test code will be very helpful. Thank you Fang-Yu.
Thanks Halim! Not a problem.

Adding a startup flag like 'hdfs_permission_check' or 
'skip_fs_permissions_check_in_analysis' to give the Impala administrator a 
choice is reasonable and it does not have to depend on whether Ranger is 
enabled.

It would be good that we mention the purpose of not performing the file system 
permissions check somewhere (maybe in the commit message).

    This prevents users from encountering an AnalysisException during query
    analysis when the target table or any partition inside is not writable
    to the Impala service according to Impala's FsPermissionChecker() even
    though the Impala service user is granted the READ and WRITE privileges
    on the respective file system paths via the Ranger policy repository of the
    corresponding storage.

If we add such a startup flag, e.g., 'skip_fs_permissions_check_in_analysis', 
then we could still add end-to-end tests to 
https://github.com/apache/impala/blob/master/tests/authorization/test_ranger.py 
like the following to make sure file system permissions check is still 
performed if the Impala administrator does not want to skip it.

  @pytest.mark.execute_serially
  @SkipIfFS.hdfs_acls
  @CustomClusterTestSuite.with_args(
    impalad_args="{0} {1}".format(CATALOGD_ARGS,
                                  
"--skip_fs_permissions_check_in_analysis=false"),
    catalogd_args="{0} {1}".format(CATALOGD_ARGS,
                                   
"--skip_fs_permissions_check_in_analysis=false"))
  def test_insert_with_catalog_v1_not_skip_fs_permissions_check(self, 
unique_name):
    self._test_insert_with_catalog_v1(unique_name, False)

  def _test_insert_with_catalog_v1(self, unique_name, 
skip_fs_permissions_check=True):
    """
    Test that when Ranger is the authorization provider in the legacy catalog 
mode,
    Impala skips or performs the file system permissions checking in query 
analysis
    depending on the startup flag 'skip_fs_permissions_check_in_analysis'.
    """
    user = getuser()
    admin_client = self.create_impala_client()
    unique_database = unique_name + "_db"
    unique_table = unique_name + "_tbl"
    table_path = "test-warehouse/{0}.db/{1}".format(unique_database, 
unique_table)
    try:
      admin_client.execute("drop database if exists {0} cascade"
           .format(unique_database), user=ADMIN)
      admin_client.execute("create database {0}".format(unique_database), 
user=ADMIN)
      admin_client.execute("create table {0}.{1} (x int)"
          .format(unique_database, unique_table), user=ADMIN)
      admin_client.execute("grant insert on table {0}.{1} to user {2}"
           .format(unique_database, unique_table, user))

      # Change the owner user and group of the HDFS path corresponding to the 
table
      # so that according to Impala's FsPermissionChecker, the table could not 
be
      # writable to the user that loads the table. This user usually is the one
      # representing the Impala service.
      self.hdfs_client.chown(table_path, "another_user", "another_group")

      # Invalidate the table metadata to force the catalog server to reload the 
HDFS
      # table.
      admin_client.execute("invalidate metadata {0}.{1}"
           .format(unique_database, unique_table), user=ADMIN)

      # Verify that Impala skips or performs the permissions checking in
      # HdfsTable#getAvailableAccessLevel() depending on the startup flag
      # 'skip_fs_permissions_check_in_analysis'.
      query = "insert into {0}.{1} values (1)".format(unique_database, 
unique_table)
      if skip_fs_permissions_check:
        self._run_query_as_user(query, user, True)
      else:
        result = self._run_query_as_user(query, user, False)
        err = "Unable to INSERT into target table"
        assert err in str(result)
    finally:
      admin_client.execute("revoke insert on table {0}.{1} from user {2}"
          .format(unique_database, unique_table, user))
      admin_client.execute("drop database if exists {0} cascade"
          .format(unique_database), user=ADMIN)


http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/20221/5/fe/src/main/java/org/apache/impala/service/Frontend.java@2109
PS5, Line 2109:           LOG.warn("Analysis Exception query {}: {}",
              :               queryCtx.client_request.stmt, errorMsg);
> In retrospect, I think I found it difficult to know which query is in troub
Ack



--
To view, visit http://gerrit.cloudera.org:8080/20221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id33c400fbe0c918b6b65d713b09009512835a4c9
Gerrit-Change-Number: 20221
Gerrit-PatchSet: 9
Gerrit-Owner: Halim Kim <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Halim Kim <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Wed, 03 Apr 2024 23:31:53 +0000
Gerrit-HasComments: Yes

Reply via email to