Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/23569 )
Change subject: IMPALA-14507: Register column-level privilege requests for INSERT ...................................................................... Patch Set 18: (4 comments) I replied to Csaba's comments. Will try to prepare patch set 19 based on the discussion. This patch becomes more involved than I first expected, especially the Preconditions check as discussed at https://gerrit.cloudera.org/c/23569/14/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java#168. http://gerrit.cloudera.org:8080/#/c/23569/17/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java: http://gerrit.cloudera.org:8080/#/c/23569/17/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@212 PS17, Line 212: // The full privilege check for the database will be done as part of the INSERT : // analysis. : FeDb db = analyzer.getDb(createStmt_.getDb(), Privilege.ANY); > I am not sure if this is correct - if the user does not have any privilege on > a db name, then they shouldn't be able to learn whether it exists or not - or > the authorization was already done before this? With "FeDb db = analyzer.getDb(createStmt_.getDb(), /* throwIfDoesNotExist */ false)", for a query like "create table test_db_10.alltypestiny_id as select id from functional.alltypestiny" submitted by a user, e.g., non_owner, that does not have any privilege, the following error message will be returned, when the target table 'test_db_10.alltypestiny_id' does not exist. AuthorizationException: User 'non_owner' does not have privileges to execute 'CREATE' on: test_db_10 Without this change (patch set 16), the error message was the following. AuthorizationException: User 'non_owner' does not have privileges to execute 'CREATE' on: test_db_10 The error messages are the same before and after this line of change, and in this case whether the target table exists or not is not revealed. This is because Impala's authorization checker processes the CREATE privilege on the table 'test_db_10.alltypestiny_id' first during authorization. When the requesting user does not have the necessary (CREATE) privilege on this (non-existing) table, i.e., hasAccess() returns false, the branch of "privilege == Privilege.CREATE" in BaseAuthorizationChecker#checkAccess() would be exercised. private void checkAccess(AuthorizationContext authzCtx, User user, PrivilegeRequest privilegeRequest) throws AuthorizationException, InternalException { Preconditions.checkNotNull(privilegeRequest); if (hasAccess(authzCtx, user, privilegeRequest)) return; ... } else if (privilege == Privilege.CREATE && privilegeRequest.getAuthorizable().getType() == Type.TABLE) { // Creating a table requires CREATE on a database and we shouldn't // expose the table name. throw new AuthorizationException(String.format( "User '%s' does not have privileges%s to execute '%s' on: %s", user.getName(), grantOption(privilegeRequest.hasGrantOption()), privilege, privilegeRequest.getAuthorizable().getDbName())); ... } On the other hand, getDb(String dbName, Privilege privilege) in https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java also does the following. addAccessEvent(new TAccessEvent(dbName, TCatalogObjectType.DATABASE, privilege.toString())); So with this line of change, one fewer access event will be logged. I guess/think this is not what we want. Therefore, I am more inclined to do without this line of change in patch set 17. However, without this change, we will have to think about what privilege request to register in getDb(String dbName, Privilege privilege). Note that with patch set 16, we would register the following. privReq = {PrivilegeRequest@9473} authorizable_ = {AuthorizableColumn@9474} columnName_ = "*" dbName_ = "test_db_10" tableName_ = "*" ownerUser_ = "admin" columns_ = {ArrayList@9475} size = 0 privilege_ = {Privilege@7676} "ANY" This privilege could be thought of as a wildcard privilege request representing any column of any table under the database 'test_db_10', which clearly contradicts the following code comment in authorize(AuthorizationContext authzCtx, AnalysisResult analysisResult, FeCatalog catalog) of https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java because we do not register any respective wildcard privilege request for the table 'test_db_10.*'. // Authorize statements that may produce several hierarchical privilege requests. // Such a statement always has a corresponding table-level privilege request if it // has column-level privilege request. We could not catch this with the following Preconditions check as discussed at https://gerrit.cloudera.org/c/23569/14/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java#168. Preconditions.checkState((requests.isEmpty() || !(privReq.getAuthorizable().getType() == Authorizable.Type.COLUMN)) || (requests.get(0).getAuthorizable().getType() == Authorizable.Type.TABLE && requests.get(0).getPrivilege() == Privilege.SELECT)); In https://gerrit.cloudera.org/c/23569/18/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java, I proposed a revised Preconditions check, and changed the registered privilege request to an ANY request on the database 'test_db_10'. Patch set 18 failed some existing authorization-related tests, but I will take a closer look to verify if there is some security-related issue about the proposed changes in patch set 18. http://gerrit.cloudera.org:8080/#/c/23569/18/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/23569/18/tests/authorization/test_ranger.py@1521 PS18, Line 1521: unique_table = unique_name + "_tbl" > Is it really needed to use a random name? Can't we just use unique_database Yeah. I think we could do as you suggested. http://gerrit.cloudera.org:8080/#/c/23569/18/tests/authorization/test_ranger.py@1539 PS18, Line 1539: admin_client.execute("drop database if exists {0} cascade" : .format(unique_database)) > This shouldn't be needed as the name is unique I guess we always assume that when we are running an end-to-end test that would create a 'unique_database', there is not already a database with the same name. In this case, we could remove this in the next patch set. http://gerrit.cloudera.org:8080/#/c/23569/18/tests/authorization/test_ranger.py@1598 PS18, Line 1598: TestRanger._remove_policies(unique_database, tbl, "*") > Shouldn't we do this in the previous finally block, if the policy was creat Yes we could move this to the previous finally block. I will address this in the next patch set. -- To view, visit http://gerrit.cloudera.org:8080/23569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2ef61801d3b394c56702b193c250492a62b111df Gerrit-Change-Number: 23569 Gerrit-PatchSet: 18 Gerrit-Owner: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Mon, 12 Jan 2026 22:53:57 +0000 Gerrit-HasComments: Yes
