Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23569 )

Change subject: IMPALA-14507: Register column-level privilege requests for 
INSERT
......................................................................


Patch Set 18:

(4 comments)

I replied to Csaba's comments. Will try to prepare patch set 19 based on the 
discussion.

This patch becomes more involved than I first expected, especially the 
Preconditions check as discussed at 
https://gerrit.cloudera.org/c/23569/14/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java#168.

http://gerrit.cloudera.org:8080/#/c/23569/17/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/23569/17/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@212
PS17, Line 212:     // The full privilege check for the database will be done 
as part of the INSERT
              :     // analysis.
              :     FeDb db = analyzer.getDb(createStmt_.getDb(), 
Privilege.ANY);
> I am not sure if this is correct - if the user does not have any privilege on 
> a db name, then they shouldn't be able to learn whether it exists or not - or 
> the authorization was already done before this?

With "FeDb db = analyzer.getDb(createStmt_.getDb(), /* throwIfDoesNotExist */ 
false)", for a query like "create table test_db_10.alltypestiny_id as select id 
from functional.alltypestiny" submitted by a user, e.g., non_owner, that does 
not have any privilege, the following error message will be returned, when the 
target table 'test_db_10.alltypestiny_id' does not exist.

 AuthorizationException: User 'non_owner' does not have privileges to execute 
'CREATE' on: test_db_10

Without this change (patch set 16), the error message was the following.

 AuthorizationException: User 'non_owner' does not have privileges to execute 
'CREATE' on: test_db_10

The error messages are the same before and after this line of change, and in 
this case whether the target table exists or not is not revealed.

This is because Impala's authorization checker processes the CREATE privilege 
on the table 'test_db_10.alltypestiny_id' first during authorization. When the 
requesting user does not have the necessary (CREATE) privilege on this 
(non-existing) table, i.e., hasAccess() returns false, the branch of "privilege 
== Privilege.CREATE" in BaseAuthorizationChecker#checkAccess() would be 
exercised.

  private void checkAccess(AuthorizationContext authzCtx, User user,
      PrivilegeRequest privilegeRequest)
      throws AuthorizationException, InternalException {
    Preconditions.checkNotNull(privilegeRequest);

    if (hasAccess(authzCtx, user, privilegeRequest)) return;
      ...
    } else if (privilege == Privilege.CREATE &&
        privilegeRequest.getAuthorizable().getType() == Type.TABLE) {
      // Creating a table requires CREATE on a database and we shouldn't
      // expose the table name.
      throw new AuthorizationException(String.format(
          "User '%s' does not have privileges%s to execute '%s' on: %s",
          user.getName(), grantOption(privilegeRequest.hasGrantOption()), 
privilege,
          privilegeRequest.getAuthorizable().getDbName()));
    ...
  }

On the other hand, getDb(String dbName, Privilege privilege) in 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
 also does the following.

     addAccessEvent(new TAccessEvent(dbName, TCatalogObjectType.DATABASE,
        privilege.toString()));

So with this line of change, one fewer access event will be logged. I 
guess/think this is not what we want.

Therefore, I am more inclined to do without this line of change in patch set 17.

However, without this change, we will have to think about what privilege 
request to register in getDb(String dbName, Privilege privilege). Note that 
with patch set 16, we would register the following.

 privReq = {PrivilegeRequest@9473}
  authorizable_ = {AuthorizableColumn@9474}
   columnName_ = "*"
   dbName_ = "test_db_10"
   tableName_ = "*"
   ownerUser_ = "admin"
   columns_ = {ArrayList@9475}  size = 0
  privilege_ = {Privilege@7676} "ANY"

This privilege could be thought of as a wildcard privilege request representing 
any column of any table under the database 'test_db_10', which clearly 
contradicts the following code comment in authorize(AuthorizationContext 
authzCtx, AnalysisResult analysisResult, FeCatalog catalog) of 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
 because we do not register any respective wildcard privilege request for the 
table 'test_db_10.*'.

    // Authorize statements that may produce several hierarchical privilege 
requests.
    // Such a statement always has a corresponding table-level privilege 
request if it
    // has column-level privilege request.

We could not catch this with the following Preconditions check as discussed at 
https://gerrit.cloudera.org/c/23569/14/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java#168.

          Preconditions.checkState((requests.isEmpty() ||
              !(privReq.getAuthorizable().getType() == 
Authorizable.Type.COLUMN)) ||
              (requests.get(0).getAuthorizable().getType() == 
Authorizable.Type.TABLE &&
                  requests.get(0).getPrivilege() == Privilege.SELECT));

In 
https://gerrit.cloudera.org/c/23569/18/fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java,
 I proposed a revised Preconditions check, and changed the registered privilege 
request to an ANY request on the database 'test_db_10'. Patch set 18 failed 
some existing authorization-related tests, but I will take a closer look to 
verify if there is some security-related issue about the proposed changes in 
patch set 18.


http://gerrit.cloudera.org:8080/#/c/23569/18/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/23569/18/tests/authorization/test_ranger.py@1521
PS18, Line 1521:     unique_table = unique_name + "_tbl"
> Is it really needed to use a random name? Can't we just use unique_database
Yeah. I think we could do as you suggested.


http://gerrit.cloudera.org:8080/#/c/23569/18/tests/authorization/test_ranger.py@1539
PS18, Line 1539:         admin_client.execute("drop database if exists {0} 
cascade"
               :             .format(unique_database))
> This shouldn't be needed as the name is unique
I guess we always assume that when we are running an end-to-end test that would 
create a 'unique_database', there is not already a database with the same name. 
In this case, we could remove this in the next patch set.


http://gerrit.cloudera.org:8080/#/c/23569/18/tests/authorization/test_ranger.py@1598
PS18, Line 1598:             TestRanger._remove_policies(unique_database, tbl, 
"*")
> Shouldn't we do this in the previous finally block, if the policy was creat
Yes we could move this to the previous finally block. I will address this in 
the next patch set.



--
To view, visit http://gerrit.cloudera.org:8080/23569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ef61801d3b394c56702b193c250492a62b111df
Gerrit-Change-Number: 23569
Gerrit-PatchSet: 18
Gerrit-Owner: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Mon, 12 Jan 2026 22:53:57 +0000
Gerrit-HasComments: Yes

Reply via email to