Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19194 )

Change subject: IMPALA-10986: Require the SELECT privilege to execute a UDF
......................................................................


Patch Set 15:

(6 comments)

Thanks for the suggestions Qifan! I will prepare a follow-up patch as an 
addendum to this JIRA to address some of the comments in the review.

http://gerrit.cloudera.org:8080/#/c/19194/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19194/15//COMMIT_MSG@7
PS15, Line 7: SELECT
> It seems the proper keyword should be EXECUTE or USE.  SELECT may imply the
Thanks Qifan!

That's a good question and I did not actually check what privileges are 
required in Impala or Hive if we are using a UDF in an INSERT statement, e.g., 
"INSERT INTO test_db_02.test_tbl_01 VALUES (test_db_01.identity(1), "Adam")".

I just briefly checked in a cluster where the Hive shipped with Apache Impala 
was deployed and found that for an INSERT statement as described above, the 
SELECT privilege on the UDF is still required (v.s. the EXECUTE privilege).

0: jdbc:hive2://nightly-7x-jz-1.nightly-7x-jz> insert into 
test_db_02.test_tbl_01 values (test_db_01.identity(2), "Alex");
Error: Error while compiling statement: FAILED: HiveAccessControlException 
Permission denied: user [livy] does not have [SELECT] privilege on 
[test_db_01/identity] (state=42000,code=40000)

On the other hand, I have also verified that with this patch, Impala requires 
the following 2 privileges to execute a UDF in an INSERT statement. The reason 
why those 2 privileges are still registered in query analysis is that under the 
covers SelectStmt#analyze() is still called for an INSERT statement.

1. The SELECT privilege on the UDF.
2. One of the REFRESH, INSERT, SELECT privileges on all the tables and columns 
in the database where the UDF belongs.

One of the purposes of this JIRA is to align Impala with Hive regarding 
authorizing a query that attempts to execute a UDF. Therefore to introduce the 
EXECUTE privilege on UDF's, it may be better to also change the required 
privilege(s) in Hive. The code that makes the authorization decisions resides 
in the Ranger repository. I guess it may take some time for people to make the 
corresponding changes in both Hive and Ranger in order to introduce the EXECUTE 
privilege on UDF's.

At the moment what I could do is to add additional test cases involving UDF 
execution in an INSERT statement as an addendum to this patch.


http://gerrit.cloudera.org:8080/#/c/19194/15/fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
File fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java:

http://gerrit.cloudera.org:8080/#/c/19194/15/fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java@347
PS15, Line 347: dbName_ = path.get(0);
              :       functionName_.setDb(path.get(0));
              :       functionName_.setFunction(path.get(1));
> nit. May just use "*" at RHS or in arguments for readability.
Hm, I am a bit confused here.

Since we would like to deal with the cases of a) `*`.`*`, b) <db_name>.`*`, and 
c) `*`.<fn_name> where <db_name> and <fn_name> are not `*`, simply using "*" at 
the right-hand side will incorrectly set the database name in the case of 
<db_name>.`*` and the function name in the case of `*`.<fn_name>.


http://gerrit.cloudera.org:8080/#/c/19194/15/fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java@358
PS15, Line 358: // Need to set up 'dbName_', which in turn is used to set up 
'db_name' of the
              :     // TPrivilege in createTPrivilege()
> nit. Suggest to move to line 343 to cover both *.* and non *.* cases.
Thanks Qifan!

It seems that we could not move "dbName_ = 
analyzer.getTargetDbName(functionName_)" to line 343 without additional changes 
because for a 'functionName_' not involving any wildcard, we need to make sure 
that 'db_' has been initialized/resolved. However, the field of 'db_' is not 
initialized/resolved at line 343.

Of course one may ask whether it's possible to call 
"functionName_.analyze(analyzer, /* preferBuiltinsDb */ false)" at line 343. 
It's possible but in that case, we will have to change FunctionName#analyze() 
to take into consideration function names in wildcard, which will need more 
work. We may create a follow-up JIRA if we really want to make the 
corresponding code cleaner.


http://gerrit.cloudera.org:8080/#/c/19194/15/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
File 
fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java:

http://gerrit.cloudera.org:8080/#/c/19194/15/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@579
PS15, Line 579:   .error(accessError("functional"),
              :               onUdf("functional", "f", TPrivilegeLevel.SELECT))
> should error() be ok() here, since functional.f is automatically SELECTable
Thanks Qifan!

That is because to execute a UDF in a query, the requesting user still needs 
one of the INSERT, REFRESH, SELECT privileges on all the tables and columns in 
the database where the UDF belongs.

Since the requesting user is only granted the SELECT privilege on the UDF, the 
query is not authorized.


http://gerrit.cloudera.org:8080/#/c/19194/15/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/19194/15/tests/authorization/test_ranger.py@482
PS15, Line 482: a wildcard for functional name
> nit. functional name in wildcard
Thanks Qifan!

I will prepare a follow-up patch as an addendum to address this suggestion.


http://gerrit.cloudera.org:8080/#/c/19194/15/tests/authorization/test_ranger.py@521
PS15, Line 521: # Revoke the granted privilege and verify.
              :       admin_client.execute("revoke select on user_defined_fn 
{0}.{1} from {2} {3}"
              :                            .format(unique_database, udf, kw, 
id))
              :       result = self.client.execute("show grant {0} {1} on 
user_defined_fn {2}.{3}"
              :                                    .format(kw, id, 
unique_database, udf))
              :       TestRanger._check_privileges(result, [])
> Repetition of code  from line 483 to 526.
Thanks Qifan!

Indeed, I will prepare a follow-up patch as an addendum to address the comment.



--
To view, visit http://gerrit.cloudera.org:8080/19194
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e58ba30545ce169786aac279b00c8f6e09ae740
Gerrit-Change-Number: 19194
Gerrit-PatchSet: 15
Gerrit-Owner: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Mon, 12 Dec 2022 07:55:32 +0000
Gerrit-HasComments: Yes

Reply via email to