Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19194 )

Change subject: IMPALA-10986: Require the SELECT privilege to execute a UDF
......................................................................


Patch Set 7:

(5 comments)

Hi all, I have provided the responses to Csaba's questions in the previous 
iteration. Sorry about the lengthy responses. It is not straightforward to 
answer them. It took me a while to look for the answer to each question.

Let me know if you have any addition suggestion or comment. Thank you very much 
for the help!

http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG@7
PS7, Line 7: execute
> Does this also affect create/drop function? What privilege is needed in tha
Thanks Csaba!

This patch does not affect the required privileges for CREATE/DROP FUNCTION and 
SHOW FUNCTIONS.

For a CREATE FUNCTION query like "create function test_db.identity(bigint) 
returns bigint location '/test-warehouse/impala-hive-udfs.jar' 
symbol='org.apache.impala.TestUdf'", the requesting user has to have a) CREATE 
privilege on 'identity' and b) ALL privilege on the URL/URI 
"/test-warehouse/impala-hive-udfs.jar". The CREATE privilege is registered in 
CreateFunctionStmtBase#analyze() at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java#L147.

For a DROP FUNCTION query, the requesting user has to have the DROP privilege 
on 'identity', which is registered in DropFunctionStmt#analyze() at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/DropFunctionStmt.java#L110.

For a SHOW FUNCTIONS query, the requesting user has to have the ANY privilege 
(any one in ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT, REFRESH) on any 
column in any table in the database, which is registered in 
ShowFunctionsStmt#analyze() at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ShowFunctionsStmt.java#L83.
 "Any column" access is special if Ranger is the authorization provider. Refer 
to 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L131-L134
 for more details.


http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG@16
PS7, Line 16: After this patch, the user has to be granted the SELECT privilege 
on the
            : UDF as well to execute the UDF.
> Hive works like this since a long time, right? Can you mention this in the
According to my current understanding, Hive does not directly decides what 
privilege a requesting user has to be granted to execute a UDF. Such a decision 
is made by Ranger if Ranger is the authorization provider. Hive calls 
ss.getAuthorizerV2().checkPrivileges() at 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/command/CommandAuthorizerV2.java#L83,
 which in turn calls RangerHiveAuthorizer#checkPrivileges() in Ranger's plugin.

checkPrivileges() is where Ranger decides whether the requesting user is 
allowed to access the corresponding resources in a query. To check whether the 
requesting user is authorized to SELECT a non-column resource, 
hivePlugin.isAccessAllowed(request, auditHandler) at 
https://github.com/apache/ranger/blob/master/hive-agent/src/main/java/org/apache/ranger/authorization/hive/authorizer/RangerHiveAuthorizer.java#L1053
 is invoked.

Before hivePlugin.isAccessAllowed(), RangerHiveAuthorizer#getHiveResource() at 
https://github.com/apache/ranger/blob/master/hive-agent/src/main/java/org/apache/ranger/authorization/hive/authorizer/RangerHiveAuthorizer.java#L1531-L1546
 decided what resource to check to execute a UDF in a SELECT query, which would 
be <db_name>.<udf_name> in our case here.

As to when Ranger started to behave this way, I found that some corresponding 
code was already there in RANGER-194. Refer to 
https://issues.apache.org/jira/secure/attachment/12686721/0001-RANGER-194-Rename-packages-classes-to-apache-ranger-.patch
 for further details. Specifically, look for the method 
HiveAuthDB#isAccessAllowed() where isUDFAccessAllowed() is called. This shows 
that Ranger explicitly required the privilege on the UDF in Hive some time 
around 2015.


http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG@24
PS7, Line 24: GRANT SELECT ON USER_DEFINED_FN <db_name>.<udf_name> TO USER 
<user_name>
> Are the privileges created this way compatible with Hive, so enable/disable
As mentioned above, the authorization decision is mainly made by Ranger if 
Ranger is the authorization provider in Hive.

What this command does is to add the corresponding policy in Ranger's policy 
repository so that the user <user_name> will be allowed to execute the UDF 
<db_name>.<udf_name> in a SELECT query. Of course we could also add such a 
policy manually via Ranger's web UI. Without such a policy being created, in 
Hive's beeline, there will be an authorization exception like the following.

0: jdbc:hive2://nightly-7x-jy-3.nightly-7x-jy> select <db_name>.<udf_name>(17);
Error: Error while compiling statement: FAILED: HiveAccessControlException 
Permission denied: user [<user_name>] does not have [SELECT] privilege on 
[<db_name>/<udf_name>] (state=42000,code=40000)

I have to mention that Hive currently does not support GRANT/REVOKE command for 
privileges on UDF's because we could not find UDF as one of the possible 
privilegeObject's. Refer to Hive's parser definition at 
https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g#L1500-L1519.
 That means currently a Hive administrator can only manage privileges on UDF's 
via Ranger's web UI or Ranger's REST API's.


http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG@34
PS7, Line 34:
> Are there OWNER privileges on UDFs? Does Hive (or after this patch Impala)
I have verified that in Cloudera's distribution of Hive (the one we provide for 
Apache Impala development), a user, e.g., livy, still has to be explicitly 
granted the SELECT privilege on a UDF to execute the UDF even though the user 
created that UDF. Otherwise we would still see the following error message. It 
looks the OWNER privileges on UDF's are not supported in Cloudera's 
distribution of Hive.

0: jdbc:hive2://nightly-7x-jy-3.nightly-7x-jy> select test_db_02.identity(13);
Error: Error while compiling statement: FAILED: HiveAccessControlException 
Permission denied: user [livy] does not have [SELECT] privilege on 
[test_db_02/identity] (state=42000,code=40000)

Moreover, the OWNER privileges on UDF's are not supported in Impala after this 
patch. A user still has to be explicitly granted the SELECT privilege on a UDF 
to execute the UDF even though the user created that UDF. According to 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/Authorizable.java#L63-L65,
 ownership is applicable only for database and table objects.

To be more specific, when creating an instance of TCreateFunctionParams in 
CreateFunctionStmtBase#toThrift() at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java#L85,
 we do not set the field of owner as we do in CreateTableStmt#toThrift() at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java#L210.
 This indicates that Impala does not pass the ownership information of UDF to 
Hive metastore when creating the Thrift request to create the UDF. Hence I 
think Impala is not able to know the owner user of a function when creating the 
AuthorizableFn that will be used to create the corresponding Ranger resource 
later in RangerAuthorizationChecker#authorizeResource() at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L150l-L154.
 As a result, Ranger won't be able to know the owner of the UDF in the query.

It is not immediately clear to me whether Impala could just simply create a 
field of owner in TCreateFunctionParams without the corresponding change in 
Hive metastore.


http://gerrit.cloudera.org:8080/#/c/19194/7/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
File 
fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java:

http://gerrit.cloudera.org:8080/#/c/19194/7/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@572
PS7, Line 572: onUdf
> Can you add a positive case when there is an onUdf privilege for f but no d
I am a bit confused here.

If the requesting user is granted the SELECT privilege on the UDF in a SELECT 
query but not granted the SELECT privilege on the database, then such a query 
will be rejected. That is, we could not add a positive test case.

In addition, the (negative) test case like this was already added. It is the 
immediately preceding test case as shown in the following.

          .error(accessError("functional"),
              onUdf("functional", "f", TPrivilegeLevel.SELECT))

Did I misunderstand something here?



--
To view, visit http://gerrit.cloudera.org:8080/19194
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e58ba30545ce169786aac279b00c8f6e09ae740
Gerrit-Change-Number: 19194
Gerrit-PatchSet: 7
Gerrit-Owner: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Wed, 23 Nov 2022 08:16:40 +0000
Gerrit-HasComments: Yes

Reply via email to