Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/19194 )
Change subject: IMPALA-10986: Require the SELECT privilege to execute a UDF ...................................................................... Patch Set 7: (5 comments) Hi all, I have provided the responses to Csaba's questions in the previous iteration. Sorry about the lengthy responses. It is not straightforward to answer them. It took me a while to look for the answer to each question. Let me know if you have any addition suggestion or comment. Thank you very much for the help! http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG@7 PS7, Line 7: execute > Does this also affect create/drop function? What privilege is needed in tha Thanks Csaba! This patch does not affect the required privileges for CREATE/DROP FUNCTION and SHOW FUNCTIONS. For a CREATE FUNCTION query like "create function test_db.identity(bigint) returns bigint location '/test-warehouse/impala-hive-udfs.jar' symbol='org.apache.impala.TestUdf'", the requesting user has to have a) CREATE privilege on 'identity' and b) ALL privilege on the URL/URI "/test-warehouse/impala-hive-udfs.jar". The CREATE privilege is registered in CreateFunctionStmtBase#analyze() at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java#L147. For a DROP FUNCTION query, the requesting user has to have the DROP privilege on 'identity', which is registered in DropFunctionStmt#analyze() at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/DropFunctionStmt.java#L110. For a SHOW FUNCTIONS query, the requesting user has to have the ANY privilege (any one in ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT, REFRESH) on any column in any table in the database, which is registered in ShowFunctionsStmt#analyze() at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/ShowFunctionsStmt.java#L83. "Any column" access is special if Ranger is the authorization provider. Refer to https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L131-L134 for more details. http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG@16 PS7, Line 16: After this patch, the user has to be granted the SELECT privilege on the : UDF as well to execute the UDF. > Hive works like this since a long time, right? Can you mention this in the According to my current understanding, Hive does not directly decides what privilege a requesting user has to be granted to execute a UDF. Such a decision is made by Ranger if Ranger is the authorization provider. Hive calls ss.getAuthorizerV2().checkPrivileges() at https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/command/CommandAuthorizerV2.java#L83, which in turn calls RangerHiveAuthorizer#checkPrivileges() in Ranger's plugin. checkPrivileges() is where Ranger decides whether the requesting user is allowed to access the corresponding resources in a query. To check whether the requesting user is authorized to SELECT a non-column resource, hivePlugin.isAccessAllowed(request, auditHandler) at https://github.com/apache/ranger/blob/master/hive-agent/src/main/java/org/apache/ranger/authorization/hive/authorizer/RangerHiveAuthorizer.java#L1053 is invoked. Before hivePlugin.isAccessAllowed(), RangerHiveAuthorizer#getHiveResource() at https://github.com/apache/ranger/blob/master/hive-agent/src/main/java/org/apache/ranger/authorization/hive/authorizer/RangerHiveAuthorizer.java#L1531-L1546 decided what resource to check to execute a UDF in a SELECT query, which would be <db_name>.<udf_name> in our case here. As to when Ranger started to behave this way, I found that some corresponding code was already there in RANGER-194. Refer to https://issues.apache.org/jira/secure/attachment/12686721/0001-RANGER-194-Rename-packages-classes-to-apache-ranger-.patch for further details. Specifically, look for the method HiveAuthDB#isAccessAllowed() where isUDFAccessAllowed() is called. This shows that Ranger explicitly required the privilege on the UDF in Hive some time around 2015. http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG@24 PS7, Line 24: GRANT SELECT ON USER_DEFINED_FN <db_name>.<udf_name> TO USER <user_name> > Are the privileges created this way compatible with Hive, so enable/disable As mentioned above, the authorization decision is mainly made by Ranger if Ranger is the authorization provider in Hive. What this command does is to add the corresponding policy in Ranger's policy repository so that the user <user_name> will be allowed to execute the UDF <db_name>.<udf_name> in a SELECT query. Of course we could also add such a policy manually via Ranger's web UI. Without such a policy being created, in Hive's beeline, there will be an authorization exception like the following. 0: jdbc:hive2://nightly-7x-jy-3.nightly-7x-jy> select <db_name>.<udf_name>(17); Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [<user_name>] does not have [SELECT] privilege on [<db_name>/<udf_name>] (state=42000,code=40000) I have to mention that Hive currently does not support GRANT/REVOKE command for privileges on UDF's because we could not find UDF as one of the possible privilegeObject's. Refer to Hive's parser definition at https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g#L1500-L1519. That means currently a Hive administrator can only manage privileges on UDF's via Ranger's web UI or Ranger's REST API's. http://gerrit.cloudera.org:8080/#/c/19194/7//COMMIT_MSG@34 PS7, Line 34: > Are there OWNER privileges on UDFs? Does Hive (or after this patch Impala) I have verified that in Cloudera's distribution of Hive (the one we provide for Apache Impala development), a user, e.g., livy, still has to be explicitly granted the SELECT privilege on a UDF to execute the UDF even though the user created that UDF. Otherwise we would still see the following error message. It looks the OWNER privileges on UDF's are not supported in Cloudera's distribution of Hive. 0: jdbc:hive2://nightly-7x-jy-3.nightly-7x-jy> select test_db_02.identity(13); Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [livy] does not have [SELECT] privilege on [test_db_02/identity] (state=42000,code=40000) Moreover, the OWNER privileges on UDF's are not supported in Impala after this patch. A user still has to be explicitly granted the SELECT privilege on a UDF to execute the UDF even though the user created that UDF. According to https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/Authorizable.java#L63-L65, ownership is applicable only for database and table objects. To be more specific, when creating an instance of TCreateFunctionParams in CreateFunctionStmtBase#toThrift() at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java#L85, we do not set the field of owner as we do in CreateTableStmt#toThrift() at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java#L210. This indicates that Impala does not pass the ownership information of UDF to Hive metastore when creating the Thrift request to create the UDF. Hence I think Impala is not able to know the owner user of a function when creating the AuthorizableFn that will be used to create the corresponding Ranger resource later in RangerAuthorizationChecker#authorizeResource() at https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L150l-L154. As a result, Ranger won't be able to know the owner of the UDF in the query. It is not immediately clear to me whether Impala could just simply create a field of owner in TCreateFunctionParams without the corresponding change in Hive metastore. http://gerrit.cloudera.org:8080/#/c/19194/7/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java File fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java: http://gerrit.cloudera.org:8080/#/c/19194/7/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@572 PS7, Line 572: onUdf > Can you add a positive case when there is an onUdf privilege for f but no d I am a bit confused here. If the requesting user is granted the SELECT privilege on the UDF in a SELECT query but not granted the SELECT privilege on the database, then such a query will be rejected. That is, we could not add a positive test case. In addition, the (negative) test case like this was already added. It is the immediately preceding test case as shown in the following. .error(accessError("functional"), onUdf("functional", "f", TPrivilegeLevel.SELECT)) Did I misunderstand something here? -- To view, visit http://gerrit.cloudera.org:8080/19194 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5e58ba30545ce169786aac279b00c8f6e09ae740 Gerrit-Change-Number: 19194 Gerrit-PatchSet: 7 Gerrit-Owner: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Wed, 23 Nov 2022 08:16:40 +0000 Gerrit-HasComments: Yes
