[
https://issues.apache.org/jira/browse/IMPALA-10192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210506#comment-17210506
]
ASF subversion and git services commented on IMPALA-10192:
----------------------------------------------------------
Commit cd51d031188ddd805042b8df72eb8ef59496a546 in impala's branch
refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=cd51d03 ]
IMPALA-10192: Filter out redundant AuthzAuditEvent's for column masking
We found that Ranger would generate an AuthzAuditEvent as long as
there exists a column masking policy corresponding to the column
even though the policy does not apply to the requesting user. This
resulted in an IllegalStateException if a user "A" submits a SELECT
query against a table that has a column specified in a column masking
policy when the policy does not apply to "A", i.e., the field of
'Select User' for this policy in the Ranger web UI does not contain "A".
For such an AuthzAuditEvent, its field of 'accessType' will not be one
of the supported mask types since its corresponding
accessResult.isMaskEnabled() would evaluate to false, indicating that
there is no matching column masking policy associated with the user "A"
and thus the AuthzAuditEvent will not be post-processed by Impala in
RangerAuthorizationChecker#createColumnMask(). But since we did not
filter out such an AuthzAuditEvent when it was generated and returned
from RangerBasePlugin#evalDataMaskPolicies(), we failed the check that
requires every AuthzAuditEvent be column masking-related in
RangerAuthorizationContext#stashAuditEvents().
To address this issue, in this patch we filter out such an
AuthzAuditEvent after each call to
RangerBasePlugin#evalDataMaskPolicies() so that no redundant
AuthzAuditEvent is generated.
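
The filtering step described above can be illustrated with a minimal,
self-contained sketch. It is not the code from this patch: AuditEvent,
MaskPolicy, evalDataMask() and evalAndFilter() are hypothetical stand-ins for
the Ranger and Impala classes, and the mask type names are only illustrative.
The sketch assumes that evaluating a policy always appends an event to the log
and that the caller drops whatever was appended when no mask applies to the
requesting user.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins only; these are not the Ranger or Impala classes. */
final class MaskAuditFilterSketch {

  /** Simplified audit event carrying only the fields this sketch needs. */
  static final class AuditEvent {
    final String user;
    final String resourcePath;
    final String accessType;

    AuditEvent(String user, String resourcePath, String accessType) {
      this.user = user;
      this.resourcePath = resourcePath;
      this.accessType = accessType;
    }
  }

  /** Hypothetical masking policy: one column, one target user, one mask type. */
  static final class MaskPolicy {
    final String column;
    final String user;
    final String maskType;

    MaskPolicy(String column, String user, String maskType) {
      this.column = column;
      this.user = user;
      this.maskType = maskType;
    }
  }

  /**
   * Simulates the behavior described in the commit message: an event is logged
   * whenever a policy exists for the column, even if it targets another user.
   * Returns true only when the policy applies to the requesting user.
   */
  static boolean evalDataMask(MaskPolicy policy, String user, List<AuditEvent> log) {
    boolean maskEnabled = policy.user.equals(user);
    String accessType = maskEnabled ? policy.maskType : "select";
    log.add(new AuditEvent(user, policy.column, accessType));
    return maskEnabled;
  }

  /** Evaluates the policy, then removes the events it logged if no mask applies. */
  static boolean evalAndFilter(MaskPolicy policy, String user, List<AuditEvent> log) {
    int sizeBefore = log.size();
    boolean maskEnabled = evalDataMask(policy, user, log);
    if (!maskEnabled) {
      // Drop everything appended by this evaluation; earlier events stay.
      log.subList(sizeBefore, log.size()).clear();
    }
    return maskEnabled;
  }

  public static void main(String[] args) {
    List<AuditEvent> log = new ArrayList<>();
    // Policy for another user: its event is filtered out.
    evalAndFilter(new MaskPolicy("default.tmp_tbl.id", "non_owner", "MASK_HASH"),
        "quanlong", log);
    // Policy for the requesting user: its event is kept.
    evalAndFilter(new MaskPolicy("default.tmp_tbl.name", "quanlong", "MASK"),
        "quanlong", log);
    for (AuditEvent e : log) {
      System.out.println(e.resourcePath + " " + e.accessType);
    }
  }
}
```

With the two policies from the reproduction steps, only the event for the
"name" column remains in the log, so a later check that every stashed event is
masking-related would hold.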
Testing:
- Added a new column masking policy associated with a non-matching user
in RangerAuditLogTest#testAuditsForColumnMasking() to verify that
the redundant AuthzAuditEvent is removed.
- Verified that the patch passes the exhaustive tests in the DEBUG
build.
Change-Id: I1dbf65874003523b5176680e42f26fa2114c229b
Reviewed-on: http://gerrit.cloudera.org:8080/16524
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> IllegalStateException in processing column masking audit events
> ---------------------------------------------------------------
>
> Key: IMPALA-10192
> URL: https://issues.apache.org/jira/browse/IMPALA-10192
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Fang-Yu Rao
> Priority: Blocker
> Attachments: Ranger_Policies_IMPALA-10192.json
>
>
> Users reported an IllegalStateException about column masking. I can reproduce
> it in the master branch:
> {code:java}
> I0925 21:42:09.684499 20809 jni-util.cc:288] ed44b3c5ca4a0e7d:8c4e884400000000] java.lang.IllegalStateException
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:492)
>     at org.apache.impala.authorization.ranger.RangerAuthorizationContext.stashAuditEvents(RangerAuthorizationContext.java:71)
>     at org.apache.impala.authorization.ranger.RangerAuthorizationChecker.postAnalyze(RangerAuthorizationChecker.java:373)
>     at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:440)
>     at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1562)
>     at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1529)
>     at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1499)
>     at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:162)
> {code}
> It happens when there are several column masking policies on a table and not
> all of them apply to the current user, i.e. some masking policies exist
> and apply to other users. If the current user then queries the table, the
> error occurs.
> *Reproducing*
> Start Impala cluster with Ranger authz enabled
> {code:java}
> bin/start-impala-cluster.py \
>   --impalad_args="--server-name=server1 --ranger_service_type=hive --ranger_app_id=impala --authorization_provider=ranger" \
>   --catalogd_args="--server-name=server1 --ranger_service_type=hive --ranger_app_id=impala --authorization_provider=ranger"
> {code}
> Create a tmp table using your username.
> {code:java}
> $ bin/impala-shell.sh
> [localhost:21050] default> create table tmp_tbl (id int, name string) stored as parquet;
> {code}
> Open the Ranger WebUI at [http://localhost:6080/]. Add two column masking
> policies:
> * Masking default.tmp_tbl.id using HASH for user "non_owner"
> * Masking default.tmp_tbl.name using REDACT for your username (quanlong in
> my case)
> Refresh the policies in Impala and query the table using your username.
> {code:java}
> bin/impala-shell.sh -u admin -q "refresh authorization"
> bin/impala-shell.sh -q "select * from tmp_tbl"
> {code}
> The last query will fail with "ERROR: IllegalStateException: null".
> The policy file is attached.
> *Clues*
> In RangerAuthorizationContext.stashAuditEvents(), we deduplicate the column
> masking audit events. There is a Precondition check that all events generated
> are column masking events:
>
> [https://github.com/apache/impala/blob/5c69e7ba583297dc886652ac5952816882b928af/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java#L71]
> Code:
> {code:java}
> public void stashAuditEvents(RangerImpalaPlugin plugin) {
>   Set<String> unfilteredMaskNames = plugin.getUnfilteredMaskNames(
>       Arrays.asList("MASK_NONE"));
>   for (AuthzAuditEvent event : auditHandler_.getAuthzEvents()) {
>     // We assume that all the logged events until now are column masking-related. Since
>     // we remove those AuthzAuditEvent's corresponding to the "Unmasked" policy of type
>     // "MASK_NONE", we exclude this type of mask.
>     Preconditions.checkState(unfilteredMaskNames
>         .contains(event.getAccessType().toUpperCase()));
>     // event.getEventKey() is the concatenation of the following fields in an
>     // AuthzAuditEvent: 'user', 'accessType', 'resourcePath', 'resourceType', 'action',
>     // 'accessResult', 'sessionId', and 'clientIP'. Recall that 'resourcePath' is the
>     // concatenation of 'dbName', 'tableName', and 'columnName' that were used to
>     // instantiate a RangerAccessResourceImpl in order to create a RangerAccessRequest
>     // to call RangerImpalaPlugin#evalDataMaskPolicies(). Refer to
>     // RangerAuthorizationChecker#evalColumnMask() for further details.
>     deduplicatedAuditEvents_.put(event.getEventKey(), event);
>   }
>   auditHandler_.getAuthzEvents().clear();
> }
> {code}
> However, it's possible that some SELECT events are generated during the
> analysis phase here:
>
> [https://github.com/apache/impala/blob/5c69e7ba583297dc886652ac5952816882b928af/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L308]
> It looks like the Ranger plugin generates a SELECT audit event if there is a
> column masking policy on a column and the policy does not target the current
> user. In this case, the first masking policy is on the "id" column for user
> "non_owner", so we get a SELECT event on this column. The second masking
> policy is on the "name" column for the current user, so we get a mask event
> as expected.
> We should deal with these non-mask events correctly. In addition, we
> should replace all Precondition checks on the audit code paths with error
> logging, since these checks should not fail a query.
> cc [~fangyurao]
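
The suggestion in the issue to log instead of failing on the audit path could
look roughly like the following sketch. It is only an illustration under
stated assumptions: AuditEvent, key() and stash() are hypothetical names, not
Impala's classes, the key is a simplified version of the event key described
in the issue, and skipping a non-masking event after logging it is one
possible policy rather than a confirmed design.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

/** Hypothetical sketch; not the RangerAuthorizationContext implementation. */
final class AuditStashSketch {
  private static final Logger LOG = Logger.getLogger(AuditStashSketch.class.getName());

  /** Simplified audit event carrying only the fields this sketch needs. */
  static final class AuditEvent {
    final String user;
    final String accessType;
    final String resourcePath;

    AuditEvent(String user, String accessType, String resourcePath) {
      this.user = user;
      this.accessType = accessType;
      this.resourcePath = resourcePath;
    }

    /** Simplified deduplication key built from the fields above. */
    String key() {
      return user + "|" + accessType + "|" + resourcePath;
    }
  }

  /**
   * Deduplicates masking events by key. An event whose access type is not a
   * known mask name is logged and skipped instead of failing the query.
   * The pending list is cleared afterwards, so it must be mutable.
   */
  static Map<String, AuditEvent> stash(List<AuditEvent> pending, Set<String> maskNames) {
    Map<String, AuditEvent> deduplicated = new LinkedHashMap<>();
    for (AuditEvent event : pending) {
      if (!maskNames.contains(event.accessType.toUpperCase(Locale.ROOT))) {
        LOG.warning("Skipping non-masking audit event: " + event.key());
        continue;
      }
      deduplicated.put(event.key(), event);
    }
    pending.clear();
    return deduplicated;
  }
}
```

In this sketch a stray SELECT event produces a warning in the log while the
masking events are still deduplicated and kept.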