[
https://issues.apache.org/jira/browse/IMPALA-10192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Fang-Yu Rao reassigned IMPALA-10192:
------------------------------------
Assignee: Fang-Yu Rao
> IllegalStateException in processing column masking audit events
> ---------------------------------------------------------------
>
> Key: IMPALA-10192
> URL: https://issues.apache.org/jira/browse/IMPALA-10192
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Fang-Yu Rao
> Priority: Blocker
> Attachments: Ranger_Policies_IMPALA-10192.json
>
>
> Users reported an IllegalStateException about column masking. I can reproduce
> it in the master branch:
> {code:java}
> I0925 21:42:09.684499 20809 jni-util.cc:288] ed44b3c5ca4a0e7d:8c4e884400000000] java.lang.IllegalStateException
> at com.google.common.base.Preconditions.checkState(Preconditions.java:492)
> at org.apache.impala.authorization.ranger.RangerAuthorizationContext.stashAuditEvents(RangerAuthorizationContext.java:71)
> at org.apache.impala.authorization.ranger.RangerAuthorizationChecker.postAnalyze(RangerAuthorizationChecker.java:373)
> at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:440)
> at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1562)
> at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1529)
> at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1499)
> at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:162)
> {code}
> It happens when a table has several column masking policies and not all of
> them apply to the current user, i.e. some masking policies apply only to
> other users. When the current user then queries the table, the error
> occurs.
> *Reproducing*
> Start an Impala cluster with Ranger authorization enabled.
> {code:java}
> bin/start-impala-cluster.py \
>   --impalad_args="--server-name=server1 --ranger_service_type=hive --ranger_app_id=impala --authorization_provider=ranger" \
>   --catalogd_args="--server-name=server1 --ranger_service_type=hive --ranger_app_id=impala --authorization_provider=ranger"
> {code}
> Create a tmp table using your username.
> {code:java}
> $ bin/impala-shell.sh
> [localhost:21050] default> create table tmp_tbl (id int, name string) stored as parquet;
> {code}
> Open the Ranger WebUI at [http://localhost:6080/]. Add two column masking
> policies:
> * Masking default.tmp_tbl.id using HASH for user "non_owner"
> * Masking default.tmp_tbl.name using REDACT for your username (quanlong in
> my case)
> Refresh the policies in Impala and query the table using your username.
> {code:java}
> bin/impala-shell.sh -u admin -q "refresh authorization"
> bin/impala-shell.sh -q "select * from tmp_tbl"
> {code}
> The last query will fail with "ERROR: IllegalStateException: null".
> The policy file is attached.
> *Clues*
> In RangerAuthorizationContext.stashAuditEvents(), we deduplicate the column
> masking audit events. There is a Precondition check that all events generated
> are column masking events:
>
> [https://github.com/apache/impala/blob/5c69e7ba583297dc886652ac5952816882b928af/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java#L71]
> Code:
> {code:java}
> public void stashAuditEvents(RangerImpalaPlugin plugin) {
>   Set<String> unfilteredMaskNames = plugin.getUnfilteredMaskNames(
>       Arrays.asList("MASK_NONE"));
>   for (AuthzAuditEvent event : auditHandler_.getAuthzEvents()) {
>     // We assume that all the logged events until now are column masking-related. Since
>     // we remove those AuthzAuditEvent's corresponding to the "Unmasked" policy of type
>     // "MASK_NONE", we exclude this type of mask.
>     Preconditions.checkState(unfilteredMaskNames
>         .contains(event.getAccessType().toUpperCase()));
>     // event.getEventKey() is the concatenation of the following fields in an
>     // AuthzAuditEvent: 'user', 'accessType', 'resourcePath', 'resourceType', 'action',
>     // 'accessResult', 'sessionId', and 'clientIP'. Recall that 'resourcePath' is the
>     // concatenation of 'dbName', 'tableName', and 'columnName' that were used to
>     // instantiate a RangerAccessResourceImpl in order to create a RangerAccessRequest
>     // to call RangerImpalaPlugin#evalDataMaskPolicies(). Refer to
>     // RangerAuthorizationChecker#evalColumnMask() for further details.
>     deduplicatedAuditEvents_.put(event.getEventKey(), event);
>   }
>   auditHandler_.getAuthzEvents().clear();
> }
> {code}
> However, it's possible that some SELECT events are generated during the
> analysis phase here:
>
> [https://github.com/apache/impala/blob/5c69e7ba583297dc886652ac5952816882b928af/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L308]
> It looks like if there is a column masking policy on a column and the policy
> does not target the current user, the Ranger plugin generates a SELECT
> audit event. In this case, the first masking policy is on the "id" column for
> user "non_owner", so we get a SELECT event on this column. The second
> masking policy is on the "name" column for the current user, so we get a mask
> event as expected.
> We should handle these non-mask events correctly. In addition, we
> should replace all Precondition checks on the audit code paths with error
> logging, since a failure in auditing should not fail a query.
> cc [~fangyurao]
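A minimal sketch of the suggestion in the description, not the actual Impala fix: instead of asserting that every pending audit event is a masking event, masking events are deduplicated by key and anything else (such as a SELECT event) is logged and set aside. The classes SketchAuditEvent and SketchAuditStash are hypothetical stand-ins for AuthzAuditEvent and RangerAuthorizationContext, the mask names passed in by the caller are illustrative, and keeping non-mask events in a separate list is an assumption about what handling them "correctly" could mean.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical stand-in for Ranger's AuthzAuditEvent; only the two fields
// used by the stashing logic are modeled.
final class SketchAuditEvent {
  final String accessType;
  final String eventKey;

  SketchAuditEvent(String accessType, String eventKey) {
    this.accessType = accessType;
    this.eventKey = eventKey;
  }
}

// Hypothetical stand-in for the stashing logic in RangerAuthorizationContext.
final class SketchAuditStash {
  private static final Logger LOG =
      Logger.getLogger(SketchAuditStash.class.getName());

  // Upper-case mask type names that count as column masking events.
  private final Set<String> maskNames;
  final Map<String, SketchAuditEvent> deduplicatedMaskEvents = new LinkedHashMap<>();
  final List<SketchAuditEvent> otherEvents = new ArrayList<>();

  SketchAuditStash(Set<String> maskNames) {
    this.maskNames = maskNames;
  }

  // Drains 'pending'. Masking events are deduplicated by event key; any other
  // event is logged and kept aside (an assumption) rather than failing the query.
  void stash(List<SketchAuditEvent> pending) {
    for (SketchAuditEvent event : pending) {
      String type = event.accessType.toUpperCase(Locale.ROOT);
      if (maskNames.contains(type)) {
        deduplicatedMaskEvents.put(event.eventKey, event);
      } else {
        LOG.warning("Non-mask audit event seen while stashing: " + event.accessType);
        otherEvents.add(event);
      }
    }
    pending.clear();
  }
}
```

With the two policies from the reproduction steps, a stash built with mask names such as "MASK" and "MASK_HASH" would receive one SELECT event for the "id" column and one mask event for the "name" column. The SELECT event would end up in otherEvents with a warning in the log, and the query would proceed.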
--
This message was sent by Atlassian Jira
(v8.3.4#803005)