[ 
https://issues.apache.org/jira/browse/IMPALA-10192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-10192:
------------------------------------
    Priority: Blocker  (was: Major)

> IllegalStateException in processing column masking audit events
> ---------------------------------------------------------------
>
>                 Key: IMPALA-10192
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10192
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Priority: Blocker
>
> Users reported an IllegalStateException about column masking. I can reproduce 
> it in the master branch:
> {code:java}
> I0925 21:42:09.684499 20809 jni-util.cc:288] ed44b3c5ca4a0e7d:8c4e884400000000] java.lang.IllegalStateException
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:492)
>         at org.apache.impala.authorization.ranger.RangerAuthorizationContext.stashAuditEvents(RangerAuthorizationContext.java:71)
>         at org.apache.impala.authorization.ranger.RangerAuthorizationChecker.postAnalyze(RangerAuthorizationChecker.java:373)
>         at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:440)
>         at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1562)
>         at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1529)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1499)
>         at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:162)
> {code}
> *Reproducing*
>  Start an Impala cluster with Ranger authorization enabled:
> {code:java}
> bin/start-impala-cluster.py \
>   --impalad_args="--server-name=server1 --ranger_service_type=hive --ranger_app_id=impala --authorization_provider=ranger" \
>   --catalogd_args="--server-name=server1 --ranger_service_type=hive --ranger_app_id=impala --authorization_provider=ranger"
> {code}
> Create a table named tmp_tbl as your own user:
> {code:java}
> $ bin/impala-shell.sh
> [localhost:21050] default> create table tmp_tbl (id int, name string) stored as parquet;
> {code}
> Open the Ranger WebUI at [http://localhost:6080/] and add two column masking 
> policies:
>  * Mask default.tmp_tbl.id using HASH for user "non_owner"
>  * Mask default.tmp_tbl.name using REDACT for your username (quanlong in my case)
> Refresh the policies in Impala and query the table using your username:
> {code:java}
> bin/impala-shell.sh -u admin -q "refresh authorization"
> bin/impala-shell.sh -q "select * from tmp_tbl"
> {code}
> The last query will fail with "ERROR: IllegalStateException: null".
> *Clues*
> In RangerAuthorizationContext.stashAuditEvents(), we deduplicate the column 
> masking audit events. There is a Preconditions check that all events 
> generated so far are column masking events:
>  
> [https://github.com/apache/impala/blob/5c69e7ba583297dc886652ac5952816882b928af/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java#L71]
>  Code:
> {code:java}
>   public void stashAuditEvents(RangerImpalaPlugin plugin) {
>     Set<String> unfilteredMaskNames = plugin.getUnfilteredMaskNames(
>         Arrays.asList("MASK_NONE"));
>     for (AuthzAuditEvent event : auditHandler_.getAuthzEvents()) {
>       // We assume that all the logged events until now are column masking-related. Since
>       // we remove those AuthzAuditEvent's corresponding to the "Unmasked" policy of type
>       // "MASK_NONE", we exclude this type of mask.
>       Preconditions.checkState(unfilteredMaskNames
>           .contains(event.getAccessType().toUpperCase()));
>       // event.getEventKey() is the concatenation of the following fields in an
>       // AuthzAuditEvent: 'user', 'accessType', 'resourcePath', 'resourceType', 'action',
>       // 'accessResult', 'sessionId', and 'clientIP'. Recall that 'resourcePath' is the
>       // concatenation of 'dbName', 'tableName', and 'columnName' that were used to
>       // instantiate a RangerAccessResourceImpl in order to create a RangerAccessRequest
>       // to call RangerImpalaPlugin#evalDataMaskPolicies(). Refer to
>       // RangerAuthorizationChecker#evalColumnMask() for further details.
>       deduplicatedAuditEvents_.put(event.getEventKey(), event);
>     }
>     auditHandler_.getAuthzEvents().clear();
>   }
> {code}
> However, it's possible that some SELECT events are generated during the 
> analysis phase here:
>  
> [https://github.com/apache/impala/blob/5c69e7ba583297dc886652ac5952816882b928af/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L308]
>  It looks like if there is a column masking policy on a column and the policy 
> does not apply to the current user, the Ranger plugin generates a SELECT 
> audit event. In this case, the first masking policy is on the "id" column for 
> user "non_owner", so we get a SELECT event on this column. The second 
> masking policy is on the "name" column for the current user, so we get a mask 
> event as expected.
> We should handle these non-mask events correctly. In addition, we should 
> replace all Preconditions checks on the audit code paths with error logging, 
> since a failed audit check should not fail a query.
> cc [~fangyurao]
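
To make the suggestion above concrete, here is a minimal, self-contained sketch of a stash step that tolerates non-mask events. It is not the actual Impala patch. The Event class is a hypothetical, simplified stand-in for Ranger's AuthzAuditEvent, the class and method names are made up for illustration, and the mask names "MASK" and "MASK_HASH" are only example values. Keeping the non-mask events in a separate list is also an assumption; the report does not say what should happen to them beyond not failing the query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

// Illustrative sketch only: none of these types are Impala or Ranger classes.
final class AuditEventStashSketch {
  private static final Logger LOG =
      Logger.getLogger(AuditEventStashSketch.class.getName());

  // Hypothetical, simplified stand-in for Ranger's AuthzAuditEvent.
  static final class Event {
    final String user;
    final String accessType;
    final String resourcePath;

    Event(String user, String accessType, String resourcePath) {
      this.user = user;
      this.accessType = accessType;
      this.resourcePath = resourcePath;
    }

    // Simplified key; the real event key concatenates more fields.
    String getEventKey() {
      return user + "|" + accessType + "|" + resourcePath;
    }
  }

  private final Map<String, Event> deduplicatedMaskEvents = new LinkedHashMap<>();
  private final List<Event> nonMaskEvents = new ArrayList<>();

  // Moves pending events into the stash. Mask events are deduplicated by key.
  // Any other event is logged and kept separately instead of throwing.
  void stash(List<Event> pending, Set<String> maskNames) {
    for (Event event : pending) {
      String type = event.accessType == null
          ? "" : event.accessType.toUpperCase(Locale.ROOT);
      if (maskNames.contains(type)) {
        deduplicatedMaskEvents.put(event.getEventKey(), event);
      } else {
        LOG.warning("Unexpected non-mask audit event: " + event.getEventKey());
        nonMaskEvents.add(event);
      }
    }
    pending.clear();
  }

  Map<String, Event> getDeduplicatedMaskEvents() { return deduplicatedMaskEvents; }

  List<Event> getNonMaskEvents() { return nonMaskEvents; }

  public static void main(String[] args) {
    // Mirrors the repro: a SELECT event on "id", a mask event on "name" (seen twice).
    List<Event> pending = new ArrayList<>(Arrays.asList(
        new Event("quanlong", "select", "default/tmp_tbl/id"),
        new Event("quanlong", "mask", "default/tmp_tbl/name"),
        new Event("quanlong", "mask", "default/tmp_tbl/name")));
    Set<String> maskNames = new HashSet<>(Arrays.asList("MASK", "MASK_HASH"));

    AuditEventStashSketch sketch = new AuditEventStashSketch();
    sketch.stash(pending, maskNames);
    System.out.println("mask events: " + sketch.getDeduplicatedMaskEvents().size());
    System.out.println("non-mask events: " + sketch.getNonMaskEvents().size());
  }
}
```

The point of the sketch is the branch in stash(): an unexpected access type produces a log line rather than an IllegalStateException, so a surprise on the audit path no longer fails the query.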


