[
https://issues.apache.org/jira/browse/IMPALA-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117188#comment-17117188
]
ASF subversion and git services commented on IMPALA-9597:
---------------------------------------------------------
Commit 5c69e7ba583297dc886652ac5952816882b928af in impala's branch
refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5c69e7b ]
IMPALA-9597: Eliminate redundant Ranger audits for column masking
After IMPALA-9350, Impala is able to produce the corresponding Ranger
audits when a query involves column masking policies. However,
redundant audit events could be produced because the TableRef
containing a column covered by a column masking policy could be
analyzed more than once when the query itself has to be analyzed more
than once. For example, a query containing a WithClause, or a query
that requires a rewrite followed by a re-analysis phase, would result in
RangerImpalaPlugin#evalDataMaskPolicies() being invoked multiple times,
each producing an audit log entry for the same column.
Moreover, for a query involving column masking policies, the
corresponding audit log entries were still generated even when an
AuthorizationException was thrown in the authorization phase.
This patch fixes both issues by adding post-processing steps after the
analysis of a query to deduplicate the List of AuthzAuditEvents for
column masking policies. Specifically, we stash the deduplicated audit
events after the analysis of the query and add them back only if the
authorization of the query succeeds.
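The deduplicate-then-stash flow described above can be sketched roughly
as follows. This is a minimal illustration, not the code in the patch:
MaskAuditEvent, ColumnMaskAuditBuffer, dedupKey() and the choice of
fields that make two events "the same" are all hypothetical names and
assumptions introduced here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for Ranger's AuthzAuditEvent; only fields used for dedup. */
final class MaskAuditEvent {
  final String resourcePath;  // e.g. "functional/alltypestiny/string_col"
  final String maskType;      // e.g. "MASK"
  final long policyId;

  MaskAuditEvent(String resourcePath, String maskType, long policyId) {
    this.resourcePath = resourcePath;
    this.maskType = maskType;
    this.policyId = policyId;
  }

  /** Assumption of this sketch: events with the same key are duplicates. */
  String dedupKey() {
    return resourcePath + "|" + maskType + "|" + policyId;
  }
}

/** Hypothetical buffer that collects, deduplicates, and conditionally releases events. */
final class ColumnMaskAuditBuffer {
  private final List<MaskAuditEvent> collected = new ArrayList<>();
  private List<MaskAuditEvent> stashed = new ArrayList<>();

  /** Called every time analysis evaluates a masking policy; may see repeats. */
  void record(MaskAuditEvent event) {
    collected.add(event);
  }

  /** Called once after analysis: keep the first occurrence of each event. */
  void stashDeduplicated() {
    Map<String, MaskAuditEvent> unique = new LinkedHashMap<>();
    for (MaskAuditEvent event : collected) {
      unique.putIfAbsent(event.dedupKey(), event);
    }
    stashed = new ArrayList<>(unique.values());
    collected.clear();
  }

  /** Called after authorization: return the stash only if authorization succeeded. */
  List<MaskAuditEvent> release(boolean authorized) {
    List<MaskAuditEvent> result = authorized ? stashed : new ArrayList<MaskAuditEvent>();
    stashed = new ArrayList<>();
    return result;
  }
}
```

With this shape, a query analyzed twice records the same event twice,
but only one copy survives stashDeduplicated(), and nothing is emitted
when release(false) is called after a failed authorization.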
This patch also resolves an inconsistency in how an "Unmasked" policy,
which retains the original column value, is audited. Specifically, when
an "Unmasked" policy is the only column masking policy involved in a
query, RangerAuthorizationChecker#createColumnMask() is not called to
produce the corresponding AuthzAuditEvent, whereas createColumnMask()
is invoked to produce the respective AuthzAuditEvent if there are
policies of other types. Since an "Unmasked" policy does not change the
original column value, we filter out the events whose mask type is
"MASK_NONE", which corresponds to an "Unmasked" policy.
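The "MASK_NONE" filtering step can be pictured as below. This is only
a sketch under stated assumptions: UnmaskedAuditFilter, its nested
Event class, and dropUnmasked() are hypothetical names, and the event
is reduced to a column name and a mask type string.

```java
import java.util.ArrayList;
import java.util.List;

final class UnmaskedAuditFilter {
  /** Mask type that corresponds to an "Unmasked" policy. */
  static final String MASK_NONE = "MASK_NONE";

  /** Hypothetical minimal event: a column name plus its mask type. */
  static final class Event {
    final String column;
    final String maskType;

    Event(String column, String maskType) {
      this.column = column;
      this.maskType = maskType;
    }
  }

  /** Returns a new list without the events produced by "Unmasked" policies. */
  static List<Event> dropUnmasked(List<Event> events) {
    List<Event> kept = new ArrayList<>();
    for (Event event : events) {
      if (!MASK_NONE.equals(event.maskType)) {
        kept.add(event);
      }
    }
    return kept;
  }
}
```

Filtering on the mask type makes the audit output the same whether or
not the "Unmasked" policy appears alongside policies of other types.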
Testing:
- Added three test cases in
RangerAuditLogTest#testAuditsForColumnMasking() to make sure the
issues above are resolved.
- Verified that this patch passes the FE tests in the DEBUG build.
Change-Id: I42d60130fba93d63fbc36949f2bf746b7ae2497d
Reviewed-on: http://gerrit.cloudera.org:8080/15854
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Eliminate redundant Ranger audits when a query involves column masking
> ----------------------------------------------------------------------
>
> Key: IMPALA-9597
> URL: https://issues.apache.org/jira/browse/IMPALA-9597
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Fang-Yu Rao
> Assignee: Fang-Yu Rao
> Priority: Major
>
> After IMPALA-9350, Impala is able to produce the corresponding Ranger audits
> when a query involves column masking policies. However, redundant audit
> events could be produced in some cases.
> For example, Impala currently always generates audit events for column
> masking even when the requesting user is not granted the necessary
> privilege on the specified resource, because
> {{AuthorizationChecker#postAuthorize()}} is always called whether or not an
> {{AuthorizationException}} is thrown.
> Another example is that if a table occurs several times in a query, we get
> duplicate audits for the same column involved in a column masking
> policy. Consider the following query. Since it results in
> 2 calls to {{SelectStmt#analyze()}} on the same table, and there is a
> column masking policy on the column {{string_col}}, we see 2
> duplicate audit events for this column.
> {noformat}
> with iv as (select id, bool_col, string_col from functional.alltypestiny)
> select * from iv;
> {noformat}
> We should thus eliminate the redundant audits in the cases described above.