[
https://issues.apache.org/jira/browse/IMPALA-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306239#comment-17306239
]
ASF subversion and git services commented on IMPALA-9661:
---------------------------------------------------------
Commit c9d7bcb4a1e77deaa431b152b0833e46cb267239 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c9d7bcb ]
IMPALA-9661: Avoid introducing unused columns in table masking view
Previously, if a table had column masking policies, we replaced its
unanalyzed TableRef with an analyzed InlineViewRef (table masking view)
in FromClause.analyze(). However, we can't detect which columns are
actually used in the original query at that point, because analyze()
for SelectList, WhereClause, GroupByClause and other clauses containing
SlotRefs runs after FromClause.analyze(). Only after the whole query
block is analyzed can we get the exact set of required columns.
This patch refactors the code to do table masking after analyze() to
avoid introducing unused columns. Referenced columns of a TableRef are
registered in analyze(), which helps to figure out which columns are
actually needed.
With this, we don't need to revert table masking in FromClause.reset().
The doTableMasking flag in the AST is also removed, since the table mask
is now resolved once, after analyze().
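
The ordering change described above can be illustrated with a toy model.
This is only a sketch of the idea, not Impala's frontend code: ToyTableRef,
register_column and the two columns_for_* helpers are hypothetical names
made up for the illustration. It shows why a masking view built during
FROM-clause analysis has to project every column, while one built after
the whole block is analyzed can use the registered columns only.

```python
# Toy model of the ordering change; none of these names are Impala's real classes.
from dataclasses import dataclass, field


@dataclass
class ToyTableRef:
    name: str
    columns: list                      # all columns of the table, in table order
    referenced: set = field(default_factory=set)

    def register_column(self, col):
        # Called while the select list, WHERE clause, etc. are analyzed.
        if col not in self.columns:
            raise ValueError(f"unknown column: {col}")
        self.referenced.add(col)


def columns_for_eager_masking(ref):
    # Masking during FROM-clause analysis: nothing has been registered yet,
    # so the view has to project every column of the table.
    return list(ref.columns)


def columns_for_deferred_masking(ref):
    # Masking after the whole query block is analyzed: project only the
    # registered columns, keeping table order.
    return [c for c in ref.columns if c in ref.referenced]


tbl = ToyTableRef("tbl", ["id", "name", "address"])
eager = columns_for_eager_masking(tbl)   # decided before the other clauses run

# Analyzing "select name from tbl where id > 10" registers two columns.
for col in ("name", "id"):
    tbl.register_column(col)
deferred = columns_for_deferred_masking(tbl)

print(eager)     # ['id', 'name', 'address']
print(deferred)  # ['id', 'name']
```

In this model the address column only shows up in the eager case, which is
where the extra privilege requirement comes from.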
Tests:
- Add more e2e tests in test_ranger.py
- Run CORE tests
Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Reviewed-on: http://gerrit.cloudera.org:8080/17199
Reviewed-by: Aman Sinha <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Avoid introducing unused columns in table masking view
> ------------------------------------------------------
>
> Key: IMPALA-9661
> URL: https://issues.apache.org/jira/browse/IMPALA-9661
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>
> If a table has column masking policies, we replace its unanalyzed TableRef
> with an analyzed InlineViewRef (table masking view) in FromClause.analyze().
> However, we can't detect which columns are actually used in the original
> query at this point, because analyze() for SelectList, WhereClause,
> GroupByClause and other clauses containing SlotRefs runs after
> FromClause.analyze(). Only after the whole query block is analyzed can we get
> the exact set of required columns. We should do table masking there to avoid
> introducing unused columns.
> To be specific, if table _tbl_(_id_ int, _name_ string, _address_ string) has
> column masking policies that mask columns _name_ and _address_, the
> following query
> {code:sql}
> select name from tbl where id > 10;
> {code}
> will be rewritten to
> {code:sql}
> select name from (
> select id, mask(name) as name, mask(address) as address from tbl
> ) tbl where id > 10;
> {code}
> The rewritten query introduces a requirement for SELECT privilege on the
> _address_ column, which isn't required by the original query. We should fix
> either this or IMPALA-9223.
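
For comparison, here is a rough sketch of the kind of rewrite the issue asks
for, where the masking view projects only the columns the query references.
The helper masking_view_sql and its parameters are hypothetical and only
build the query text; the exact SQL Impala generates after the fix may
differ.

```python
def masking_view_sql(table, columns, masked, referenced):
    """Build the text of an inline masking view that projects only the
    referenced columns (hypothetical helper, for illustration only)."""
    items = []
    for col in columns:
        if col not in referenced:
            continue  # unused column: leave it out of the view entirely
        items.append(f"mask({col}) as {col}" if col in masked else col)
    return f"(select {', '.join(items)} from {table}) {table}"


view = masking_view_sql(
    table="tbl",
    columns=["id", "name", "address"],
    masked={"name", "address"},      # columns with masking policies
    referenced={"id", "name"},       # columns used by the original query
)
query = f"select name from {view} where id > 10"
print(query)
# select name from (select id, mask(name) as name from tbl) tbl where id > 10
```

Because _address_ never appears in the view, the rewritten query in this
sketch needs no privilege beyond what the original query already required.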