Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
File fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@175
PS2, Line 175:     // TODO(IMPALA-10483): Column-masking/Row-filtering 
expressions may have subqueries
Done. Added IMPALA-10483 in the comment.

> I think we might also have to be careful about exactly what subqueries are 
> allowed. E.g. if would be weird if you had a correlated references between 
> the row filter subquery and the outer query, or if you could have a relative 
> table reference that referred to different tables depending on which DB it 
> was executed in.

Yeah, currently we depend on users to verify the rules. I think Hive does so. 
Copy this to comments of IMPALA-10483 so we can take care of it once we support 
subquery filters.


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@473
PS2, Line 473: given tables. This is used
> This is inaccurate - this method doesn't check whether row filtering is ena
Done


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@476
PS2, Line 476: throwIfRowFilterin
> Rename this method to reflect new behaviour, e.g. throwIfRowFilteringRequir
Done. Also rename the authorizeColumnMask() method name.


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java@139
PS2, Line 139:       // TODO: Whether we should use lowercase or uppercase 
accessType?
> How would we decide this? What does Hive do?
AccessTypes (e.g. SELECT, INSERT, MASK, ROW_FILTER) in Hive audit events are 
all in uppercase. They inherit from some constant strings witch are uppercase, 
e.g. result.getMaskType().

However, AccessTypes in Impala audit events are always converted to lowercase. 
I think it's not neccessary. But not sure if changing them to uppercase will 
break other tools, which should be considered as a breaking change.

Note that the AccessType field is only used for observerbility.


http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File 
fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/16976/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@454
PS2, Line 454:     // Two row filter policies will be added. The first one 
affects 'user_' and keeps rows
> Add a brief comment describing the row filter policies added.
Done


http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
File 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test@3
PS2, Line 3: # Row-filtering policy keeps rows with "id % 3 = 0".
> Is the expected behaviour that row filters are applied before column masks?
Yes, column masking applies on result data. So the row filtering happens first. 
I mention this in the commit message. Also copied it here for readability.


http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
File 
testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test:

http://gerrit.cloudera.org:8080/#/c/16976/2/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@1
PS2, Line 1: ====
> Can you add a test where the row filter references a column not in the tabl
Good point! But this corner case won't happen, because 'alltypes' will be 
replaced by an inline view and the filter is added inside the view. The 
unresolved column won't be resolved as correlated reference.

To be specifit, let's say table 'alltypes' has a row filter policy of "test_id 
= id". Table 'jointbl' has a column 'test_id' which doesn't exist in 
'alltypes'. The query

  select * from jointbl
  where exists(select * from alltypes)

will be analyzed as

  select * from jointbl
  where exists(select * from
      (select id, bool_col, ..., month from alltypes where test_id = id) v
  )

and will hit an error in resolving 'test_id'.

Added a similar test to cover this.



--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 19 Feb 2021 09:39:13 +0000
Gerrit-HasComments: Yes

Reply via email to