Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17199 )

Change subject: IMPALA-9661: Avoid introducing unused columns in table masking 
view
......................................................................


Patch Set 3:

(7 comments)

Thank for the quick review! Added more comments.

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@491
PS3, Line 491: no user-visible effect
> Is the 'no user-visible effect' accurate though ? The plan in the query pro
Yeah, you are right. These are old comments. We should update them. I think it 
means the results will have the same types and labels as the original query 
regardless how we rewrite it. Even without table masking, the query plan may 
change after rewrite. So I think 'no user-visible effect' just means the query 
results.

Updated the comment.


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@541
PS3, Line 541:     reAnalyzeWithoutPrivChecks(stmtTableCache, authzCtx, 
origResultTypes, origColLabels);
> Shouldn't this only be called if reAnalyze = true ?  I am also curious why
Yeah, when reAnalyze = false, we return at line 528. We call this at line 506 
because rewriters require the AST is analyzed. Table masking will create new 
InlineViewRefs which are not analyzed. We also need some SlotRefs being 
resolved again to reference output exprs of these InlineViewRefs. Added some 
comments about this.


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@860
PS3, Line 860: column %s
> This says column but it seems the first parameter is resolvedTableRef's raw
Yeah, it's actually a collection column. Here's an example query when the 
resolvedTableRef is a non-relative CollectionTableRef:

 use functional_parquet;
 select item from complextypestbl.int_array;

At the parsing phase, "complextypestbl.int_array" is parsed as a TableRef. 
Before analyze(), we don't know whether it's a table named "int_array" in the 
database named "complextypestbl". After analyze(), the TableRef is resolved. 
It's now a CollectionTableRef and isRelative() returns false. At this case, the 
column name we get is "functional_parquet.complextypestbl.int_array". We have 
test coverage for this in ranger_row_filtering.test.

FWIW, here is a example for a relative CollectionTableRef:

 select a.item from complextypestbl t, t.int_array a;

The TableRef "t.int_array" is a relative CollectionTableRef. It references to 
the complextypestbl table.

Added some comments since I think the "relative" concept is not straighforword 
to understand.


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1326
PS3, Line 1326: names
> nit: this is registering a Column object rather than just the name.
Done


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1340
PS3, Line 1340:       analyzer = analyzer.getParentAnalyzer();
> Since getParentAnalyzer() can return null, we should break out of the loop
Good point! I think we should add assertion here. If we can't find the tblRef, 
it means the slotDesc is illegal.


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@171
PS3, Line 171:   public boolean resolveTableMask(Analyzer analyzer) throws 
AnalysisException {
> Since this is not looking into the orderByElements_, I am wondering if it w
No, we just need to recurse in subqueries so don't need to consider ORDER BY 
yet.


http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@268
PS3, Line 268:   public boolean resolveTableMask(Analyzer analyzer) throws 
AnalysisException {
> Shouldn't this also consider the groupByClause_, havingClause_, aggregate a
We only need to recurse in subqueries to replace all the catalog table/view 
with table masking views. For SlotRefs referencing to the original table refs, 
they will be reset and re-analyzed to reference to the table masking view.



--
To view, visit http://gerrit.cloudera.org:8080/17199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Gerrit-Change-Number: 17199
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Fri, 19 Mar 2021 14:03:38 +0000
Gerrit-HasComments: Yes

Reply via email to