Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17199 )
Change subject: IMPALA-9661: Avoid introducing unused columns in table masking view ...................................................................... Patch Set 3: (7 comments) Thank for the quick review! Added more comments. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java: http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@491 PS3, Line 491: no user-visible effect > Is the 'no user-visible effect' accurate though ? The plan in the query pro Yeah, you are right. These are old comments. We should update them. I think it means the results will have the same types and labels as the original query regardless how we rewrite it. Even without table masking, the query plan may change after rewrite. So I think 'no user-visible effect' just means the query results. Updated the comment. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@541 PS3, Line 541: reAnalyzeWithoutPrivChecks(stmtTableCache, authzCtx, origResultTypes, origColLabels); > Shouldn't this only be called if reAnalyze = true ? I am also curious why Yeah, when reAnalyze = false, we return at line 528. We call this at line 506 because rewriters require the AST is analyzed. Table masking will create new InlineViewRefs which are not analyzed. We also need some SlotRefs being resolved again to reference output exprs of these InlineViewRefs. Added some comments about this. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@860 PS3, Line 860: column %s > This says column but it seems the first parameter is resolvedTableRef's raw Yeah, it's actually a collection column. Here's an example query when the resolvedTableRef is a non-relative CollectionTableRef: use functional_parquet; select item from complextypestbl.int_array; At the parsing phase, "complextypestbl.int_array" is parsed as a TableRef. Before analyze(), we don't know whether it's a table named "int_array" in the database named "complextypestbl". After analyze(), the TableRef is resolved. It's now a CollectionTableRef and isRelative() returns false. At this case, the column name we get is "functional_parquet.complextypestbl.int_array". We have test coverage for this in ranger_row_filtering.test. FWIW, here is a example for a relative CollectionTableRef: select a.item from complextypestbl t, t.int_array a; The TableRef "t.int_array" is a relative CollectionTableRef. It references to the complextypestbl table. Added some comments since I think the "relative" concept is not straighforword to understand. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1326 PS3, Line 1326: names > nit: this is registering a Column object rather than just the name. Done http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1340 PS3, Line 1340: analyzer = analyzer.getParentAnalyzer(); > Since getParentAnalyzer() can return null, we should break out of the loop Good point! I think we should add assertion here. If we can't find the tblRef, it means the slotDesc is illegal. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java: http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@171 PS3, Line 171: public boolean resolveTableMask(Analyzer analyzer) throws AnalysisException { > Since this is not looking into the orderByElements_, I am wondering if it w No, we just need to recurse in subqueries so don't need to consider ORDER BY yet. http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/17199/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@268 PS3, Line 268: public boolean resolveTableMask(Analyzer analyzer) throws AnalysisException { > Shouldn't this also consider the groupByClause_, havingClause_, aggregate a We only need to recurse in subqueries to replace all the catalog table/view with table masking views. For SlotRefs referencing to the original table refs, they will be reset and re-analyzed to reference to the table masking view. -- To view, visit http://gerrit.cloudera.org:8080/17199 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1 Gerrit-Change-Number: 17199 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Fri, 19 Mar 2021 14:03:38 +0000 Gerrit-HasComments: Yes
