Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/17811 )
Change subject: IMPALA-9498: Allow returning arrays in select list ...................................................................... Patch Set 31: (38 comments) Thanks for the comments, sorry if I have missed any of them. The change is patch 28 vs 30, 31 is just rebase. http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1477 PS24, Line 1477: : > Because of the added line with the null check and return, this precondition Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@713 PS28, Line 713: /** : * Creates an returns an empty TupleDescriptor for the given table ref and registers : > Is this used somewhere? removed it http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@770 PS28, Line 770: * Resolves the given TableRef into a concrete BaseTableRef, ViewRef or > I think this causes an issue that the collection can be incorrectly used in Thanks for finding the bug and the hint about setHidden()! The cause was a bit elsewhere - when registering the CollectionTableRef backing arrays in select lists, the CollectionTableRef registered itself. This means that even the order in the select list mattered: this failed to find "pos": select pos, int_array from functional_parquet.complextypestbl this crashed: select int_array, pos from functional_parquet.complextypestbl Changed the logic to set these tablerefs to hidden, and also had to add similar logic to TupleDescriptors, as in case of int_array.pos, the path resolving logic found the tuple as root. Also added regression tests to AuthorizationStmtTest. http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1395 PS28, Line 1395: collectionTableRawPath.addAll(Arrays.asList(rootDesc.getAlias().split("\\."))); > nit: Preconditions.checkNotNull(rootDesc); Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1405 PS28, Line 1405: SlotDescriptor desc = ((SlotRef) > Is it possible to add a check about this? I wonder if all the code paths go This would be called whenever a slot ref with a path that points to an array is analyzes - currently this can only mean arrays in select list, as no other expression can contain arrays. This can change later, but until now we used "in select list" as a name in array and struct implementation. http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1408 PS28, Line 1408: // Resolve path > nit: this is covered by the check at next line so can be removed. Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1414 PS28, Line 1414: Preconditions.checkState(isResolved); : > nit: we can use Lists.newArrayList("item") and make it a constant. kept it like this for now, but changed it to use a constant for the name. http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1429 PS28, Line 1429: lotPa > nit: don't need 'this' Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@1430 PS28, Line 1430: SlotDescriptor result = addSlotDescriptor(slotPath.getRootDesc()); > Should we only add stats for primitive types? We seem to do this at other places too, regardless of whether stats support the type: https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java#L176 Removed this line, as I think that it is not necessary. http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java File fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java: http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@67 PS24, Line 67: > Nit: unnecessary space. Done http://gerrit.cloudera.org:8080/#/c/17811/25/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java File fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java: http://gerrit.cloudera.org:8080/#/c/17811/25/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@117 PS25, Line 117: > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java File fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java: http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java@67 PS28, Line 67: > I think we should convert this to lower case as well. Done http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/FromClause.java File fe/src/main/java/org/apache/impala/analysis/FromClause.java: http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/FromClause.java@112 PS24, Line 112: throw new AnalysisException( : > Nit: lines commented out. Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java File fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java: http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@198 PS28, Line 198: public void analyze(Analyzer analyzer) throws AnalysisException { > This method is too long now. Can we extract some codes into methods? Moved the logic in the loop to "addColumnToSubstitutionMaps()" http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@277 PS28, Line 277: Analyzer analyzer, String colName, Expr colExpr, Expr baseTableExpr) > nit: this is the same as the if-clause so we can remove it. Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java@286 PS28, Line 286: > typo: immediate Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@392 PS28, Line 392: "currently not all complex-typed exprs " + : "are supported in the select list.\n" + > nit: I think we should update this as well Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@400 PS28, Line 400: if (!arrayType.getItemType().isSupported()) { > Should we check the subtypes recursively here? Do we have test coverage for I don't think that we test for types like that. Though we don't check the nested types recursively here, we do that during SlorRef.analyzeImpl(). Let me think a bit more on how to handle + test this. http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@544 PS28, Line 544: // Gather the inline view substitution maps from the enclosed inline views > Unused? You are right, this remained here from a previous implementation of substitution of arrays from views. http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@a105 PS28, Line 105: > Could you explain why we should remove this? It is done during super(other); http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/TableRef.java File fe/src/main/java/org/apache/impala/analysis/TableRef.java: http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/TableRef.java@417 PS24, Line 417: Pr > Nit: line commented out. Done http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java File fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java: http://gerrit.cloudera.org:8080/#/c/17811/24/fe/src/main/java/org/apache/impala/analysis/UnnestExpr.java@78 PS24, Line 78: throw new AnalysisException( : > Nit: lines commented out. Done http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/main/java/org/apache/impala/authorization/TableMask.java File fe/src/main/java/org/apache/impala/authorization/TableMask.java: http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/main/java/org/apache/impala/authorization/TableMask.java@94 PS27, Line 94: colName.contains(".")) { > nit: can we use colName.contains("\\.")? Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/planner/PlanNode.java@161 PS28, Line 161: /** : * Deferred id_ assignment. : */ : protected PlanNode(String displayName) { : this(null, displayName); : } : : p > It seems this is no longer used. I think we can remove it. Done http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/planner/UnnestNode.java File fe/src/main/java/org/apache/impala/planner/UnnestNode.java: http://gerrit.cloudera.org:8080/#/c/17811/28/fe/src/main/java/org/apache/impala/planner/UnnestNode.java@63 PS28, Line 63: tupleIds_.add(collectionSlotRef.getDesc().getItemTupleDesc().getId()); : tblRefIds_.add(collectionSlotRef.getDesc().getItemTupleDesc().getId()); > nit: we can add the id directly Done http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@296 PS27, Line 296: testCollectionTableRefs("complex_nested_struct_col.f2.f12", "f21"); > Does this fail? Oops, forgot that it was commented out. Fixed the test. http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1042 PS27, Line 1042: "union all select int_array_col from functional.allcomplextypes"); : AnalysisError("select int_array_col, item from functional.allcomplexty > It seems working now. Maybe this is a stale comment. Done http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1045 PS27, Line 1045: AnalysisError("select int_array_col, int_array_col.item " + > Can we add another test like this? Added a similar test in TestCollectionTableRefs http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java File fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java: http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@710 PS27, Line 710: , > nit: need one space Done http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@719 PS27, Line 719: "allcomplextypes", new String[]{"array_array_col"}, TP > nit: I think we can simply use TPrivilegeLevel.SELECT here, because it's mi Done http://gerrit.cloudera.org:8080/#/c/17811/27/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@723 PS27, Line 723: orize("select * from functional.alltypes union all " + > nit: same as above, I think we can simply use TPrivilegeLevel.SELECT. Done http://gerrit.cloudera.org:8080/#/c/17811/24/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test File testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test: http://gerrit.cloudera.org:8080/#/c/17811/24/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@57 PS24, Line 57: ==== : ---- QUERY : # Same collection used from two versions of the same table/ : select t1.id, t1.int_array, t2.int_array : from complextypestbl t1 join complextypestbl t2 : on t1.id = t2.id : ---- RESULTS : 3,'[]','[]' : 5,'NULL','NULL' : 7,'NULL','NULL' : 8,'[-1]','[-1]' : 1,'[1,2,3]','[1,2,3]' : 2,'[NULL,1,2,NULL,3,NULL]','[NULL,1,2,NULL,3,NULL]' : 4,'NULL','NULL' : 6,'NULL','NULL' : ---- TYPES : bi > Nit: query commented out. Done http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test File testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test: http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@105 PS27, Line 105: Changing a column to a different type > I'm confused that this also get rejected: The trick here is that 1 is tinyint, while id is int - this means the union is no longer pass-through http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@187 PS27, Line 187: clause > nit: clause Done http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@198 PS27, Line 198: clause > nit: clause Done http://gerrit.cloudera.org:8080/#/c/17811/28/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test File testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test: http://gerrit.cloudera.org:8080/#/c/17811/28/testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test@36 PS28, Line 36: ==== > Could you add a test about selecting both STRUCT and ARRAY columns? Struct has some limitations (no codegen, only ORC), so I wouldn't add it here. http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test File testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test: http://gerrit.cloudera.org:8080/#/c/17811/27/testdata/workloads/functional-query/queries/QueryTest/zipping-unnest-in-select-list.test@170 PS27, Line 170: # Zipping unnest given in both select list and from clause. > stale comment? Done http://gerrit.cloudera.org:8080/#/c/17811/25/tests/query_test/test_nested_types.py File tests/query_test/test_nested_types.py: http://gerrit.cloudera.org:8080/#/c/17811/25/tests/query_test/test_nested_types.py@153 PS25, Line 153: > flake8: E302 expected 2 blank lines, found 1 Done -- To view, visit http://gerrit.cloudera.org:8080/17811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0 Gerrit-Change-Number: 17811 Gerrit-PatchSet: 31 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Tue, 25 Jan 2022 09:16:43 +0000 Gerrit-HasComments: Yes
