Daniel Becker has uploaded a new patch set (#7). (
http://gerrit.cloudera.org:8080/20108 )
Change subject: IMPALA-12159: Support ORDER BY for collections of variable
length types in select list
......................................................................
IMPALA-12159: Support ORDER BY for collections of variable length types in
select list
IMPALA-12019 implemented support for collections of fixed length types
in the sorting tuple. This change implements it for collections of
variable length types.
Note that structs containing any type of collection are still not
allowed in the sorting tuple (see IMPALA-12160).
Note that sorting by complex types was not allowed before and is still
not allowed; this change only allows them to be present in the select
list when sorting by some other expression.
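The rule above can be illustrated with a toy model. This is only a sketch in plain Python, not Impala code: the `order_by` and `is_complex` helpers and the `id` / `arr_string` column names are hypothetical, and Python lists and dicts merely stand in for ARRAY and MAP values.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def is_complex(value: Any) -> bool:
    """Treat Python lists and dicts as stand-ins for ARRAY and MAP values."""
    return isinstance(value, (list, dict))


def order_by(rows: List[Row], key: str) -> List[Row]:
    """Sort rows by a scalar column; collection columns are only carried along."""
    if any(is_complex(row[key]) for row in rows):
        # Mirrors the restriction in the commit message: complex types
        # cannot be used as the sort key itself.
        raise ValueError("sorting by a complex-typed column is not allowed: " + key)
    return sorted(rows, key=lambda row: row[key])


# Hypothetical rows: a scalar 'id' plus a collection of variable length strings.
rows = [
    {"id": 2, "arr_string": ["pear", "fig"]},
    {"id": 1, "arr_string": ["apple"]},
]

# Allowed: the collection is in the "select list", the sort key is scalar.
result = order_by(rows, "id")
```

In this model, calling `order_by(rows, "arr_string")` raises `ValueError`, which corresponds to the case that remains disallowed.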
This change also allows collections of variable length types to be
non-passthrough children of UNION ALL nodes.
Testing:
- Renamed the 'simple_arrays_big' table to 'arrays_big' and extended it
with collections containing variable length types. This table is
mainly used to test that spilling works during sorting.
- Renamed test_sort.py::TestArraySort::{test_simple_arrays,
test_simple_arrays_with_limit} to
{test_array_sort,test_array_sort_with_limit}.
- Extended the tests run in test_queries.py::TestQueries::{test_sort,
test_top_n,test_partitioned_top_n} with collections containing
var-len types.
- Added tests in sort-complex.test that assert that it is not allowed
to sort by collections. For structs we already have such tests in
struct-in-select-list.test.
Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852
---
M be/src/codegen/codegen-anyval-read-write-info.cc
M be/src/codegen/codegen-anyval-read-write-info.h
M be/src/runtime/collection-value.cc
M be/src/runtime/collection-value.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter.cc
M be/src/runtime/tuple-ir.cc
M be/src/runtime/tuple.cc
M be/src/runtime/tuple.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
A testdata/ComplexTypesTbl/arrays_big.parq
D testdata/ComplexTypesTbl/simple_arrays_big.parq
M testdata/data/README
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/nested-array-in-select-list.test
M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test
M testdata/workloads/functional-query/queries/QueryTest/partitioned-top-n-complex.test
M testdata/workloads/functional-query/queries/QueryTest/sort-complex.test
M testdata/workloads/functional-query/queries/QueryTest/top-n-complex.test
M tests/query_test/test_queries.py
M tests/query_test/test_sort.py
31 files changed, 1,367 insertions(+), 582 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/20108/7
--
To view, visit http://gerrit.cloudera.org:8080/20108
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852
Gerrit-Change-Number: 20108
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>