Qifan Chen has posted comments on this change. (
http://gerrit.cloudera.org:8080/18050 )
Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory - v2
......................................................................
Patch Set 14:
Fix three issues as follows.
P15:
Section DISTRIBUTEDPLAN of query:
select
l_orderkey,
sum(l_extendedprice * (1 - l_discount)) as revenue,
o_orderdate,
o_shippriority
from
customer,
orders,
lineitem
where
c_mktsegment = 'BUILDING'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < '1995-03-15'
and l_shipdate > '1995-03-15'
group by
l_orderkey,
o_orderdate,
o_shippriority
order by
revenue desc,
o_orderdate
limit 10
12:MERGING-EXCHANGE [UNPARTITIONED]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|  order by: sum(l_extendedprice * (1 - l_discount)) DESC, o_orderdate ASC
|  limit: 10
|
06:TOP-N [LIMIT=10]
|  order by: sum(l_extendedprice * (1 - l_discount)) DESC, o_orderdate ASC
|  row-size=50B cardinality=10
|
11:AGGREGATE [FINALIZE]
|  output: sum:merge(l_extendedprice * (1 - l_discount))
|  group by: l_orderkey, o_orderdate, o_shippriority
|  row-size=50B cardinality=17.56K
|
10:EXCHANGE [HASH(l_orderkey,o_orderdate,o_shippriority)]
|
05:AGGREGATE [STREAMING]
|  output: sum(l_extendedprice * (1 - l_discount))
|  group by: l_orderkey, o_orderdate, o_shippriority
|  row-size=50B cardinality=17.56K
|
04:HASH JOIN [INNER JOIN, PARTITIONED]
|  hash predicates: o_custkey = c_custkey
|  runtime filters: RF000 <- c_custkey
|  row-size=0B cardinality=17.56K
|
|--09:EXCHANGE [HASH(c_custkey)]
|  |
|  00:SCAN HDFS [tpch.customer]
|     HDFS partitions=1/1 files=1 size=23.08MB
|     predicates: c_mktsegment = 'BUILDING'
|     row-size=0B cardinality=30.00K
|
08:EXCHANGE [HASH(o_custkey)]  <==== This exchange is extra. Fixed by copying
avgRowSize in PlanNode: an exchange node above a scan should see the scan's
average row size (28.9974), but it saw 0 instead.
|
03:HASH JOIN [INNER JOIN, BROADCAST]
|  hash predicates: l_orderkey = o_orderkey
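As a rough illustration of the P15 fix, the sketch below shows an exchange node taking its average row size from the node beneath it instead of starting at 0. The classes here (AvgRowSizeSketch, Node, exchangeAbove) are hypothetical stand-ins invented for this example, not the actual PlanNode or ExchangeNode code in the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for planner nodes (not Impala's classes).
final class AvgRowSizeSketch {
  static class Node {
    final String label;
    final List<Node> children = new ArrayList<>();
    // Average row size in bytes; 0 means it was never computed or copied.
    float avgRowSize;

    Node(String label, float avgRowSize) {
      this.label = label;
      this.avgRowSize = avgRowSize;
    }
  }

  // Builds an exchange on top of 'child' and carries the child's average
  // row size over, so the exchange does not report a row size of 0.
  static Node exchangeAbove(Node child) {
    Node exchange = new Node("EXCHANGE", child.avgRowSize);
    exchange.children.add(child);
    return exchange;
  }

  public static void main(String[] args) {
    // 28.9974 is the scan row size quoted in the comment above.
    Node scan = new Node("SCAN", 28.9974f);
    Node exchange = exchangeAbove(scan);
    System.out.println(exchange.label + " avgRowSize=" + exchange.avgRowSize);
  }
}
```

With the row size carried over, any estimate derived from it for the exchange is no longer based on 0 bytes per row.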
P16:
Query (in tpch-all.test):
use tpch_nested_parquet;
select
o_orderpriority,
count(*) as order_count
from
customer c,
c.c_orders o
where
o_orderdate >= '1993-07-01'
and o_orderdate < '1993-10-01'
and exists (
select
*
from
o.o_lineitems
where
l_commitdate < l_receiptdate
)
group by
o_orderpriority
order by
o_orderpriority
Error Stack:
java.lang.IllegalStateException: Must be analyzed before serializing to thrift.
IsNotEmptyPredicate{id=null, type=INVALID_TYPE, toSql=!empty(c.c_orders),
sel=-1.0, evalCost=-1.0, #distinct=-1}. <==== Resolved by fixing the
IsNotEmptyPredicate.clone() method.
at com.google.common.base.Preconditions.checkState(Preconditions.java:589)
at org.apache.impala.analysis.Expr.treeToThriftHelper(Expr.java:850)
at org.apache.impala.analysis.Expr.treeToThrift(Expr.java:844)
at org.apache.impala.planner.PlanNode.treeToThriftHelper(PlanNode.java:489
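The comment does not spell out what was wrong with the copy method, so the following is only one plausible failure mode, sketched under that assumption: a copy that drops the "analyzed" state and therefore fails the pre-serialization check. EmptinessPredicate and its methods are hypothetical names for this example, not Impala's Expr API.

```java
// Hypothetical stand-in for a predicate that must be analyzed before it
// can be serialized (not Impala's IsNotEmptyPredicate).
final class CloneSketch {
  static class EmptinessPredicate {
    private final String collectionRef;
    private boolean analyzed;

    EmptinessPredicate(String collectionRef) {
      this.collectionRef = collectionRef;
    }

    // Copy constructor: carries the analysis state over to the copy.
    // Assumption: omitting this line is the kind of bug being fixed.
    private EmptinessPredicate(EmptinessPredicate other) {
      this.collectionRef = other.collectionRef;
      this.analyzed = other.analyzed;
    }

    void analyze() {
      analyzed = true;
    }

    EmptinessPredicate copy() {
      return new EmptinessPredicate(this);
    }

    String serialize() {
      if (!analyzed) {
        throw new IllegalStateException("Must be analyzed before serializing.");
      }
      return "!empty(" + collectionRef + ")";
    }
  }

  public static void main(String[] args) {
    EmptinessPredicate pred = new EmptinessPredicate("c.c_orders");
    pred.analyze();
    // The copy of an analyzed predicate serializes without throwing.
    System.out.println(pred.copy().serialize());
  }
}
```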
P17:
select * from (
select int_col, bigint_col, smallint_col,
rank() over (partition by int_col order by smallint_col desc) rk
from functional.alltypesagg) dt
where rk <= 10
order by int_col, bigint_col, smallint_col, rk
limit 10;
Error Stack:
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 <==== Resolved by
copying the embedded ancestor analyticEvalNode_ only once.
at java.util.ArrayList.rangeCheck(ArrayList.java:659)
at java.util.ArrayList.set(ArrayList.java:450)
at org.apache.impala.common.TreeNode.setChild(TreeNode.java:50)
at org.apache.impala.planner.DistributedPlanner.createAnalyticFragment(DistributedPlanner.java:1149)
at org.apache.impala.planner.DistributedPlanner.createPlanFragments(DistributedPlanner.java:137)
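The trace shows setChild() reaching ArrayList.set(0, ...) on an empty child list. The sketch below reproduces that mechanism and shows a copy-once holder that keeps the children on the single copy it makes. Node, Holder, and copyOnce are hypothetical names; how the patch actually copies analyticEvalNode_ is not shown in this comment, so treat this as an illustration of the idea only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified tree node and copy-once holder (not Impala's classes).
final class CopyOnceSketch {
  static class Node {
    final String label;
    final List<Node> children = new ArrayList<>();

    Node(String label) {
      this.label = label;
    }

    // Like the setChild() in the trace: throws IndexOutOfBoundsException
    // when the node has no child at index i.
    void setChild(int i, Node n) {
      children.set(i, n);
    }
  }

  static class Holder {
    private final Node embedded;
    private Node copy; // created at most once, then reused

    Holder(Node embedded) {
      this.embedded = embedded;
    }

    // Returns the same copy on every call; the copy keeps the children,
    // so a later setChild(0, ...) has a slot to replace.
    Node copyOnce() {
      if (copy == null) {
        copy = new Node(embedded.label);
        copy.children.addAll(embedded.children);
      }
      return copy;
    }
  }

  public static void main(String[] args) {
    Node analytic = new Node("ANALYTIC");
    analytic.children.add(new Node("SORT"));
    Holder holder = new Holder(analytic);
    holder.copyOnce().setChild(0, new Node("EXCHANGE"));
    System.out.println(holder.copyOnce().children.get(0).label);
  }
}
```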
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8a31a574b364f39b049a4bae33a8b98c5fc20bd
Gerrit-Change-Number: 18050
Gerrit-PatchSet: 14
Gerrit-Owner: Qifan Chen
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Qifan Chen
Gerrit-Comment-Date: Fri, 10 Dec 2021 20:29:27 +0000
Gerrit-HasComments: No