Michael Ho created IMPALA-6237: ---------------------------------- Summary: Mismatch in plannerTest.testJoins output Key: IMPALA-6237 URL: https://issues.apache.org/jira/browse/IMPALA-6237 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 2.11.0 Reporter: Michael Ho Priority: Blocker
PlannerTest started failing quite consistently quite recently. The followings are some of the mismatch between the expected and actual outputs: [~tianyiwang], [~alex.behm], any chance this may be related to recent changes in the planner ? {noformat} Error Message Section DISTRIBUTEDPLAN of query: select * from functional.alltypesagg a full outer join functional.alltypessmall b using (id, int_col) right join functional.alltypesaggnonulls c on (a.id = c.id and b.string_col = c.string_col) where a.day >= 6 and b.month > 2 and c.day < 3 and a.tinyint_col = 15 and b.string_col = '15' and a.tinyint_col + b.tinyint_col < 15 and a.float_col - c.double_col < 0 and (b.double_col * c.tinyint_col > 1000 or c.tinyint_col < 1000) Actual does not match expected result: PLAN-ROOT SINK | 09:EXCHANGE [UNPARTITIONED] | 04:HASH JOIN [LEFT OUTER JOIN, PARTITIONED] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | hash predicates: c.id = a.id, c.string_col = b.string_col | other predicates: a.tinyint_col = 15, b.string_col = '15', a.day >= 6, b.month > 2, a.float_col - c.double_col < 0, a.tinyint_col + b.tinyint_col < 15, (b.double_col * c.tinyint_col > 1000 OR c.tinyint_col < 1000) | |--08:EXCHANGE [HASH(a.id,b.string_col)] | | | 03:HASH JOIN [FULL OUTER JOIN, PARTITIONED] | | hash predicates: a.id = b.id, a.int_col = b.int_col | | | |--06:EXCHANGE [HASH(b.id,b.int_col)] | | | | | 01:SCAN HDFS [functional.alltypessmall b] | | partitions=2/4 files=2 size=3.17KB | | predicates: b.string_col = '15' | | | 05:EXCHANGE [HASH(a.id,a.int_col)] | | | 00:SCAN HDFS [functional.alltypesagg a] | partitions=5/11 files=5 size=372.38KB | predicates: a.tinyint_col = 15 | 07:EXCHANGE [HASH(c.id,c.string_col)] | 02:SCAN HDFS [functional.alltypesaggnonulls c] partitions=2/10 files=2 size=148.10KB Expected: PLAN-ROOT SINK | 09:EXCHANGE [UNPARTITIONED] | 04:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED] | hash predicates: a.id = c.id, b.string_col = c.string_col | other predicates: a.tinyint_col = 15, b.string_col = '15', a.day >= 6, b.month > 2, a.float_col - c.double_col < 0, a.tinyint_col + b.tinyint_col < 15, (b.double_col * c.tinyint_col > 1000 OR c.tinyint_col < 1000) | runtime filters: RF000 <- c.id, RF001 <- c.string_col | |--08:EXCHANGE [HASH(c.id,c.string_col)] | | | 02:SCAN HDFS [functional.alltypesaggnonulls c] | partitions=2/10 files=2 size=148.10KB | 07:EXCHANGE [HASH(a.id,b.string_col)] | 03:HASH JOIN [FULL OUTER JOIN, PARTITIONED] | hash predicates: a.id = b.id, a.int_col = b.int_col | |--06:EXCHANGE [HASH(b.id,b.int_col)] | | | 01:SCAN HDFS [functional.alltypessmall b] | partitions=2/4 files=2 size=3.17KB | predicates: b.string_col = '15' | runtime filters: RF001 -> b.string_col | 05:EXCHANGE [HASH(a.id,a.int_col)] | 00:SCAN HDFS [functional.alltypesagg a] partitions=5/11 files=5 size=372.38KB predicates: a.tinyint_col = 15 runtime filters: RF000 -> a.id Verbose plan: F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 Per-Host Resources: mem-estimate=0B mem-reservation=0B PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 09:EXCHANGE [UNPARTITIONED] mem-estimate=0B mem-reservation=0B tuple-ids=2,0N,1N row-size=303B cardinality=2000 F04:PLAN FRAGMENT [HASH(c.id,c.string_col)] hosts=2 instances=2 Per-Host Resources: mem-estimate=1.94MB mem-reservation=1.94MB DATASTREAM SINK [FRAGMENT=F05, EXCHANGE=09, UNPARTITIONED] | mem-estimate=0B mem-reservation=0B 04:HASH JOIN [LEFT OUTER JOIN, PARTITIONED] | hash predicates: c.id = a.id, c.string_col = b.string_col | fk/pk conjuncts: c.id = a.id | other predicates: a.tinyint_col = 15, b.string_col = '15', a.day >= 6, b.month > 2, a.float_col - c.double_col < 0, a.tinyint_col + b.tinyint_col < 15, (b.double_col * c.tinyint_col > 1000 OR c.tinyint_col < 1000) | mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB | tuple-ids=2,0N,1N row-size=303B cardinality=2000 | |--08:EXCHANGE [HASH(a.id,b.string_col)] | mem-estimate=0B mem-reservation=0B | tuple-ids=0N,1N row-size=200B cardinality=561 | 07:EXCHANGE [HASH(c.id,c.string_col)] mem-estimate=0B mem-reservation=0B tuple-ids=2 row-size=103B cardinality=2000 F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B DATASTREAM SINK [FRAGMENT=F04, EXCHANGE=07, HASH(c.id,c.string_col)] | mem-estimate=0B mem-reservation=0B 02:SCAN HDFS [functional.alltypesaggnonulls c, RANDOM] partitions=2/10 files=2 size=148.10KB stats-rows=2000 extrapolated-rows=disabled table stats: rows=10000 size=744.82KB column stats: all mem-estimate=32.00MB mem-reservation=0B tuple-ids=2 row-size=103B cardinality=2000 F03:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=1.94MB mem-reservation=1.94MB DATASTREAM SINK [FRAGMENT=F04, EXCHANGE=08, HASH(a.id,b.string_col)] | mem-estimate=0B mem-reservation=0B 03:HASH JOIN [FULL OUTER JOIN, PARTITIONED] | hash predicates: a.id = b.id, a.int_col = b.int_col | fk/pk conjuncts: a.id = b.id, a.int_col = b.int_col | mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB | tuple-ids=0N,1N row-size=200B cardinality=561 | |--06:EXCHANGE [HASH(b.id,b.int_col)] | mem-estimate=0B mem-reservation=0B | tuple-ids=1 row-size=97B cardinality=5 | 05:EXCHANGE [HASH(a.id,a.int_col)] mem-estimate=0B mem-reservation=0B tuple-ids=0 row-size=103B cardinality=556 F01:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=48.00MB mem-reservation=0B DATASTREAM SINK [FRAGMENT=F03, EXCHANGE=05, HASH(a.id,a.int_col)] | mem-estimate=0B mem-reservation=0B 00:SCAN HDFS [functional.alltypesagg a, RANDOM] partitions=5/11 files=5 size=372.38KB predicates: a.tinyint_col = 15 stats-rows=5000 extrapolated-rows=disabled table stats: rows=11000 size=814.73KB column stats: all parquet dictionary predicates: a.tinyint_col = 15 mem-estimate=48.00MB mem-reservation=0B tuple-ids=0 row-size=103B cardinality=556 F02:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B DATASTREAM SINK [FRAGMENT=F03, EXCHANGE=06, HASH(b.id,b.int_col)] | mem-estimate=0B mem-reservation=0B 01:SCAN HDFS [functional.alltypessmall b, RANDOM] partitions=2/4 files=2 size=3.17KB predicates: b.string_col = '15' stats-rows=50 extrapolated-rows=disabled table stats: rows=100 size=6.32KB column stats: all parquet dictionary predicates: b.string_col = '15' mem-estimate=32.00MB mem-reservation=0B tuple-ids=1 row-size=97B cardinality=5 Stacktrace java.lang.AssertionError: Section DISTRIBUTEDPLAN of query: select * from functional.alltypesagg a full outer join functional.alltypessmall b using (id, int_col) right join functional.alltypesaggnonulls c on (a.id = c.id and b.string_col = c.string_col) where a.day >= 6 and b.month > 2 and c.day < 3 and a.tinyint_col = 15 and b.string_col = '15' and a.tinyint_col + b.tinyint_col < 15 and a.float_col - c.double_col < 0 and (b.double_col * c.tinyint_col > 1000 or c.tinyint_col < 1000) Actual does not match expected result: PLAN-ROOT SINK | 09:EXCHANGE [UNPARTITIONED] | 04:HASH JOIN [LEFT OUTER JOIN, PARTITIONED] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | hash predicates: c.id = a.id, c.string_col = b.string_col | other predicates: a.tinyint_col = 15, b.string_col = '15', a.day >= 6, b.month > 2, a.float_col - c.double_col < 0, a.tinyint_col + b.tinyint_col < 15, (b.double_col * c.tinyint_col > 1000 OR c.tinyint_col < 1000) | |--08:EXCHANGE [HASH(a.id,b.string_col)] | | | 03:HASH JOIN [FULL OUTER JOIN, PARTITIONED] | | hash predicates: a.id = b.id, a.int_col = b.int_col | | | |--06:EXCHANGE [HASH(b.id,b.int_col)] | | | | | 01:SCAN HDFS [functional.alltypessmall b] | | partitions=2/4 files=2 size=3.17KB | | predicates: b.string_col = '15' | | | 05:EXCHANGE [HASH(a.id,a.int_col)] | | | 00:SCAN HDFS [functional.alltypesagg a] | partitions=5/11 files=5 size=372.38KB | predicates: a.tinyint_col = 15 | 07:EXCHANGE [HASH(c.id,c.string_col)] | 02:SCAN HDFS [functional.alltypesaggnonulls c] partitions=2/10 files=2 size=148.10KB Expected: PLAN-ROOT SINK | 09:EXCHANGE [UNPARTITIONED] | 04:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED] | hash predicates: a.id = c.id, b.string_col = c.string_col | other predicates: a.tinyint_col = 15, b.string_col = '15', a.day >= 6, b.month > 2, a.float_col - c.double_col < 0, a.tinyint_col + b.tinyint_col < 15, (b.double_col * c.tinyint_col > 1000 OR c.tinyint_col < 1000) | runtime filters: RF000 <- c.id, RF001 <- c.string_col | |--08:EXCHANGE [HASH(c.id,c.string_col)] | | | 02:SCAN HDFS [functional.alltypesaggnonulls c] | partitions=2/10 files=2 size=148.10KB | 07:EXCHANGE [HASH(a.id,b.string_col)] | 03:HASH JOIN [FULL OUTER JOIN, PARTITIONED] | hash predicates: a.id = b.id, a.int_col = b.int_col | |--06:EXCHANGE [HASH(b.id,b.int_col)] | | | 01:SCAN HDFS [functional.alltypessmall b] | partitions=2/4 files=2 size=3.17KB | predicates: b.string_col = '15' | runtime filters: RF001 -> b.string_col | 05:EXCHANGE [HASH(a.id,a.int_col)] | 00:SCAN HDFS [functional.alltypesagg a] partitions=5/11 files=5 size=372.38KB predicates: a.tinyint_col = 15 runtime filters: RF000 -> a.id Verbose plan: F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 Per-Host Resources: mem-estimate=0B mem-reservation=0B PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 09:EXCHANGE [UNPARTITIONED] mem-estimate=0B mem-reservation=0B tuple-ids=2,0N,1N row-size=303B cardinality=2000 F04:PLAN FRAGMENT [HASH(c.id,c.string_col)] hosts=2 instances=2 Per-Host Resources: mem-estimate=1.94MB mem-reservation=1.94MB DATASTREAM SINK [FRAGMENT=F05, EXCHANGE=09, UNPARTITIONED] | mem-estimate=0B mem-reservation=0B 04:HASH JOIN [LEFT OUTER JOIN, PARTITIONED] | hash predicates: c.id = a.id, c.string_col = b.string_col | fk/pk conjuncts: c.id = a.id | other predicates: a.tinyint_col = 15, b.string_col = '15', a.day >= 6, b.month > 2, a.float_col - c.double_col < 0, a.tinyint_col + b.tinyint_col < 15, (b.double_col * c.tinyint_col > 1000 OR c.tinyint_col < 1000) | mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB | tuple-ids=2,0N,1N row-size=303B cardinality=2000 | |--08:EXCHANGE [HASH(a.id,b.string_col)] | mem-estimate=0B mem-reservation=0B | tuple-ids=0N,1N row-size=200B cardinality=561 | 07:EXCHANGE [HASH(c.id,c.string_col)] mem-estimate=0B mem-reservation=0B tuple-ids=2 row-size=103B cardinality=2000 F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B DATASTREAM SINK [FRAGMENT=F04, EXCHANGE=07, HASH(c.id,c.string_col)] | mem-estimate=0B mem-reservation=0B 02:SCAN HDFS [functional.alltypesaggnonulls c, RANDOM] partitions=2/10 files=2 size=148.10KB stats-rows=2000 extrapolated-rows=disabled table stats: rows=10000 size=744.82KB column stats: all mem-estimate=32.00MB mem-reservation=0B tuple-ids=2 row-size=103B cardinality=2000 F03:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=1.94MB mem-reservation=1.94MB DATASTREAM SINK [FRAGMENT=F04, EXCHANGE=08, HASH(a.id,b.string_col)] | mem-estimate=0B mem-reservation=0B 03:HASH JOIN [FULL OUTER JOIN, PARTITIONED] | hash predicates: a.id = b.id, a.int_col = b.int_col | fk/pk conjuncts: a.id = b.id, a.int_col = b.int_col | mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB | tuple-ids=0N,1N row-size=200B cardinality=561 | |--06:EXCHANGE [HASH(b.id,b.int_col)] | mem-estimate=0B mem-reservation=0B | tuple-ids=1 row-size=97B cardinality=5 | 05:EXCHANGE [HASH(a.id,a.int_col)] mem-estimate=0B mem-reservation=0B tuple-ids=0 row-size=103B cardinality=556 F01:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=48.00MB mem-reservation=0B DATASTREAM SINK [FRAGMENT=F03, EXCHANGE=05, HASH(a.id,a.int_col)] | mem-estimate=0B mem-reservation=0B 00:SCAN HDFS [functional.alltypesagg a, RANDOM] partitions=5/11 files=5 size=372.38KB predicates: a.tinyint_col = 15 stats-rows=5000 extrapolated-rows=disabled table stats: rows=11000 size=814.73KB column stats: all parquet dictionary predicates: a.tinyint_col = 15 mem-estimate=48.00MB mem-reservation=0B tuple-ids=0 row-size=103B cardinality=556 F02:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B DATASTREAM SINK [FRAGMENT=F03, EXCHANGE=06, HASH(b.id,b.int_col)] | mem-estimate=0B mem-reservation=0B 01:SCAN HDFS [functional.alltypessmall b, RANDOM] partitions=2/4 files=2 size=3.17KB predicates: b.string_col = '15' stats-rows=50 extrapolated-rows=disabled table stats: rows=100 size=6.32KB column stats: all parquet dictionary predicates: b.string_col = '15' mem-estimate=32.00MB mem-reservation=0B tuple-ids=1 row-size=97B cardinality=5 at org.junit.Assert.fail(Assert.java:88) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:770) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:775) at org.apache.impala.planner.PlannerTest.testJoins(PlannerTest.java:127) {noformat} {noformat} Error Message Section DISTRIBUTEDPLAN of query: select sum(l_extendedprice) / 7.0 as avg_yearly from customer.c_orders.o_lineitems l, part p where p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' and l_quantity < ( select 0.2 * avg(l_quantity) from customer.c_orders.o_lineitems l where l_partkey = p_partkey ) Actual does not match expected result: PLAN-ROOT SINK | 12:AGGREGATE [FINALIZE] | output: sum:merge(l_extendedprice) | 11:EXCHANGE [UNPARTITIONED] | 06:AGGREGATE | output: sum(l_extendedprice) | 05:HASH JOIN [LEFT SEMI JOIN, BROADCAST] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | hash predicates: p_partkey = l_partkey | other join predicates: l_quantity < 0.2 * avg(l_quantity) | runtime filters: RF000 <- l_partkey | |--10:EXCHANGE [BROADCAST] | | | 09:AGGREGATE [FINALIZE] | | output: avg:merge(l_quantity) | | group by: l_partkey | | | 08:EXCHANGE [HASH(l_partkey)] | | | 03:AGGREGATE [STREAMING] | | output: avg(l_quantity) | | group by: l_partkey | | | 02:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l] | partitions=1/1 files=4 size=292.36MB | 04:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l_partkey = p_partkey | runtime filters: RF002 <- p_partkey | |--07:EXCHANGE [BROADCAST] | | | 01:SCAN HDFS [tpch_nested_parquet.part p] | partitions=1/1 files=1 size=6.23MB | predicates: p_container = 'MED BOX', p_brand = 'Brand#23' | runtime filters: RF000 -> p_partkey | 00:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l] partitions=1/1 files=4 size=292.36MB runtime filters: RF000 -> l.l_partkey, RF002 -> l_partkey Expected: PLAN-ROOT SINK | 12:AGGREGATE [FINALIZE] | output: sum:merge(l_extendedprice) | 11:EXCHANGE [UNPARTITIONED] | 06:AGGREGATE | output: sum(l_extendedprice) | 05:HASH JOIN [LEFT SEMI JOIN, PARTITIONED] | hash predicates: p_partkey = l_partkey | other join predicates: l_quantity < 0.2 * avg(l_quantity) | runtime filters: RF000 <- l_partkey | |--09:AGGREGATE [FINALIZE] | | output: avg:merge(l_quantity) | | group by: l_partkey | | | 08:EXCHANGE [HASH(l_partkey)] | | | 03:AGGREGATE [STREAMING] | | output: avg(l_quantity) | | group by: l_partkey | | | 02:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l] | partitions=1/1 files=4 size=292.36MB | 10:EXCHANGE [HASH(p_partkey)] | 04:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l_partkey = p_partkey | runtime filters: RF002 <- p_partkey | |--07:EXCHANGE [BROADCAST] | | | 01:SCAN HDFS [tpch_nested_parquet.part p] | partitions=1/1 files=1 size=6.24MB | predicates: p_container = 'MED BOX', p_brand = 'Brand#23' | runtime filters: RF000 -> p_partkey | 00:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l] partitions=1/1 files=4 size=292.36MB runtime filters: RF000 -> l.l_partkey, RF002 -> l_partkey Verbose plan: F04:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 Per-Host Resources: mem-estimate=10.00MB mem-reservation=0B PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 12:AGGREGATE [FINALIZE] | output: sum:merge(l_extendedprice) | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB | tuple-ids=6 row-size=16B cardinality=1 | 11:EXCHANGE [UNPARTITIONED] mem-estimate=0B mem-reservation=0B tuple-ids=6 row-size=16B cardinality=1 F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=527.71MB mem-reservation=35.94MB DATASTREAM SINK [FRAGMENT=F04, EXCHANGE=11, UNPARTITIONED] | mem-estimate=0B mem-reservation=0B 06:AGGREGATE | output: sum(l_extendedprice) | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB | tuple-ids=6 row-size=16B cardinality=1 | 05:HASH JOIN [LEFT SEMI JOIN, BROADCAST] | hash predicates: p_partkey = l_partkey | other join predicates: l_quantity < 0.2 * avg(l_quantity) | runtime filters: RF000[bloom] <- l_partkey | mem-estimate=251.77MB mem-reservation=34.00MB spill-buffer=2.00MB | tuple-ids=0,1 row-size=80B cardinality=15000000 | |--10:EXCHANGE [BROADCAST] | mem-estimate=0B mem-reservation=0B | tuple-ids=4 row-size=16B cardinality=15000000 | 04:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l_partkey = p_partkey | fk/pk conjuncts: assumed fk/pk | runtime filters: RF002[bloom] <- p_partkey | mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB | tuple-ids=0,1 row-size=80B cardinality=15000000 | |--07:EXCHANGE [BROADCAST] | mem-estimate=0B mem-reservation=0B | tuple-ids=1 row-size=56B cardinality=1000 | 00:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l, RANDOM] partitions=1/1 files=4 size=292.36MB runtime filters: RF000[bloom] -> l.l_partkey, RF002[bloom] -> l_partkey stats-rows=150000 extrapolated-rows=disabled table stats: rows=150000 size=292.36MB column stats: all mem-estimate=264.00MB mem-reservation=0B tuple-ids=0 row-size=24B cardinality=15000000 F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1 Per-Host Resources: mem-estimate=48.00MB mem-reservation=0B DATASTREAM SINK [FRAGMENT=F00, EXCHANGE=07, BROADCAST] | mem-estimate=0B mem-reservation=0B 01:SCAN HDFS [tpch_nested_parquet.part p, RANDOM] partitions=1/1 files=1 size=6.23MB predicates: p_container = 'MED BOX', p_brand = 'Brand#23' runtime filters: RF000[bloom] -> p_partkey stats-rows=200000 extrapolated-rows=disabled table stats: rows=200000 size=6.23MB column stats: all parquet statistics predicates: p_container = 'MED BOX', p_brand = 'Brand#23' parquet dictionary predicates: p_container = 'MED BOX', p_brand = 'Brand#23' mem-estimate=48.00MB mem-reservation=0B tuple-ids=1 row-size=56B cardinality=1000 F03:PLAN FRAGMENT [HASH(l_partkey)] hosts=2 instances=2 Per-Host Resources: mem-estimate=128.00MB mem-reservation=34.00MB DATASTREAM SINK [FRAGMENT=F00, EXCHANGE=10, BROADCAST] | mem-estimate=0B mem-reservation=0B 09:AGGREGATE [FINALIZE] | output: avg:merge(l_quantity) | group by: l_partkey | mem-estimate=128.00MB mem-reservation=34.00MB spill-buffer=2.00MB | tuple-ids=4 row-size=16B cardinality=15000000 | 08:EXCHANGE [HASH(l_partkey)] mem-estimate=0B mem-reservation=0B tuple-ids=3 row-size=16B cardinality=15000000 F02:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=304.00MB mem-reservation=34.00MB DATASTREAM SINK [FRAGMENT=F03, EXCHANGE=08, HASH(l_partkey)] | mem-estimate=0B mem-reservation=0B 03:AGGREGATE [STREAMING] | output: avg(l_quantity) | group by: l_partkey | mem-estimate=128.00MB mem-reservation=34.00MB spill-buffer=2.00MB | tuple-ids=3 row-size=16B cardinality=15000000 | 02:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l, RANDOM] partitions=1/1 files=4 size=292.36MB stats-rows=150000 extrapolated-rows=disabled table stats: rows=150000 size=292.36MB column stats: all mem-estimate=176.00MB mem-reservation=0B tuple-ids=2 row-size=16B cardinality=15000000 Stacktrace java.lang.AssertionError: Section DISTRIBUTEDPLAN of query: select sum(l_extendedprice) / 7.0 as avg_yearly from customer.c_orders.o_lineitems l, part p where p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' and l_quantity < ( select 0.2 * avg(l_quantity) from customer.c_orders.o_lineitems l where l_partkey = p_partkey ) Actual does not match expected result: PLAN-ROOT SINK | 12:AGGREGATE [FINALIZE] | output: sum:merge(l_extendedprice) | 11:EXCHANGE [UNPARTITIONED] | 06:AGGREGATE | output: sum(l_extendedprice) | 05:HASH JOIN [LEFT SEMI JOIN, BROADCAST] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | hash predicates: p_partkey = l_partkey | other join predicates: l_quantity < 0.2 * avg(l_quantity) | runtime filters: RF000 <- l_partkey | |--10:EXCHANGE [BROADCAST] | | | 09:AGGREGATE [FINALIZE] | | output: avg:merge(l_quantity) | | group by: l_partkey | | | 08:EXCHANGE [HASH(l_partkey)] | | | 03:AGGREGATE [STREAMING] | | output: avg(l_quantity) | | group by: l_partkey | | | 02:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l] | partitions=1/1 files=4 size=292.36MB | 04:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l_partkey = p_partkey | runtime filters: RF002 <- p_partkey | |--07:EXCHANGE [BROADCAST] | | | 01:SCAN HDFS [tpch_nested_parquet.part p] | partitions=1/1 files=1 size=6.23MB | predicates: p_container = 'MED BOX', p_brand = 'Brand#23' | runtime filters: RF000 -> p_partkey | 00:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l] partitions=1/1 files=4 size=292.36MB runtime filters: RF000 -> l.l_partkey, RF002 -> l_partkey Expected: PLAN-ROOT SINK | 12:AGGREGATE [FINALIZE] | output: sum:merge(l_extendedprice) | 11:EXCHANGE [UNPARTITIONED] | 06:AGGREGATE | output: sum(l_extendedprice) | 05:HASH JOIN [LEFT SEMI JOIN, PARTITIONED] | hash predicates: p_partkey = l_partkey | other join predicates: l_quantity < 0.2 * avg(l_quantity) | runtime filters: RF000 <- l_partkey | |--09:AGGREGATE [FINALIZE] | | output: avg:merge(l_quantity) | | group by: l_partkey | | | 08:EXCHANGE [HASH(l_partkey)] | | | 03:AGGREGATE [STREAMING] | | output: avg(l_quantity) | | group by: l_partkey | | | 02:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l] | partitions=1/1 files=4 size=292.36MB | 10:EXCHANGE [HASH(p_partkey)] | 04:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l_partkey = p_partkey | runtime filters: RF002 <- p_partkey | |--07:EXCHANGE [BROADCAST] | | | 01:SCAN HDFS [tpch_nested_parquet.part p] | partitions=1/1 files=1 size=6.24MB | predicates: p_container = 'MED BOX', p_brand = 'Brand#23' | runtime filters: RF000 -> p_partkey | 00:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l] partitions=1/1 files=4 size=292.36MB runtime filters: RF000 -> l.l_partkey, RF002 -> l_partkey Verbose plan: F04:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 Per-Host Resources: mem-estimate=10.00MB mem-reservation=0B PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B | 12:AGGREGATE [FINALIZE] | output: sum:merge(l_extendedprice) | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB | tuple-ids=6 row-size=16B cardinality=1 | 11:EXCHANGE [UNPARTITIONED] mem-estimate=0B mem-reservation=0B tuple-ids=6 row-size=16B cardinality=1 F00:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=527.71MB mem-reservation=35.94MB DATASTREAM SINK [FRAGMENT=F04, EXCHANGE=11, UNPARTITIONED] | mem-estimate=0B mem-reservation=0B 06:AGGREGATE | output: sum(l_extendedprice) | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB | tuple-ids=6 row-size=16B cardinality=1 | 05:HASH JOIN [LEFT SEMI JOIN, BROADCAST] | hash predicates: p_partkey = l_partkey | other join predicates: l_quantity < 0.2 * avg(l_quantity) | runtime filters: RF000[bloom] <- l_partkey | mem-estimate=251.77MB mem-reservation=34.00MB spill-buffer=2.00MB | tuple-ids=0,1 row-size=80B cardinality=15000000 | |--10:EXCHANGE [BROADCAST] | mem-estimate=0B mem-reservation=0B | tuple-ids=4 row-size=16B cardinality=15000000 | 04:HASH JOIN [INNER JOIN, BROADCAST] | hash predicates: l_partkey = p_partkey | fk/pk conjuncts: assumed fk/pk | runtime filters: RF002[bloom] <- p_partkey | mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB | tuple-ids=0,1 row-size=80B cardinality=15000000 | |--07:EXCHANGE [BROADCAST] | mem-estimate=0B mem-reservation=0B | tuple-ids=1 row-size=56B cardinality=1000 | 00:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l, RANDOM] partitions=1/1 files=4 size=292.36MB runtime filters: RF000[bloom] -> l.l_partkey, RF002[bloom] -> l_partkey stats-rows=150000 extrapolated-rows=disabled table stats: rows=150000 size=292.36MB column stats: all mem-estimate=264.00MB mem-reservation=0B tuple-ids=0 row-size=24B cardinality=15000000 F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1 Per-Host Resources: mem-estimate=48.00MB mem-reservation=0B DATASTREAM SINK [FRAGMENT=F00, EXCHANGE=07, BROADCAST] | mem-estimate=0B mem-reservation=0B 01:SCAN HDFS [tpch_nested_parquet.part p, RANDOM] partitions=1/1 files=1 size=6.23MB predicates: p_container = 'MED BOX', p_brand = 'Brand#23' runtime filters: RF000[bloom] -> p_partkey stats-rows=200000 extrapolated-rows=disabled table stats: rows=200000 size=6.23MB column stats: all parquet statistics predicates: p_container = 'MED BOX', p_brand = 'Brand#23' parquet dictionary predicates: p_container = 'MED BOX', p_brand = 'Brand#23' mem-estimate=48.00MB mem-reservation=0B tuple-ids=1 row-size=56B cardinality=1000 F03:PLAN FRAGMENT [HASH(l_partkey)] hosts=2 instances=2 Per-Host Resources: mem-estimate=128.00MB mem-reservation=34.00MB DATASTREAM SINK [FRAGMENT=F00, EXCHANGE=10, BROADCAST] | mem-estimate=0B mem-reservation=0B 09:AGGREGATE [FINALIZE] | output: avg:merge(l_quantity) | group by: l_partkey | mem-estimate=128.00MB mem-reservation=34.00MB spill-buffer=2.00MB | tuple-ids=4 row-size=16B cardinality=15000000 | 08:EXCHANGE [HASH(l_partkey)] mem-estimate=0B mem-reservation=0B tuple-ids=3 row-size=16B cardinality=15000000 F02:PLAN FRAGMENT [RANDOM] hosts=2 instances=2 Per-Host Resources: mem-estimate=304.00MB mem-reservation=34.00MB DATASTREAM SINK [FRAGMENT=F03, EXCHANGE=08, HASH(l_partkey)] | mem-estimate=0B mem-reservation=0B 03:AGGREGATE [STREAMING] | output: avg(l_quantity) | group by: l_partkey | mem-estimate=128.00MB mem-reservation=34.00MB spill-buffer=2.00MB | tuple-ids=3 row-size=16B cardinality=15000000 | 02:SCAN HDFS [tpch_nested_parquet.customer.c_orders.o_lineitems l, RANDOM] partitions=1/1 files=4 size=292.36MB stats-rows=150000 extrapolated-rows=disabled table stats: rows=150000 size=292.36MB column stats: all mem-estimate=176.00MB mem-reservation=0B tuple-ids=2 row-size=16B cardinality=15000000 at org.junit.Assert.fail(Assert.java:88) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:770) at org.apache.impala.planner.PlannerTestBase.runPlannerTestFile(PlannerTestBase.java:779) at org.apache.impala.planner.PlannerTest.testTpchNested(PlannerTest.java:245) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)