[
https://issues.apache.org/jira/browse/DRILL-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Hou reassigned DRILL-7154:
---------------------------------
Assignee: Hanumath Rao Maduri (was: Boaz Ben-Zvi)
> TPCH query 4, 17 and 18 take longer with sf 1000 when Statistics are disabled
> -----------------------------------------------------------------------------
>
> Key: DRILL-7154
> URL: https://issues.apache.org/jira/browse/DRILL-7154
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.16.0
> Reporter: Robert Hou
> Assignee: Hanumath Rao Maduri
> Priority: Blocker
> Fix For: 1.16.0
>
> Attachments: 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5.sys.drill,
> 235a471b-aa97-bfb5-207d-3f25b4b5fbbb.sys.drill, hashagg.nostats.data.log,
> hashagg.nostats.foreman.log, hashagg.stats.disabled.data.log,
> hashagg.stats.disabled.foreman.log
>
>
> Here is TPCH 04 with sf 1000:
> {noformat}
> select
> o.o_orderpriority,
> count(*) as order_count
> from
> orders o
> where
> o.o_orderdate >= date '1996-10-01'
> and o.o_orderdate < date '1996-10-01' + interval '3' month
> and
> exists (
> select
> *
> from
> lineitem l
> where
> l.l_orderkey = o.o_orderkey
> and l.l_commitdate < l.l_receiptdate
> )
> group by
> o.o_orderpriority
> order by
> o.o_orderpriority;
> {noformat}
> TPCH query 4 takes 30% longer. The plan is the same. But the Hash Agg
> operator in the new plan is taking longer. One possible reason is that the
> Hash Agg operator in the new plan is not using as many buckets as the old
> plan did. The memory usage of the Hash Agg operator in the new plan is using
> less memory compared to the old plan.
> Here is the old plan:
> {noformat}
> 00-00 Screen : rowType = RecordType(ANY o_orderpriority, BIGINT
> order_count): rowcount = 3750000.0, cumulative cost = {1.9163601940441746E10
> rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12
> network, 2.2631985057468002E10 memory}, id = 5645
> 00-01 Project(o_orderpriority=[$0], order_count=[$1]) : rowType =
> RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0,
> cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu,
> 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10
> memory}, id = 5644
> 00-02 SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost =
> {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io,
> 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5643
> 01-01 OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost =
> {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io,
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5642
> 02-01 SelectionVectorRemover : rowType = RecordType(ANY
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost =
> {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io,
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5641
> 02-02 Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost =
> {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io,
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5640
> 02-03 HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType
> = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0,
> cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu,
> 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10
> memory}, id = 5639
> 02-04 HashToRandomExchange(dist0=[[$0]]) : rowType =
> RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7,
> cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu,
> 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10
> memory}, id = 5638
> 03-01 HashAgg(group=[{0}], order_count=[COUNT()]) :
> rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount =
> 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10
> cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10
> memory}, id = 5637
> 03-02 Project(o_orderpriority=[$1]) : rowType =
> RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost =
> {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io,
> 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5636
> 03-03 Project(o_orderkey=[$1], o_orderpriority=[$2],
> l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority,
> ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10
> rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12
> network, 1.5311985057468002E10 memory}, id = 5635
> 03-04 HashJoin(condition=[=($1, $0)],
> joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey,
> ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost =
> {1.7944476940441746E10 rows, 7.995890595055101E10 cpu, 2.2499969127E10 io,
> 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5634
> 03-05 HashToRandomExchange(dist0=[[$0]]) : rowType
> = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8,
> cumulative cost = {4.125E9 rows, 2.2125E10 cpu, 4.5E9 io, 3.072E12 network,
> 0.0 memory}, id = 5633
> 05-01 Project(o_orderkey=[$1],
> o_orderpriority=[$2]) : rowType = RecordType(ANY o_orderkey, ANY
> o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.75E9 rows,
> 1.6125E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5632
> 05-02 SelectionVectorRemover : rowType =
> RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount =
> 3.75E8, cumulative cost = {3.375E9 rows, 1.5375E10 cpu, 4.5E9 io, 0.0
> network, 0.0 memory}, id = 5631
> 05-03 Filter(condition=[AND(>=($0,
> 1996-10-01), <($0, 1997-01-01 00:00:00))]) : rowType = RecordType(ANY
> o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8,
> cumulative cost = {3.0E9 rows, 1.5E10 cpu, 4.5E9 io, 0.0 network, 0.0
> memory}, id = 5630
> 05-04 Scan(table=[[dfs, tpchpar1000_micro,
> orders]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=maprfs:///tpchParquet10/SF1000/orders]],
> selectionRoot=maprfs:/tpchParquet10/SF1000/orders, numFiles=1,
> numRowGroups=750, usedMetadataFile=false, columns=[`o_orderdate`,
> `o_orderkey`, `o_orderpriority`]]]) : rowType = RecordType(ANY o_orderdate,
> ANY o_orderkey, ANY o_orderpriority): rowcount = 1.5E9, cumulative cost =
> {1.5E9 rows, 4.5E9 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5629
> 03-06 HashAgg(group=[{0}]) : rowType =
> RecordType(ANY l_orderkey): rowcount = 4499992.28175, cumulative cost =
> {1.3439976948159996E10 rows, 5.477990604317001E10 cpu, 1.7999969127E10 io,
> 1.8431968386048E11 network, 8.711985057468E9 memory}, id = 5628
> 03-07 HashToRandomExchange(dist0=[[$0]]) :
> rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative
> cost = {1.3394977025342497E10 rows, 5.441990666063001E10 cpu, 1.7999969127E10
> io, 1.8431968386048E11 network, 7.919986415880001E9 memory}, id = 5627
> 04-01 HashAgg(group=[{0}]) : rowType =
> RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost =
> {1.3349977102524998E10 rows, 5.369990789555001E10 cpu, 1.7999969127E10 io,
> 0.0 network, 7.919986415880001E9 memory}, id = 5626
> 04-02 Project(l_orderkey=[$0], i=[true]) :
> rowType = RecordType(ANY l_orderkey, BOOLEAN i): rowcount = 4.49999228175E8,
> cumulative cost = {1.2899977874349998E10 rows, 5.009991407015001E10 cpu,
> 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5625
> 04-03 SelectionVectorRemover : rowType =
> RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount =
> 4.49999228175E8, cumulative cost = {1.2449978646175E10 rows,
> 4.784991792927501E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id =
> 5624
> 04-04 Filter(condition=[AND(=($0, $0),
> <($1, $2))]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY
> l_receiptdate): rowcount = 4.49999228175E8, cumulative cost =
> {1.1999979418E10 rows, 4.7399918701100006E10 cpu, 1.7999969127E10 io, 0.0
> network, 0.0 memory}, id = 5623
> 04-05 Scan(table=[[dfs,
> tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]],
> selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1,
> numRowGroups=3250, usedMetadataFile=false, columns=[`l_orderkey`,
> `l_commitdate`, `l_receiptdate`]]]) : rowType = RecordType(ANY l_orderkey,
> ANY l_commitdate, ANY l_receiptdate): rowcount = 5.999989709E9, cumulative
> cost = {5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0
> network, 0.0 memory}, id = 5622
> {noformat}
> Here is the new plan:
> {noformat}
> 00-00 Screen : rowType = RecordType(ANY o_orderpriority, BIGINT
> order_count): rowcount = 3750000.0, cumulative cost = {1.9163601940441746E10
> rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12
> network, 2.2631985057468002E10 memory}, id = 11739
> 00-01 Project(o_orderpriority=[$0], order_count=[$1]) : rowType =
> RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0,
> cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu,
> 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10
> memory}, id = 11738
> 00-02 SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost =
> {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io,
> 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 11737
> 01-01 OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost =
> {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io,
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11736
> 02-01 SelectionVectorRemover : rowType = RecordType(ANY
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost =
> {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io,
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11735
> 02-02 Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost =
> {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io,
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11734
> 02-03 HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType
> = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0,
> cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu,
> 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10
> memory}, id = 11733
> 02-04 HashToRandomExchange(dist0=[[$0]]) : rowType =
> RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7,
> cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu,
> 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10
> memory}, id = 11732
> 03-01 HashAgg(group=[{0}], order_count=[COUNT()]) :
> rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount =
> 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10
> cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10
> memory}, id = 11731
> 03-02 Project(o_orderpriority=[$1]) : rowType =
> RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost =
> {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io,
> 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11730
> 03-03 Project(o_orderkey=[$1], o_orderpriority=[$2],
> l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority,
> ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10
> rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12
> network, 1.5311985057468002E10 memory}, id = 11729
> 03-04 HashJoin(condition=[=($1, $0)],
> joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey,
> ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost =
> {1.7944476940441746E10 rows, 7.995890595055101E10 cpu, 2.2499969127E10 io,
> 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11728
> 03-05 HashToRandomExchange(dist0=[[$0]]) : rowType
> = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8,
> cumulative cost = {4.125E9 rows, 2.2125E10 cpu, 4.5E9 io, 3.072E12 network,
> 0.0 memory}, id = 11727
> 05-01 Project(o_orderkey=[$1],
> o_orderpriority=[$2]) : rowType = RecordType(ANY o_orderkey, ANY
> o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.75E9 rows,
> 1.6125E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11726
> 05-02 SelectionVectorRemover : rowType =
> RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount =
> 3.75E8, cumulative cost = {3.375E9 rows, 1.5375E10 cpu, 4.5E9 io, 0.0
> network, 0.0 memory}, id = 11725
> 05-03 Filter(condition=[AND(>=($0,
> 1996-10-01), <($0, 1997-01-01 00:00:00))]) : rowType = RecordType(ANY
> o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8,
> cumulative cost = {3.0E9 rows, 1.5E10 cpu, 4.5E9 io, 0.0 network, 0.0
> memory}, id = 11724
> 05-04 Scan(table=[[dfs, tpchpar1000_micro,
> orders]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
> [path=maprfs:///tpchParquet10/SF1000/orders]],
> selectionRoot=maprfs:/tpchParquet10/SF1000/orders, numFiles=1,
> numRowGroups=750, usedMetadataFile=false, columns=[`o_orderdate`,
> `o_orderkey`, `o_orderpriority`]]]) : rowType = RecordType(ANY o_orderdate,
> ANY o_orderkey, ANY o_orderpriority): rowcount = 1.5E9, cumulative cost =
> {1.5E9 rows, 4.5E9 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11723
> 03-06 HashAgg(group=[{0}]) : rowType =
> RecordType(ANY l_orderkey): rowcount = 4499992.28175, cumulative cost =
> {1.3439976948159996E10 rows, 5.477990604317001E10 cpu, 1.7999969127E10 io,
> 1.8431968386048E11 network, 8.711985057468E9 memory}, id = 11722
> 03-07 HashToRandomExchange(dist0=[[$0]]) :
> rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative
> cost = {1.3394977025342497E10 rows, 5.441990666063001E10 cpu, 1.7999969127E10
> io, 1.8431968386048E11 network, 7.919986415880001E9 memory}, id = 11721
> 04-01 HashAgg(group=[{0}]) : rowType =
> RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost =
> {1.3349977102524998E10 rows, 5.369990789555001E10 cpu, 1.7999969127E10 io,
> 0.0 network, 7.919986415880001E9 memory}, id = 11720
> 04-02 Project(l_orderkey=[$0], i=[true]) :
> rowType = RecordType(ANY l_orderkey, BOOLEAN i): rowcount = 4.49999228175E8,
> cumulative cost = {1.2899977874349998E10 rows, 5.009991407015001E10 cpu,
> 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11719
> 04-03 SelectionVectorRemover : rowType =
> RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount =
> 4.49999228175E8, cumulative cost = {1.2449978646175E10 rows,
> 4.784991792927501E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id =
> 11718
> 04-04 Filter(condition=[AND(=($0, $0),
> <($1, $2))]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY
> l_receiptdate): rowcount = 4.49999228175E8, cumulative cost =
> {1.1999979418E10 rows, 4.7399918701100006E10 cpu, 1.7999969127E10 io, 0.0
> network, 0.0 memory}, id = 11717
> 04-05 Scan(table=[[dfs,
> tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]],
> selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1,
> numRowGroups=3250, usedMetadataFile=false, columns=[`l_orderkey`,
> `l_commitdate`, `l_receiptdate`]]]) : rowType = RecordType(ANY l_orderkey,
> ANY l_commitdate, ANY l_receiptdate): rowcount = 5.999989709E9, cumulative
> cost = {5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0
> network, 0.0 memory}, id = 11716
> {noformat}
> I have attached two profiles. 235a471b-aa97-bfb5-207d-3f25b4b5fbbb is from
> commit id 4627973bde9847a4eb2672c44941136c167326a1. This does not have
> Statistics code and serves as the baseline. It is the commit prior to the
> Statistics commit. 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5 is from commit id
> 5c436dbb028b813e80b7b8fcf045af31f0bcf68b. This has the Statistics code with
> the fix for disabled Statistics.
> I also pulled the logs from the foreman to show the memory limit for Hash
> Agg. With the baseline, HashAgg has a limit of 10GB:
> {noformat}
> 2019-04-04 00:57:27,446 [235a471b-aa97-bfb5-207d-3f25b4b5fbbb:frag:4:142]
> TRACE o.a.d.e.p.i.aggregate.HashAggregator - 1st phase. Estimated internal
> row width: 8 Values row width: 0 batch size: 720896 memory limit:
> 10000000000 max column width: 8
> {noformat}
> With statistics disabled, HashAgg has a limit of 64MB:
> {noformat}
> 2019-04-04 01:32:48,132 [235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5:frag:4:182]
> TRACE o.a.d.e.p.i.aggregate.HashAggregator - 1st phase. Estimated internal
> row width: 8 Values row width: 0 batch size: 720896 memory limit: 65075262
> max column width: 8
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)