[jira] [Updated] (DRILL-7154) TPCH query 4, 17 and 18 take longer with sf 1000 when Statistics are disabled

Sorabh Hamirwasia (JIRA) Mon, 08 Apr 2019 18:31:27 -0700


     [ 
https://issues.apache.org/jira/browse/DRILL-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sorabh Hamirwasia updated DRILL-7154:
-------------------------------------
    Labels: ready-to-commit  (was: )

> TPCH query 4, 17 and 18 take longer with sf 1000 when Statistics are disabled
> -----------------------------------------------------------------------------
>
>                 Key: DRILL-7154
>                 URL: https://issues.apache.org/jira/browse/DRILL-7154
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning &amp; Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Hanumath Rao Maduri
>            Priority: Blocker
>              Labels: ready-to-commit
>             Fix For: 1.16.0
>
>         Attachments: 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5.sys.drill, 
> 235a471b-aa97-bfb5-207d-3f25b4b5fbbb.sys.drill, hashagg.nostats.data.log, 
> hashagg.nostats.foreman.log, hashagg.stats.disabled.data.log, 
> hashagg.stats.disabled.foreman.log
>
>
> Here is TPCH 04 with sf 1000:
> {noformat}
> select
>   o.o_orderpriority,
>   count(*) as order_count
> from
>   orders o
> where
>   o.o_orderdate >= date '1996-10-01'
>   and o.o_orderdate < date '1996-10-01' + interval '3' month
>   and 
>   exists (
>     select
>       *
>     from
>       lineitem l
>     where
>       l.l_orderkey = o.o_orderkey
>       and l.l_commitdate < l.l_receiptdate
>   )
> group by
>   o.o_orderpriority
> order by
>   o.o_orderpriority;
> {noformat}
> TPCH query 4 takes 30% longer.  The plan is the same.  But the Hash Agg 
> operator in the new plan is taking longer.  One possible reason is that the 
> Hash Agg operator in the new plan is not using as many buckets as the old 
> plan did.  The memory usage of the Hash Agg operator in the new plan is using 
> less memory compared to the old plan.
> Here is the old plan:
> {noformat}
> 00-00    Screen : rowType = RecordType(ANY o_orderpriority, BIGINT 
> order_count): rowcount = 3750000.0, cumulative cost = {1.9163601940441746E10 
> rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 
> network, 2.2631985057468002E10 memory}, id = 5645
> 00-01      Project(o_orderpriority=[$0], order_count=[$1]) : rowType = 
> RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, 
> cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu, 
> 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 
> memory}, id = 5644
> 00-02        SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY 
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = 
> {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io, 
> 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5643
> 01-01          OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY 
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = 
> {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io, 
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5642
> 02-01            SelectionVectorRemover : rowType = RecordType(ANY 
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = 
> {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io, 
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5641
> 02-02              Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY 
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = 
> {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io, 
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5640
> 02-03                HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType 
> = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, 
> cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu, 
> 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10 
> memory}, id = 5639
> 02-04                  HashToRandomExchange(dist0=[[$0]]) : rowType = 
> RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, 
> cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu, 
> 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10 
> memory}, id = 5638
> 03-01                    HashAgg(group=[{0}], order_count=[COUNT()]) : 
> rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 
> 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10 
> cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10 
> memory}, id = 5637
> 03-02                      Project(o_orderpriority=[$1]) : rowType = 
> RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = 
> {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io, 
> 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5636
> 03-03                        Project(o_orderkey=[$1], o_orderpriority=[$2], 
> l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority, 
> ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10 
> rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 
> network, 1.5311985057468002E10 memory}, id = 5635
> 03-04                          HashJoin(condition=[=($1, $0)], 
> joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey, 
> ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = 
> {1.7944476940441746E10 rows, 7.995890595055101E10 cpu, 2.2499969127E10 io, 
> 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5634
> 03-05                            HashToRandomExchange(dist0=[[$0]]) : rowType 
> = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, 
> cumulative cost = {4.125E9 rows, 2.2125E10 cpu, 4.5E9 io, 3.072E12 network, 
> 0.0 memory}, id = 5633
> 05-01                              Project(o_orderkey=[$1], 
> o_orderpriority=[$2]) : rowType = RecordType(ANY o_orderkey, ANY 
> o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.75E9 rows, 
> 1.6125E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5632
> 05-02                                SelectionVectorRemover : rowType = 
> RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 
> 3.75E8, cumulative cost = {3.375E9 rows, 1.5375E10 cpu, 4.5E9 io, 0.0 
> network, 0.0 memory}, id = 5631
> 05-03                                  Filter(condition=[AND(>=($0, 
> 1996-10-01), <($0, 1997-01-01 00:00:00))]) : rowType = RecordType(ANY 
> o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, 
> cumulative cost = {3.0E9 rows, 1.5E10 cpu, 4.5E9 io, 0.0 network, 0.0 
> memory}, id = 5630
> 05-04                                    Scan(table=[[dfs, tpchpar1000_micro, 
> orders]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:///tpchParquet10/SF1000/orders]], 
> selectionRoot=maprfs:/tpchParquet10/SF1000/orders, numFiles=1, 
> numRowGroups=750, usedMetadataFile=false, columns=[`o_orderdate`, 
> `o_orderkey`, `o_orderpriority`]]]) : rowType = RecordType(ANY o_orderdate, 
> ANY o_orderkey, ANY o_orderpriority): rowcount = 1.5E9, cumulative cost = 
> {1.5E9 rows, 4.5E9 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5629
> 03-06                            HashAgg(group=[{0}]) : rowType = 
> RecordType(ANY l_orderkey): rowcount = 4499992.28175, cumulative cost = 
> {1.3439976948159996E10 rows, 5.477990604317001E10 cpu, 1.7999969127E10 io, 
> 1.8431968386048E11 network, 8.711985057468E9 memory}, id = 5628
> 03-07                              HashToRandomExchange(dist0=[[$0]]) : 
> rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative 
> cost = {1.3394977025342497E10 rows, 5.441990666063001E10 cpu, 1.7999969127E10 
> io, 1.8431968386048E11 network, 7.919986415880001E9 memory}, id = 5627
> 04-01                                HashAgg(group=[{0}]) : rowType = 
> RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = 
> {1.3349977102524998E10 rows, 5.369990789555001E10 cpu, 1.7999969127E10 io, 
> 0.0 network, 7.919986415880001E9 memory}, id = 5626
> 04-02                                  Project(l_orderkey=[$0], i=[true]) : 
> rowType = RecordType(ANY l_orderkey, BOOLEAN i): rowcount = 4.49999228175E8, 
> cumulative cost = {1.2899977874349998E10 rows, 5.009991407015001E10 cpu, 
> 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5625
> 04-03                                    SelectionVectorRemover : rowType = 
> RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 
> 4.49999228175E8, cumulative cost = {1.2449978646175E10 rows, 
> 4.784991792927501E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 
> 5624
> 04-04                                      Filter(condition=[AND(=($0, $0), 
> <($1, $2))]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY 
> l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = 
> {1.1999979418E10 rows, 4.7399918701100006E10 cpu, 1.7999969127E10 io, 0.0 
> network, 0.0 memory}, id = 5623
> 04-05                                        Scan(table=[[dfs, 
> tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], 
> selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, 
> numRowGroups=3250, usedMetadataFile=false, columns=[`l_orderkey`, 
> `l_commitdate`, `l_receiptdate`]]]) : rowType = RecordType(ANY l_orderkey, 
> ANY l_commitdate, ANY l_receiptdate): rowcount = 5.999989709E9, cumulative 
> cost = {5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 
> network, 0.0 memory}, id = 5622
> {noformat}
> Here is the new plan:
> {noformat}
> 00-00    Screen : rowType = RecordType(ANY o_orderpriority, BIGINT 
> order_count): rowcount = 3750000.0, cumulative cost = {1.9163601940441746E10 
> rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 
> network, 2.2631985057468002E10 memory}, id = 11739
> 00-01      Project(o_orderpriority=[$0], order_count=[$1]) : rowType = 
> RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, 
> cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu, 
> 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 
> memory}, id = 11738
> 00-02        SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY 
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = 
> {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io, 
> 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 11737
> 01-01          OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY 
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = 
> {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io, 
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11736
> 02-01            SelectionVectorRemover : rowType = RecordType(ANY 
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = 
> {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io, 
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11735
> 02-02              Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY 
> o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = 
> {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io, 
> 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11734
> 02-03                HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType 
> = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, 
> cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu, 
> 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10 
> memory}, id = 11733
> 02-04                  HashToRandomExchange(dist0=[[$0]]) : rowType = 
> RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, 
> cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu, 
> 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10 
> memory}, id = 11732
> 03-01                    HashAgg(group=[{0}], order_count=[COUNT()]) : 
> rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 
> 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10 
> cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10 
> memory}, id = 11731
> 03-02                      Project(o_orderpriority=[$1]) : rowType = 
> RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = 
> {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io, 
> 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11730
> 03-03                        Project(o_orderkey=[$1], o_orderpriority=[$2], 
> l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority, 
> ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10 
> rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 
> network, 1.5311985057468002E10 memory}, id = 11729
> 03-04                          HashJoin(condition=[=($1, $0)], 
> joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey, 
> ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = 
> {1.7944476940441746E10 rows, 7.995890595055101E10 cpu, 2.2499969127E10 io, 
> 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11728
> 03-05                            HashToRandomExchange(dist0=[[$0]]) : rowType 
> = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, 
> cumulative cost = {4.125E9 rows, 2.2125E10 cpu, 4.5E9 io, 3.072E12 network, 
> 0.0 memory}, id = 11727
> 05-01                              Project(o_orderkey=[$1], 
> o_orderpriority=[$2]) : rowType = RecordType(ANY o_orderkey, ANY 
> o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.75E9 rows, 
> 1.6125E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11726
> 05-02                                SelectionVectorRemover : rowType = 
> RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 
> 3.75E8, cumulative cost = {3.375E9 rows, 1.5375E10 cpu, 4.5E9 io, 0.0 
> network, 0.0 memory}, id = 11725
> 05-03                                  Filter(condition=[AND(>=($0, 
> 1996-10-01), <($0, 1997-01-01 00:00:00))]) : rowType = RecordType(ANY 
> o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, 
> cumulative cost = {3.0E9 rows, 1.5E10 cpu, 4.5E9 io, 0.0 network, 0.0 
> memory}, id = 11724
> 05-04                                    Scan(table=[[dfs, tpchpar1000_micro, 
> orders]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:///tpchParquet10/SF1000/orders]], 
> selectionRoot=maprfs:/tpchParquet10/SF1000/orders, numFiles=1, 
> numRowGroups=750, usedMetadataFile=false, columns=[`o_orderdate`, 
> `o_orderkey`, `o_orderpriority`]]]) : rowType = RecordType(ANY o_orderdate, 
> ANY o_orderkey, ANY o_orderpriority): rowcount = 1.5E9, cumulative cost = 
> {1.5E9 rows, 4.5E9 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11723
> 03-06                            HashAgg(group=[{0}]) : rowType = 
> RecordType(ANY l_orderkey): rowcount = 4499992.28175, cumulative cost = 
> {1.3439976948159996E10 rows, 5.477990604317001E10 cpu, 1.7999969127E10 io, 
> 1.8431968386048E11 network, 8.711985057468E9 memory}, id = 11722
> 03-07                              HashToRandomExchange(dist0=[[$0]]) : 
> rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative 
> cost = {1.3394977025342497E10 rows, 5.441990666063001E10 cpu, 1.7999969127E10 
> io, 1.8431968386048E11 network, 7.919986415880001E9 memory}, id = 11721
> 04-01                                HashAgg(group=[{0}]) : rowType = 
> RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = 
> {1.3349977102524998E10 rows, 5.369990789555001E10 cpu, 1.7999969127E10 io, 
> 0.0 network, 7.919986415880001E9 memory}, id = 11720
> 04-02                                  Project(l_orderkey=[$0], i=[true]) : 
> rowType = RecordType(ANY l_orderkey, BOOLEAN i): rowcount = 4.49999228175E8, 
> cumulative cost = {1.2899977874349998E10 rows, 5.009991407015001E10 cpu, 
> 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11719
> 04-03                                    SelectionVectorRemover : rowType = 
> RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 
> 4.49999228175E8, cumulative cost = {1.2449978646175E10 rows, 
> 4.784991792927501E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 
> 11718
> 04-04                                      Filter(condition=[AND(=($0, $0), 
> <($1, $2))]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY 
> l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = 
> {1.1999979418E10 rows, 4.7399918701100006E10 cpu, 1.7999969127E10 io, 0.0 
> network, 0.0 memory}, id = 11717
> 04-05                                        Scan(table=[[dfs, 
> tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], 
> selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, 
> numRowGroups=3250, usedMetadataFile=false, columns=[`l_orderkey`, 
> `l_commitdate`, `l_receiptdate`]]]) : rowType = RecordType(ANY l_orderkey, 
> ANY l_commitdate, ANY l_receiptdate): rowcount = 5.999989709E9, cumulative 
> cost = {5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 
> network, 0.0 memory}, id = 11716
> {noformat}
> I have attached two profiles. 235a471b-aa97-bfb5-207d-3f25b4b5fbbb is from 
> commit id 4627973bde9847a4eb2672c44941136c167326a1. This does not have 
> Statistics code and serves as the baseline. It is the commit prior to the 
> Statistics commit. 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5 is from commit id 
> 5c436dbb028b813e80b7b8fcf045af31f0bcf68b. This has the Statistics code with 
> the fix for disabled Statistics.
> I also pulled the logs from the foreman to show the memory limit for Hash 
> Agg.  With the baseline, HashAgg has a limit of 10GB:
> {noformat}
> 2019-04-04 00:57:27,446 [235a471b-aa97-bfb5-207d-3f25b4b5fbbb:frag:4:142] 
> TRACE o.a.d.e.p.i.aggregate.HashAggregator - 1st phase. Estimated internal 
> row width: 8 Values row width: 0 batch size: 720896  memory limit: 
> 10000000000  max column width: 8
> {noformat}
> With statistics disabled, HashAgg has a limit of 64MB:
> {noformat}
> 2019-04-04 01:32:48,132 [235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5:frag:4:182] 
> TRACE o.a.d.e.p.i.aggregate.HashAggregator - 1st phase. Estimated internal 
> row width: 8 Values row width: 0 batch size: 720896  memory limit: 65075262  
> max column width: 8
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (DRILL-7154) TPCH query 4, 17 and 18 take longer with sf 1000 when Statistics are disabled

Reply via email to