Robert Hou created DRILL-7121:
---------------------------------

             Summary: TPCH 4 takes longer
                 Key: DRILL-7121
                 URL: https://issues.apache.org/jira/browse/DRILL-7121
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.16.0
            Reporter: Robert Hou
            Assignee: Gautam Parai
             Fix For: 1.16.0


Here is TPCH 4 with sf 100:
{noformat}
select
  o.o_orderpriority,
  count(*) as order_count
from
  orders o

where
  o.o_orderdate >= date '1996-10-01'
  and o.o_orderdate < date '1996-10-01' + interval '3' month
  and exists (
    select
      *
    from
      lineitem l
    where
      l.l_orderkey = o.o_orderkey
      and l.l_commitdate < l.l_receiptdate
  )
group by
  o.o_orderpriority
order by
  o.o_orderpriority;
{noformat}

With statistics disabled, the query plan has changed: a Hash Agg and a Broadcast 
Exchange have been added. Together, these two operators expand the number of rows 
coming from the lineitem table from 137M to 9B. This forces the hash join to use 
6 GB of memory instead of 30 MB.
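
To put the figures above in proportion, here is a rough back-of-the-envelope sketch. It assumes "M" and "B" mean million and billion and that 1 GB = 1024 MB; the variable names are illustrative only, and the inputs are the rounded numbers from this report, not fresh measurements.

```python
# Rounded figures from the report; "M"/"B" are read as million/billion (assumption).
rows_before = 137_000_000      # lineitem rows reaching the join previously
rows_after = 9_000_000_000     # rows after the Hash Agg + Broadcast Exchange

mem_before_mb = 30             # hash join memory previously, in MB
mem_after_mb = 6 * 1024        # 6 GB, assuming 1 GB = 1024 MB

# Ratio of new to old values for row count and hash join memory.
row_blowup = rows_after / rows_before
mem_blowup = mem_after_mb / mem_before_mb

print(f"row blow-up:    ~{row_blowup:.1f}x")
print(f"memory blow-up: ~{mem_blowup:.1f}x")
```

Under these assumptions the row count grows by roughly 66x and the hash join memory by roughly 200x.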



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
