One quick note here, I don't think partitioning LINEORDER table on
LO_ORDERDATE would help this query. If you look at the query profile you
will see that Drill is reading everything from LINEORDER.

On Fri, Mar 18, 2016 at 7:57 AM, Dmitry Krivov <[email protected]> wrote:

> Just for info :
>
> After recreating tables with explicit columns CASTing  have double
> performace of this query (from  60 to 35 sec.)
>
> Best regards,
> Dmitry
>
> > Hello
> >
> > I have load (as CTAS) into parquet-files StarShema Benchmark generated
> > csv-data (scale factor 50)
> >
> > For one of bencmark query's like :
> >
> > select
> > d.d_year,
> > c.c_region,
> > sum(l.lo_extendedprice*l.lo_discount) as revenue
> > from dfs.tpch.lineorder_part l,
> >        dfs.tpch.dates d,
> >        dfs.tpch.customer c
> > where l.lo_orderdate = d.d_datekey
> >          and l.lo_custkey = c.c_custkey
> >          and d.d_year=1995
> > group by d.d_year, c.c_region
> > order by d.d_year desc, c.c_region asc;
> >
> > got min. exec time of 59 sec.
> >
> > Table LINEORDER have 300M rows and partitioned by LO_ORDERDATE column
> (2406
> > partitions in related parquet-files)
> > Table CUSTOMER have 1.5M rows and table DATES have 2556 rows, both tables
> > not partitioned
> >
> > Drill 1.5 conf. have :
> >
> > drill-env.sh :
> > DRILL_MAX_DIRECT_MEMORY="16G"
> > DRILL_HEAP="8G"
> >
> > sys.options changed  :
> >
> > planner.memory.max_query_memory_per_node = 8 000 000 000
> > planner.memory_limit = 1 000 000 000
> > planner.width.max_per_node = 16 (was 12 by default)
> >
> > Drill is installed on 16VCPU Linux VM and under query runtime all
> 16VCPU's
> > 100% utilized.
> >
> > Is there any chance to improve this query exectime  (my be with some
> > additional sys.options changes) ?
> >
> > Thank's!
> >
> > P.S. Just two days as starting to learn and test Apache Drill
> >
> > Best regards,
> > Dmitry
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

Reply via email to