One quick note here, I don't think partitioning LINEORDER table on LO_ORDERDATE would help this query. If you look at the query profile you will see that Drill is reading everything from LINEORDER.
On Fri, Mar 18, 2016 at 7:57 AM, Dmitry Krivov <[email protected]> wrote: > Just for info : > > After recreating tables with explicit columns CASTing have double > performace of this query (from 60 to 35 sec.) > > Best regards, > Dmitry > > > Hello > > > > I have load (as CTAS) into parquet-files StarShema Benchmark generated > > csv-data (scale factor 50) > > > > For one of bencmark query's like : > > > > select > > d.d_year, > > c.c_region, > > sum(l.lo_extendedprice*l.lo_discount) as revenue > > from dfs.tpch.lineorder_part l, > > dfs.tpch.dates d, > > dfs.tpch.customer c > > where l.lo_orderdate = d.d_datekey > > and l.lo_custkey = c.c_custkey > > and d.d_year=1995 > > group by d.d_year, c.c_region > > order by d.d_year desc, c.c_region asc; > > > > got min. exec time of 59 sec. > > > > Table LINEORDER have 300M rows and partitioned by LO_ORDERDATE column > (2406 > > partitions in related parquet-files) > > Table CUSTOMER have 1.5M rows and table DATES have 2556 rows, both tables > > not partitioned > > > > Drill 1.5 conf. have : > > > > drill-env.sh : > > DRILL_MAX_DIRECT_MEMORY="16G" > > DRILL_HEAP="8G" > > > > sys.options changed : > > > > planner.memory.max_query_memory_per_node = 8 000 000 000 > > planner.memory_limit = 1 000 000 000 > > planner.width.max_per_node = 16 (was 12 by default) > > > > Drill is installed on 16VCPU Linux VM and under query runtime all > 16VCPU's > > 100% utilized. > > > > Is there any chance to improve this query exectime (my be with some > > additional sys.options changes) ? > > > > Thank's! > > > > P.S. Just two days as starting to learn and test Apache Drill > > > > Best regards, > > Dmitry > > > -- Abdelhakim Deneche Software Engineer <http://www.mapr.com/> Now Available - Free Hadoop On-Demand Training <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
