Hi Yijie,

Thanks for the profile. From the Operator Profiles overview, 03-xx-02 HASH_AGGREGATE and 03-xx-06 PARQUET_ROW_GROUP_SCAN took most of the time:

Operator                           Setup (min/avg/max)     Process (min/avg/max)    Wait (min/avg/max)      Peak Mem (avg/max)
03-xx-02 HASH_AGGREGATE            0.020s/0.083s/0.213s    1m06s/1m55s/3m12s        0.000s/0.000s/0.000s    16MB/16MB
03-xx-03 PROJECT                   0.004s/0.009s/0.024s    6.539s/11.249s/16.504s   0.000s/0.000s/0.000s    536KB/536KB
03-xx-04 SELECTION_VECTOR_REMOVER  0.002s/0.004s/0.014s    34.456s/58.595s/1m26s    0.000s/0.000s/0.000s    664KB/664KB
03-xx-05 FILTER                    0.001s/0.003s/0.009s    2.556s/4.355s/6.543s     0.000s/0.000s/0.000s    8MB/8MB
03-xx-06 PARQUET_ROW_GROUP_SCAN    0.000s/0.000s/0.000s    2m13s/3m40s/5m14s        5.392s/6.372s/7.371s    12MB/12MB

For the SCAN, there is some skew among the batches/records processed by the minor fragments: from ~5,000 batches up to 15,000+ batches, and the ~5m process times belong to the fragments with the bigger batch counts. Similar skew shows up in 03-xx-02. If the load can be balanced across the fragments, it may speed up the query a little bit (see the sketch below for the relevant knob).
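For reference, the per-node parallelism you mention below is controlled by a Drill option. A minimal sketch of inspecting and adjusting it, assuming a recent 1.x build (the value 24 is just your core count, not a tuned recommendation):

    -- Inspect the current width settings via Drill's sys.options table.
    SELECT name, num_val
    FROM sys.options
    WHERE name LIKE 'planner.width%';

    -- Raise the per-node parallelism for the current session only.
    ALTER SESSION SET `planner.width.max_per_node` = 24;

Note this only changes how many minor fragments run per node; the skew comes from how the row groups are spread across fragments, so a wider setting alone may not rebalance the load.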
On the other hand, the 10min query time is not that far off the mark, considering that you have only 3 nodes and each node has only 7 HDDs. I did a similar run of the query on my cluster (10 nodes, each with 32 cores and 23 HDDs). For 03-xx-02 and 03-xx-06, the process time per record is about the same as what you got, even though the total query time is much shorter because it ran on a much bigger system.

Another note: for your test, disk IO is about ~24MB/s/HDD (150GB / 300s / 21 HDDs; worked out below), which is about the IO limit when there are a lot of small IOs. What is the chunk (block) size of your HDFS?
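To make that arithmetic explicit, here is the same back-of-the-envelope check as a query you can run in Drill itself (sys.version is used only as a convenient one-row table):

    -- 150GB of input read in ~300s of scan time across 3 nodes x 7 HDDs = 21 disks:
    SELECT 150 * 1024.0 / 300 / 21 AS mb_per_sec_per_hdd
    FROM sys.version;
    -- returns roughly 24.4

A single HDD can usually sustain much higher sequential read rates, so landing at ~24MB/s suggests the access pattern (many small reads) is what is capping the disks.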
Hope this helps, let me know if you have any further questions.

Thanks,
Dechang

On Mon, May 23, 2016 at 7:28 PM, Yijie Shen <[email protected]> wrote:

Hi Dechang,

Thanks for your reply. I've changed the parallelism of each node from 17 to 24 but saw no performance improvement. The 1516 files are distributed across an HDFS cluster of 3 datanodes, the same 3 servers where I start the Drillbits. For each datanode, 7 HDDs are configured as dfs.datanode.data.dir. The data was generated with the TPC-H dbgen tool and loaded into Parquet format using Apache Spark; the files are 101MB on average, with no skew seen.

I've pasted the `Full json profile` content and attached it to this mail. Thank you.

On Mon, May 23, 2016 at 11:12 PM, Dechang Gu <[email protected]> wrote:

Hi Yijie,
This is Dechang at MapR. I work on Drill performance. From what you described, it looks like the scan took most of the time. How are the files distributed on the disks, and is there any skew? How many disks are there? If possible, can you provide the profile for the run?
Thanks,
Dechang

On Sun, May 22, 2016 at 9:06 AM, Yijie Shen <[email protected]> wrote:

Hi all,

I've been trying out Drill on the master branch lately and have deployed a cluster on three physical servers. The input data `lineitem` is in Parquet format, 150GB in total: 1516 files of 101MB each on average. Each server has two Intel(R) Xeon(R) E5645 @ 2.40GHz CPUs, 24 cores in total, and 32GB memory.

While executing Q1:

SELECT L_RETURNFLAG, L_LINESTATUS,
       SUM(L_QUANTITY), SUM(L_EXTENDEDPRICE),
       SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)),
       SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)*(1+L_TAX)),
       AVG(L_QUANTITY), AVG(L_EXTENDEDPRICE), AVG(L_DISCOUNT), COUNT(1)
FROM dfs.tpch.`lineitem`
WHERE L_SHIPDATE <= '1998-09-02'
GROUP BY L_RETURNFLAG, L_LINESTATUS
ORDER BY L_RETURNFLAG, L_LINESTATUS

I've noticed the parallelism was 51 (planner.width.max_per_node = 17) in my case for Major Fragment 03 (Scan, Filter, Project, HashAgg and Project), and each minor fragment lasted about 8 to 9 minutes. One record for example:

03-00-xx  hw080  7.309s  42.358s  9m35s  118,758,489  14,540  22:31:32  22:31:32  33MB  FINISHED

Is this a normal speed (more than 10 minutes) for Drill on my current cluster? Did I miss something important in the conf to accelerate the execution?

Thanks very much!
Yijie
