Kunal Khatua created DRILL-1762:
-----------------------------------

             Summary: Apply different levels of parallelization to different 
phases of the query 
                 Key: DRILL-1762
                 URL: https://issues.apache.org/jira/browse/DRILL-1762
             Project: Apache Drill
          Issue Type: Improvement
          Components: Query Planning & Optimization
    Affects Versions: 0.6.0
         Environment: 10-node Drill cluster with TPCH schema (SF100)
OS: RHEL 6.4
cores/node: 32
RAM: 256GB
            Reporter: Kunal Khatua
             Fix For: 0.7.0, Future


When running TPCH queries, we found that setting planner.max.width_per_node 
to the number of cores available on each node did not necessarily improve 
runtimes. For almost all queries, setting the property as low as 12 (on each 
32-core node) resulted in faster runtimes, which is counter-intuitive.

It is quite possible that leaf fragments carry a larger overhead than most 
other fragments because of I/O operations, and that a high level of 
parallelization makes the entire data scan phase sub-optimal. Based on 
further investigation, we might want to split this property in two: one for 
leaf fragments (SCANs) and one for all other fragments.
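
As a rough illustration of the proposed split, the sketch below picks a 
per-node width cap depending on whether a fragment is a leaf (SCAN) fragment. 
The names WidthLimits, leaf_max_width, other_max_width and width_cap are 
hypothetical and are not existing Drill options or APIs; the values 12 and 32 
are only placeholders taken from the range tested in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WidthLimits:
    # Hypothetical per-node caps; neither is an existing Drill option.
    leaf_max_width: int   # cap for leaf (SCAN) fragments
    other_max_width: int  # cap for all other fragments


def width_cap(is_leaf_fragment: bool, limits: WidthLimits) -> int:
    """Return the per-node parallelization cap for one fragment."""
    if is_leaf_fragment:
        return limits.leaf_max_width
    return limits.other_max_width


# Placeholder values: a lower cap for scans, the full core count elsewhere.
limits = WidthLimits(leaf_max_width=12, other_max_width=32)
scan_cap = width_cap(True, limits)
join_cap = width_cap(False, limits)
```

Whether a lower cap for scans combined with a higher cap elsewhere actually 
helps would still need to be measured.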

TPCH 08 (SF100) runtimes at different max width settings:
MaxWidth        Runtime (msec)
12              34,650
16              35,216
20              39,482
24              41,694
28              46,579
32              57,207
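
To put the numbers above in relative terms, the following short sketch 
computes each runtime as a multiple of the fastest one. It uses only the 
values from the table; the variable names are arbitrary.

```python
# Runtimes in msec for TPCH 08 (SF100), copied from the table above.
runtimes_ms = {12: 34650, 16: 35216, 20: 39482, 24: 41694, 28: 46579, 32: 57207}

# Width setting with the lowest runtime.
best_width = min(runtimes_ms, key=runtimes_ms.get)
best_ms = runtimes_ms[best_width]

# Slowdown of each setting relative to the fastest one.
slowdown = {width: ms / best_ms for width, ms in runtimes_ms.items()}

for width, ratio in sorted(slowdown.items()):
    print(f"max width {width:2d}: {runtimes_ms[width]:>6,} msec ({ratio:.2f}x)")
```

With these figures, a max width of 32 is roughly 1.65x slower than a max 
width of 12.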




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)