Kunal Khatua created DRILL-1762: ----------------------------------- Summary: Apply different levels of parallelization to different phases of the query Key: DRILL-1762 URL: https://issues.apache.org/jira/browse/DRILL-1762 Project: Apache Drill Issue Type: Improvement Components: Query Planning & Optimization Affects Versions: 0.6.0 Environment: 10-node Drill cluster with TPCH schema (SF100) OS: RHEL 6.4 cores/node: 32 RAM: 256GB Reporter: Kunal Khatua Fix For: 0.7.0, Future
When running TPCH queries, we found that setting the planner.max.width_per_node to the maximum number of cores available didn't necessarily improve their runtimes. In almost all queries, setting the property to as low as 12 (on eacch 32-core node), resulted in faster runtimes... which is counter-intuitive. It is highly possible that the leaf fragments have a larger overhead than many other fragments due to I/O operations, and the high level of parallelization might be causing the entire phase of data scan to be sub-optimal. Based on further investigation, qe might want to split this property in two... one for leaf fragments (SCANs) and one for the rest of the class of fragments. e.g TPCH 08 (SF100) MaxWidth Runtime (msec) 12 34,650 16 35,216 20 39,482 24 41,694 28 46,579 32 57,207 -- This message was sent by Atlassian JIRA (v6.3.4#6332)