Hello Andrew Sherman, Kurt Deschler, Abhishek Rawat, Wenzhe Zhou, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19656

to look at the new patch set (#5).

Change subject: IMPALA-12029: Relax scan fragment parallelism on first planning
......................................................................

IMPALA-12029: Relax scan fragment parallelism on first planning

In a setup with multiple executor group sets, the Frontend tries to
match a query with the smallest executor group set that can fit the
memory and CPU requirements of the compiled query. For some kinds of
queries, the compiled plan fits in any executor group set but does not
necessarily deliver the best performance there. An example is Impala's
COMPUTE STATS query: it does a full table scan and aggregates the
stats, has a fairly simple query plan shape, but can benefit from
higher scan parallelism.

This patch relaxes the scan fragment parallelism on the first round of
query planning. This allows a scan fragment to increase its parallelism
based on its ProcessingCost estimation. If the relaxed plan fits in an
executor group set, we replan once more with that executor group set,
but with scan fragment parallelism set back to MT_DOP. This one extra
round of query planning adds a couple of milliseconds of overhead,
depending on the complexity of the query plan, but it is necessary
because the backend scheduler still expects at most MT_DOP scan
fragment instances. We can remove the extra replanning in the future
once we can fully manage scan node parallelism without MT_DOP.
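
The two-round selection described above can be sketched roughly as
follows. This is an illustrative Python model, not the Frontend.java
code: GroupSet, Plan, fits(), choose_group_set() and plan_fn are all
hypothetical names, ordering group sets by core count is an assumption,
and so is the fallback to the largest group set when nothing fits.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class GroupSet:
    # Hypothetical stand-in for an executor group set and its limits.
    name: str
    max_cores: int
    max_mem_mb: int


@dataclass(frozen=True)
class Plan:
    # Hypothetical summary of a compiled plan's resource needs.
    cores_needed: int
    mem_mb_needed: int
    scan_instances: int


def fits(plan: Plan, gs: GroupSet) -> bool:
    """True if the plan's CPU and memory needs fit in the group set."""
    return (plan.cores_needed <= gs.max_cores
            and plan.mem_mb_needed <= gs.max_mem_mb)


def choose_group_set(
    group_sets: List[GroupSet],
    plan_fn: Callable[[GroupSet, bool], Plan],
) -> Tuple[GroupSet, Plan]:
    """Pick the smallest group set whose relaxed plan fits.

    plan_fn(gs, relaxed) compiles the query for gs. With relaxed=True
    scan parallelism follows the cost estimate; with relaxed=False it
    is bounded by MT_DOP.
    """
    # Assumption: "smallest" is modelled here as fewest cores.
    ordered = sorted(group_sets, key=lambda g: g.max_cores)
    for gs in ordered:
        if fits(plan_fn(gs, True), gs):
            # Extra round: replan with scan parallelism back at MT_DOP.
            return gs, plan_fn(gs, False)
    # Assumption: if nothing fits, fall back to the largest group set.
    largest = ordered[-1]
    return largest, plan_fn(largest, False)
```

In this model a scan-heavy query whose relaxed plan asks for more cores
than the small group set offers is routed to a larger one, while the
plan that is finally returned still carries MT_DOP-bounded scan
parallelism.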

This patch also tunes computeScanProcessingCost() to guard against
scheduling too many scan fragments by comparing with the actual scan
range count that the Planner knows.
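
As a rough illustration of that guard, the sketch below caps a
cost-derived instance count by the number of scan ranges. It is not the
body of computeScanProcessingCost(): the function name, its parameters,
and the ceil-based cost formula are assumptions made for the example.

```python
import math


def scan_parallelism(total_cost: int, min_cost_per_thread: int,
                     scan_range_count: int, max_instances: int) -> int:
    """Hypothetical cap on the number of scan fragment instances."""
    # Instances suggested by the cost estimate alone.
    by_cost = math.ceil(total_cost / min_cost_per_thread)
    # Never schedule more instances than there are scan ranges to read,
    # nor more than the caller's upper bound.
    capped = min(by_cost, scan_range_count, max_instances)
    # Always keep at least one instance.
    return max(1, capped)
```

With a cost estimate that suggests 100 instances but only 12 scan
ranges, this model returns 12, since extra instances would have nothing
to scan.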

Testing:
- Pass test_executor_groups.py.
- Add a test case in test_min_processing_per_thread_small.
- Raise impala.admission-control.max-query-mem-limit.root.small from
  64MB to 70MB in llama-site-3-groups.xml so that the new grouping query
  can fit in the root.small pool.

Change-Id: I7a2276fbd344d00caa67103026661a3644b9a1f9
---
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/resources/llama-site-3-groups.xml
M tests/custom_cluster/test_executor_groups.py
10 files changed, 185 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/19656/5
--
To view, visit http://gerrit.cloudera.org:8080/19656
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7a2276fbd344d00caa67103026661a3644b9a1f9
Gerrit-Change-Number: 19656
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
