[
https://issues.apache.org/jira/browse/IMPALA-12029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709128#comment-17709128
]
ASF subversion and git services commented on IMPALA-12029:
----------------------------------------------------------
Commit b2bc488402462371c13650bef8385c31792e5919 in impala's branch
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b2bc48840 ]
IMPALA-12029: Relax scan fragment parallelism on first planning
In a setup with multiple executor group sets, the Frontend tries to match
a query with the smallest executor group set that can fit the memory and
CPU requirements of the compiled query. For some kinds of query, the
compiled plan fits any executor group set but does not necessarily
deliver the best performance. An example is Impala's COMPUTE STATS query:
it does a full table scan and aggregates the stats, has a fairly simple
query plan shape, but can benefit from higher scan parallelism.
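As a rough illustration only, the matching step could look like the sketch
below; GroupSet, PlanEstimate, and GroupSetMatcher are hypothetical names
invented for this example, not Impala's actual Frontend classes:

    import java.util.List;

    // Hypothetical types for illustration; Impala's real Frontend classes differ.
    class GroupSet {
      long memLimitPerNodeBytes;  // admission memory limit per executor
      int numExecutors;
      int coresPerExecutor;
    }

    class PlanEstimate {
      long perNodeMemBytes;  // estimated peak memory per executor
      int totalCoresNeeded;  // CPU requirement derived from ProcessingCost
    }

    class GroupSetMatcher {
      // Return the smallest group set (list assumed sorted ascending by size)
      // whose memory and CPU capacity cover the compiled plan's estimates.
      static GroupSet match(List<GroupSet> sortedGroupSets, PlanEstimate est) {
        for (GroupSet gs : sortedGroupSets) {
          boolean memFits = est.perNodeMemBytes <= gs.memLimitPerNodeBytes;
          boolean cpuFits =
              est.totalCoresNeeded <= gs.numExecutors * gs.coresPerExecutor;
          if (memFits && cpuFits) return gs;
        }
        return null;  // nothing fits; caller falls back to the largest group set
      }
    }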
This patch relaxes scan fragment parallelism in the first round of query
planning, allowing a scan fragment to increase its parallelism based on
its ProcessingCost estimate. If the relaxed plan fits in an executor
group set, we replan once more against that executor group set, but with
scan fragment parallelism capped back to MT_DOP. This one extra round of
query planning adds a few milliseconds of overhead, depending on the
complexity of the query plan, but is necessary because the backend
scheduler still expects at most MT_DOP scan fragment instances. We can
remove the extra replanning in the future once we can fully manage scan
node parallelism without MT_DOP.
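A minimal sketch of that two-round flow, reusing the hypothetical GroupSet,
PlanEstimate, and GroupSetMatcher types from the previous sketch; Query,
PlanResult, and plan() are likewise illustrative stand-ins, not the actual
doCreateExecRequest() code:

    import java.util.List;

    // Illustrative stubs; the real Frontend types and planning entry point differ.
    class Query {}
    class PlanResult { PlanEstimate estimate; }

    class TwoRoundPlanning {
      PlanResult compile(Query q, List<GroupSet> sortedGroupSets) {
        // Round 1: let scan fragments scale with their ProcessingCost estimate
        // (MT_DOP cap relaxed) so the CPU requirement is not underestimated.
        PlanResult relaxed = plan(q, /*relaxScanParallelism=*/true, null);
        GroupSet fit = GroupSetMatcher.match(sortedGroupSets, relaxed.estimate);
        // Round 2: replan against the chosen group set with scan fragment
        // parallelism capped back to MT_DOP, since the backend scheduler still
        // expects at most MT_DOP scan fragment instances.
        return plan(q, /*relaxScanParallelism=*/false, fit);
      }

      // Stand-in for the Frontend's single-round compilation step.
      PlanResult plan(Query q, boolean relaxScanParallelism, GroupSet target) {
        throw new UnsupportedOperationException("illustration only");
      }
    }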
This patch also adds several improvements:
- Tune computeScanProcessingCost() to guard against scheduling too many
  scan fragments by comparing against the actual scan range count that the
  Planner knows (see the sketch after this list).
- Use NUM_SCANNER_THREADS as a hint to cap scan node cost during the
  first round of planning.
- Multiply memory-related counters by the number of executors so they are
  per group set rather than per node.
- Fix a bug in doCreateExecRequest() in the selection of the number of
  executors used for planning.
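Purely as an illustration of the first two bullets (the names below are
made up for this sketch; the real logic lives in computeScanProcessingCost()),
the capping could look like:

    // Illustrative only: clamp a cost-based scan parallelism estimate so it
    // never exceeds what the planner can actually schedule.
    class ScanParallelismCap {
      static int effectiveScanParallelism(
          int costBasedParallelism,   // derived from ProcessingCost
          int scanRangeCount,         // actual scan ranges known to the Planner
          int numScannerThreadsHint,  // NUM_SCANNER_THREADS option, 0 = unset
          int numNodes) {
        // Never schedule more scan fragment instances than there are scan ranges.
        int capped = Math.min(costBasedParallelism, scanRangeCount);
        // During the relaxed first round, NUM_SCANNER_THREADS (if set) acts as
        // a per-node hint that bounds the scan node's cost-based parallelism.
        if (numScannerThreadsHint > 0) {
          capped = Math.min(capped, numScannerThreadsHint * numNodes);
        }
        return Math.max(1, capped);
      }
    }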
Testing:
- Passed test_executor_groups.py.
- Added test cases in test_min_processing_per_thread_small.
- Raised impala.admission-control.max-query-mem-limit.root.small from
  64MB to 70MB in llama-site-3-groups.xml so that the new grouping query
  can fit in the root.small pool.
Change-Id: I7a2276fbd344d00caa67103026661a3644b9a1f9
Reviewed-on: http://gerrit.cloudera.org:8080/19656
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Kurt Deschler <[email protected]>
Reviewed-by: Wenzhe Zhou <[email protected]>
> Query can be under parallelized in multi executor group set setup
> -----------------------------------------------------------------
>
> Key: IMPALA-12029
> URL: https://issues.apache.org/jira/browse/IMPALA-12029
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 4.2.0
> Reporter: Riza Suminto
> Assignee: Riza Suminto
> Priority: Critical
> Fix For: Impala 4.2.0
>
>
> In a multiple executor group set setup, the Frontend tries to match a query
> with the smallest executor group set that can fit the memory and CPU
> requirements of the compiled query. For some kinds of query, the compiled
> plan fits any executor group set but does not necessarily deliver the best
> performance. An example is Impala's COMPUTE STATS query: it does a full
> table scan and aggregates the stats, has a fairly simple query plan shape,
> but can benefit from higher scan parallelism.
> The Planner needs to give additional feedback to the Frontend that the query
> might be under-parallelized under the current executor group. The Frontend
> can then judge whether to assign the compiled plan to the current executor
> group anyway, or to step up to the next larger executor group and increase
> parallelism.