[
https://issues.apache.org/jira/browse/IMPALA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982185#comment-16982185
]
ASF subversion and git services commented on IMPALA-9146:
---------------------------------------------------------
Commit 0f2aa509891642e29763aa24e47f07554932c7d2 in impala's branch
refs/heads/master from Aman Sinha
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0f2aa50 ]
IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala's DistributedPlanner may sometimes accidentally choose broadcast
distribution for inputs that are larger than the destination executor's
total memory. This could potentially happen if the cluster membership is
not accurately known and the planner's cost computation of the
broadcastCost vs partitionCost happens to favor the broadcast
distribution. This causes spilling and severely affects performance.
Although the DistributedPlanner does a mem_limit check before picking
broadcast, the mem_limit is not an accurate reflection since it is
assigned during admission control.
As a safety here we introduce an explicit configurable limit:
broadcast_bytes_limit for the size of the broadcast input and set it
to default of 32GB. The default is chosen based on analysis of existing
benchmark queries and representative workloads such that in vast
majority of the cases the parameter value does not need to be changed.
If the estimated input size on the build side is greater than this
threshold, the DistributedPlanner will fall back to a partition
distribution. Setting this parameter to 0 causes it to be ignored.
Testing:
- Ran all regression tests on Jenkins successfully
- Added a few unit testis in PlannerTest that (a) set the
broadcast_bytes_limit to a small value and checks whether the
distributed plan does hash partitioning on the build side instead
of broadcast, (b) pass a broadcast hint to override the config
setting, (c) verify the standard case where broadcast threshold
is larger than the build input size.
Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9
Reviewed-on: http://gerrit.cloudera.org:8080/14690
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Limit the size of the broadcast input on build side of hash join
> ----------------------------------------------------------------
>
> Key: IMPALA-9146
> URL: https://issues.apache.org/jira/browse/IMPALA-9146
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 3.3.0
> Reporter: Aman Sinha
> Assignee: Aman Sinha
> Priority: Minor
>
> Since broadcast based hash joins are often chosen, we sometimes see very
> large tables being broadcast, with sizes that are larger than the destination
> executor's total memory. This could potentially happen if the cluster
> membership is not accurately known and the planner's cost computation of the
> broadcastCost vs partitionCost happens to favor the broadcast distribution.
> This causes spilling and severely affects performance. Although the
> DistributedPlanner does a mem_limit check before picking broadcast, the
> mem_limit is not an accurate reflection since it is assigned during
> admission control (See
> [IMPALA-988|https://issues.apache.org/jira/browse/IMPALA-988]).
> Given this scenario, as a safety check it is better to have to an explicit
> configurable limit for the size of the broadcast input and set it to a
> reasonable default. The 'reasonable' default can be chosen based on analysis
> of existing benchmark queries and representative workloads where Impala is
> currently used.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]