[
https://issues.apache.org/jira/browse/IMPALA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aman Sinha reassigned IMPALA-988:
---------------------------------
Assignee: Aman Sinha
> Join strategy (broadcast vs shuffle) decision does not take memory
> consumption and other joins into account
> -----------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-988
> URL: https://issues.apache.org/jira/browse/IMPALA-988
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 1.2.1
> Reporter: Alan Choi
> Assignee: Aman Sinha
> Priority: Minor
> Labels: resource-management
>
> The amount of available memory changes the trade-off between the broadcast and
> shuffle (partitioned) join strategies: if switching to a shuffle join avoids
> spilling to disk, it may be worth paying the cost of the additional network
> transfer.
> There are two issues:
> 1. The join strategy decision only takes the query mem-limit into account but
> ignores the process mem-limit.
> 2. The join strategy decision does not take other joins in the same query into
> account. When multiple joins are present, total memory consumption can be very high.
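Issue #2 can be illustrated with a minimal sketch (not Impala's actual planner code; all function names and sizes here are assumptions). A broadcast join builds the full right side on every node, while a shuffle join hash-partitions the right side so each node holds roughly 1/N of it. Considering each join in isolation hides the fact that several build sides can be resident per node at once:

```python
def broadcast_mem_per_node(right_bytes: int, num_nodes: int) -> int:
    # Broadcast: every node receives and builds the entire right side.
    return right_bytes

def shuffle_mem_per_node(right_bytes: int, num_nodes: int) -> int:
    # Shuffle: the right side is hash-partitioned across all nodes.
    return right_bytes // num_nodes

def query_mem_per_node(join_right_sizes, num_nodes, strategies):
    # Issue #2 above: joins in the same query can be in memory
    # simultaneously, so per-node build sides must be summed rather
    # than considered one join at a time.
    total = 0
    for right_bytes, strategy in zip(join_right_sizes, strategies):
        if strategy == "broadcast":
            total += broadcast_mem_per_node(right_bytes, num_nodes)
        else:
            total += shuffle_mem_per_node(right_bytes, num_nodes)
    return total

# Two joins on 10 nodes: broadcasting both needs 10x the per-node memory
# of shuffling both, even if each broadcast looked affordable in isolation.
print(query_mem_per_node([8_000_000_000, 6_000_000_000], 10,
                         ["broadcast", "broadcast"]))  # 14_000_000_000
print(query_mem_per_node([8_000_000_000, 6_000_000_000], 10,
                         ["shuffle", "shuffle"]))      # 1_400_000_000
```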
> I ([[email protected]]) don't think we should attempt to fix #1 -
> there's a phase ordering problem here - we currently choose the
> best-performing plan then decide how much memory to allocate in admission
> control based on that plan. We can't preserve that while attempting to change
> the plan to fit the mem_limit. That said, I think the current heuristic is a
> little too aggressive about picking broadcast when the right side is very
> large - it should probably bias more towards shuffle as the right side gets
> larger.
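The suggested bias could look something like the following sketch. This is a hypothetical heuristic, not Impala's planner: the threshold, penalty function, and cost model are assumptions chosen only to show how the perceived broadcast cost can grow faster once the right side becomes very large:

```python
def choose_strategy(left_bytes: int, right_bytes: int, num_nodes: int,
                    bias_threshold: int = 1_000_000_000) -> str:
    # Simple network-cost model: broadcast ships the right side to every
    # node; shuffle repartitions both sides once.
    broadcast_cost = right_bytes * num_nodes
    shuffle_cost = left_bytes + right_bytes
    # Bias toward shuffle as the right side grows: inflate the perceived
    # broadcast cost once the right side exceeds a (hypothetical) threshold.
    if right_bytes > bias_threshold:
        broadcast_cost *= 1 + right_bytes / bias_threshold
    return "broadcast" if broadcast_cost < shuffle_cost else "shuffle"

# Small right side: broadcast remains clearly cheaper.
print(choose_strategy(100_000_000_000, 1_000_000, 10))        # broadcast
# Large right side: the unbiased model would still pick broadcast
# (50 GB < 105 GB shipped), but the penalty tips it to shuffle.
print(choose_strategy(100_000_000_000, 5_000_000_000, 10))    # shuffle
```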
> Note that once IMPALA-3200 is completed, this shouldn't prevent the query
> from running to completion, but it will still affect performance.