[ 
https://issues.apache.org/jira/browse/IMPALA-13660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith updated IMPALA-13660:
-----------------------------------
    Description: 
Enable intermediate result caching for broadcast hash joins. The cache key must 
incorporate probe and build sides; since the build side will be part of the 
cache key, we can rely on the broadcast results being stable.
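
As a rough illustration of that key composition (this is not Impala's actual key computation; `subtree_key`, `broadcast_join_key`, and the fingerprint strings are hypothetical names made up for this sketch), a key covering both sides might look something like:

```python
import hashlib


def _digest(parts):
    # Length-prefix each part so adjacent fields cannot run together
    # and accidentally produce the same byte stream.
    h = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


def subtree_key(plan_fingerprint, input_versions):
    # Hypothetical: a key for one side of the join, derived from its plan
    # shape and the versions of the inputs it reads (order-independent).
    return _digest([plan_fingerprint, *sorted(input_versions)])


def broadcast_join_key(join_predicate, probe_key, build_key):
    # The join key covers the probe side AND the build side, so a change
    # to the build input produces a different key.
    return _digest(["broadcast_hash_join", join_predicate, probe_key, build_key])
```

Because the build-side key is hashed into the join key, a cached join result is only looked up when the build input matches the one that produced it.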

This differs from a partition join, where we don't have the same guarantees; 
partition joins would also incorporate an exchange on the probe side, so our 
normal rules would preclude caching it right now.

The build side comes from another fragment, so we need to tolerate an exchange 
on the build side, similar to what was done in IMPALA-13185.

We need to make sure cached results are returned immediately when available 
rather than waiting on a build side we don't need; also make sure the build 
side can complete even if its results are not needed by any joins (because they 
all had cache hits). It would be nice to cancel the build side fragment if a 
cache hit is identified on all join fragments, since the build side results 
would no longer be needed.
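
A minimal sketch of the intended control flow, under the assumption that each join fragment reports its cache outcome to some coordinator for the build side (`BuildSide`, `report`, and `run_join` are hypothetical names for illustration, not existing Impala classes):

```python
class BuildSide:
    """Tracks which join fragments still might need the build results."""

    def __init__(self, join_ids):
        self.undecided = set(join_ids)  # joins that have not reported yet
        self.needed = False             # True once any join has a cache miss
        self.cancelled = False

    def report(self, join_id, cache_hit):
        self.undecided.discard(join_id)
        if not cache_hit:
            self.needed = True
        # Cancel only when every join has reported and all of them were hits.
        if not self.undecided and not self.needed:
            self.cancelled = True


def run_join(join_id, key, cache, build, compute):
    # On a hit, return immediately without waiting on the build side.
    if key in cache:
        build.report(join_id, cache_hit=True)
        return cache[key]
    # On a miss, the build side is required; compute and cache the result.
    build.report(join_id, cache_hit=False)
    rows = compute()
    cache[key] = rows
    return rows
```

In this sketch the build side is never blocked on by a join with a cache hit, and it is only marked cancelled once no join can still need it; until then it is left to run to completion.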

  was:
Enable intermediate result caching for broadcast hash joins. The cache key must 
incorporate probe and build sides; since the build side will be part of the 
cache key, we can rely on the broadcast results being stable.

This differs from a partition join, where we don't have the same guarantees; 
partition joins would also incorporate an exchange on the probe side, so our 
normal rules would preclude caching it right now.

The build side comes from another fragment, so we need to tolerate an exchange 
on the build side, similar to what was done in IMPALA-13185. We need to make 
sure cached results are returned immediately when available rather than waiting 
on a build side we don't need; also make sure the build side can complete even 
if its results are not needed by any joins (because they all had cache hits). 
It would be nice to cancel the build side fragment if a cache hit is identified 
on all join fragments, since the build side results would no longer be needed.


> Cache broadcast hash joins
> --------------------------
>
>                 Key: IMPALA-13660
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13660
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Michael Smith
>            Priority: Major
>
> Enable intermediate result caching for broadcast hash joins. The cache key 
> must incorporate probe and build sides; since the build side will be part of 
> the cache key, we can rely on the broadcast results being stable.
> This differs from a partition join, where we don't have the same guarantees; 
> partition joins would also incorporate an exchange on the probe side, so our 
> normal rules would preclude caching it right now.
> The build side comes from another fragment, so we need to tolerate an 
> exchange on the build side, similar to what was done in IMPALA-13185.
> We need to make sure cached results are returned immediately when available 
> rather than waiting on a build side we don't need; also make sure the build 
> side can complete even if its results are not needed by any joins (because 
> they all had cache hits). It would be nice to cancel the build side fragment 
> if a cache hit is identified on all join fragments, since the build side 
> results would no longer be needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

