[ https://issues.apache.org/jira/browse/IMPALA-13660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Smith updated IMPALA-13660:
-----------------------------------
    Description: 
Enable intermediate result caching for broadcast hash joins. The cache key must 
incorporate both the probe and build sides; because the build side is part of 
the cache key, we can rely on the broadcast results being stable.
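
As a rough illustration of a key that covers both inputs, here is a minimal 
Python sketch. It is not Impala's implementation: the function name, the use 
of SHA-256, and the idea of describing each side of the join as a string are 
all assumptions made for the example.

```python
import hashlib


def broadcast_join_cache_key(probe_desc: str, build_desc: str, join_desc: str) -> str:
    """Hypothetical cache key covering the join itself and both of its inputs.

    Each *_desc argument stands in for a stable description of a plan subtree;
    how such a description would be produced is outside this sketch.
    """
    h = hashlib.sha256()
    for part in (join_desc, probe_desc, build_desc):
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()
```

Because the build side feeds the hash, any change to the build input yields a 
different key, so an entry cached under the old key is never returned for the 
new build input.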

This differs from a partitioned join, where we don't have the same guarantees; 
partitioned joins would also incorporate an exchange on the probe side, so our 
normal rules would preclude caching them right now.

The build side comes from another fragment, so we need to tolerate an exchange 
on the build side, similar to what was done in IMPALA-13185. We need to make 
sure cached results are returned immediately when available rather than waiting 
on a build side we don't need. We also need to make sure the build side can 
complete even if its results are not needed by any joins (because they all had 
cache hits). It would be nice to cancel the build side fragment if a cache hit 
is identified on all join fragments, since the build side results would no 
longer be needed.
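
The intended behavior can be sketched in a few lines of Python. This is only a 
toy model under stated assumptions: BuildSide, release, read_all and 
run_join_instance are hypothetical names, the cache is a plain dict, and the 
join is reduced to a membership test.

```python
from typing import Dict, Hashable, List


class BuildSide:
    """Hypothetical stand-in for the build-side fragment's broadcast output."""

    def __init__(self, rows: List[Hashable], num_consumers: int) -> None:
        self._rows = list(rows)
        self._pending = num_consumers  # join instances that have not decided yet
        self._needed = False           # True once any instance reads the rows
        self.cancelled = False

    def release(self) -> None:
        """Called by a join instance that had a cache hit."""
        self._pending -= 1
        # Cancel only when every instance has decided and none needed the rows.
        if self._pending == 0 and not self._needed:
            self.cancelled = True

    def read_all(self) -> List[Hashable]:
        """Called by a join instance that had a cache miss."""
        self._needed = True
        self._pending -= 1
        return list(self._rows)


def run_join_instance(key: str, cache: Dict[str, list],
                      probe_rows: List[Hashable], build: BuildSide) -> list:
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: return immediately without waiting on the build side.
        build.release()
        return cached
    # Cache miss: consume the build side, join, and store the result.
    build_keys = set(build.read_all())
    result = [row for row in probe_rows if row in build_keys]
    cache[key] = result
    return result
```

In this model the build side is cancelled only after every join instance has 
reported a hit; a single miss keeps it alive, and a hit never blocks on it.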

  was:
Enable intermediate result caching for broadcast hash joins. The cache key must 
incorporate probe and build sides; since the build side will be part of the 
cache key, we can rely on the broadcast results being stable.

This differs from a partition join, where we don't have the same guarantees; 
partition joins would also incorporate an exchange on the probe side, so our 
normal rules would preclude caching it right now.

The build side comes from another fragment, so we need to tolerate an exchange 
on the build side, similar to what was done in IMPALA-13185. We should also 
cancel the build side fragment if a cache hit is identified on all join 
fragments, since the build side results would no longer be needed.


> Cache broadcast hash joins
> --------------------------
>
>                 Key: IMPALA-13660
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13660
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Michael Smith
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
