Tim Armstrong created IMPALA-9704:
-------------------------------------
Summary: Consider doing remote reads for small dimension tables
instead of scan+exchange
Key: IMPALA-9704
URL: https://issues.apache.org/jira/browse/IMPALA-9704
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Reporter: Tim Armstrong
The remote data cache changes the calculus for certain broadcast join plans.
Previously we always did local reads and then broadcast the output of the
scan. With the data cache, it could make sense to read small tables into the
data cache on every node and replicate the scan on all nodes, eliminating the
exchange.
A variety of factors are in play, including the cost of predicate evaluation
and runtime filter evaluation (which a replicated scan repeats on every node),
and whether the remote data cache is enabled, so it's not guaranteed to be a
win.
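
To make the trade-off concrete, here is a rough sketch of how the two plans
might be compared. It is a toy model, not Impala planner code: the class, the
function names, and every cost weight below are hypothetical and chosen only
for illustration.

```python
from dataclasses import dataclass


@dataclass
class DimTableScan:
    """Hypothetical planner inputs; none of these names come from Impala."""
    table_bytes: int          # size of the dimension table
    selectivity: float        # fraction of bytes surviving predicates/runtime filters
    num_nodes: int            # nodes that need the build side of the join
    data_cache_enabled: bool  # whether the remote data cache is enabled


# Illustrative per-byte cost weights in arbitrary units (assumptions, not measurements).
SCAN_COST = 1.0          # read and decode a byte locally
FILTER_COST = 0.5        # evaluate predicates and runtime filters on a byte
NETWORK_COST = 2.0       # move a byte over the network
CACHED_READ_COST = 0.25  # assumed cost of a byte served from the data cache


def broadcast_cost(s: DimTableScan) -> float:
    """Scan and filter once, then ship the filtered output to every node."""
    scan = s.table_bytes * (SCAN_COST + FILTER_COST)
    exchange = s.table_bytes * s.selectivity * NETWORK_COST * s.num_nodes
    return scan + exchange


def replicated_scan_cost(s: DimTableScan) -> float:
    """Every node reads and filters the whole table itself; no exchange."""
    if s.data_cache_enabled:
        read_per_byte = CACHED_READ_COST  # assumes the table is already cached
    else:
        read_per_byte = SCAN_COST + NETWORK_COST  # remote read on every node
    return s.num_nodes * s.table_bytes * (read_per_byte + FILTER_COST)


def prefer_replicated_scan(s: DimTableScan) -> bool:
    """Pick the replicated scan only when this toy model says it is cheaper."""
    return replicated_scan_cost(s) < broadcast_cost(s)
```

Under these made-up weights, an unfiltered table with the cache enabled favors
the replicated scan, while a highly selective predicate or a disabled cache
tips the balance back to scan+exchange, which matches the caveat above.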