Andy Grove created ARROW-10782:
----------------------------------

             Summary: [Rust] [DataFusion] Optimize hash join to use smaller relation as build side
                 Key: ARROW-10782
                 URL: https://issues.apache.org/jira/browse/ARROW-10782
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust - DataFusion
            Reporter: Andy Grove

When performing an inner join using the hash join algorithm, it is more efficient to load the smaller table into memory as the hash table and then stream the larger table against it. We should use the statistics made available in https://issues.apache.org/jira/browse/ARROW-10781 to build an optimizer rule that determines the smaller side of a join and uses it as the build/hash side.
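
To illustrate the idea, here is a minimal, self-contained sketch. It does not use DataFusion's actual API: `Stats`, `BuildSide`, `choose_build_side`, and `hash_join` are hypothetical names invented for this example, and the row-count field is only an assumption about what the statistics from ARROW-10781 might expose. When either row count is unknown, the sketch keeps the left side as the build side.

```rust
use std::collections::HashMap;

/// Hypothetical per-relation statistics; `None` means the row count is unknown.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Stats {
    pub num_rows: Option<usize>,
}

/// Which input is loaded into the in-memory hash table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildSide {
    Left,
    Right,
}

/// Pick the smaller relation as the build side.
/// Falls back to `Left` when either row count is unknown or on a tie.
pub fn choose_build_side(left: Stats, right: Stats) -> BuildSide {
    match (left.num_rows, right.num_rows) {
        (Some(l), Some(r)) if r < l => BuildSide::Right,
        _ => BuildSide::Left,
    }
}

/// Toy inner hash join on an `i32` key. Output columns are always in
/// (key, left value, right value) order, whichever side was built.
pub fn hash_join(left: &[(i32, &str)], right: &[(i32, &str)]) -> Vec<(i32, String, String)> {
    let side = choose_build_side(
        Stats { num_rows: Some(left.len()) },
        Stats { num_rows: Some(right.len()) },
    );
    let (build, probe) = match side {
        BuildSide::Left => (left, right),
        BuildSide::Right => (right, left),
    };

    // Build phase: load the smaller input into memory.
    let mut table: HashMap<i32, Vec<&str>> = HashMap::new();
    for &(key, value) in build {
        table.entry(key).or_default().push(value);
    }

    // Probe phase: stream the larger input against the hash table.
    let mut out = Vec::new();
    for &(key, probe_value) in probe {
        if let Some(rows) = table.get(&key) {
            for &build_value in rows {
                // Restore the original left/right column order.
                let (l, r) = match side {
                    BuildSide::Left => (build_value, probe_value),
                    BuildSide::Right => (probe_value, build_value),
                };
                out.push((key, l.to_string(), r.to_string()));
            }
        }
    }
    out
}
```

Swapping the inputs this way is straightforward for an inner join because the result is symmetric apart from column order. A real optimizer rule would also have to decide how to handle outer joins, where the two sides are not interchangeable.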