metesynnada opened a new pull request, #8234: URL: https://github.com/apache/arrow-datafusion/pull/8234
## Which issue does this PR close?

Closes #.

## Rationale for this change

As part of our ongoing effort to support more join use cases in DataFusion, this PR introduces a trait: `EagerJoinStream`. The trait provides a more structured and efficient way to implement additional join use cases in the future, with better maintainability and ease of use.

- `EagerJoinStream`: This trait ensures that all join operations are evaluated eagerly, providing faster response times for scenarios where immediate results are required.

## What changes are included in this PR?

This PR includes the implementation of the `EagerJoinStream` trait, along with the modifications to existing code needed to integrate it. The changes are as follows:

1. **Implementation of `EagerJoinStream`**: This trait is implemented to support eager evaluation of join operations.
2. **Code restructure**: `stream_hash_utils.rs` is introduced to separate the responsibilities of HJ (hash join) and SHJ (symmetric hash join), which makes the code easier to maintain.
3. **Integration and refactoring**: Existing code has been refactored and integrated with the new trait to ensure seamless operation and maintain compatibility.
4. **Proto support for SHJ**: The `datafusion.proto` file and the ser/de features are updated to support SHJ in proto.

The changes are mostly code restructuring and proto implementations rather than new join functionality.

## Are these changes tested?

Yes, comprehensive tests have been added to cover the functionality introduced by this trait. The tests check the correctness, performance, and reliability of the new features.

## Are there any user-facing changes?

NA
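To illustrate what "eager" evaluation means here, below is a minimal, synchronous sketch of the idea. It is not the code in this PR: the names `EagerJoin`, `process_left`, `process_right`, `run`, and the toy `Batch` type are hypothetical, and the real `EagerJoinStream` is asynchronous and operates on Arrow record batches. The sketch only assumes that each incoming batch is joined against the rows buffered so far from the other side, so that matches are emitted as soon as they can be produced.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for record batches: rows of (join key, payload).
type Row = (i32, &'static str);
type Batch = Vec<Row>;
type JoinedRow = (i32, &'static str, &'static str);

/// Hypothetical trait: an implementor says how to handle one batch from each
/// side, and the provided `run` driver emits output after every batch.
trait EagerJoin {
    fn process_left(&mut self, batch: Batch) -> Vec<JoinedRow>;
    fn process_right(&mut self, batch: Batch) -> Vec<JoinedRow>;

    /// Alternate between the two inputs until both are exhausted, collecting
    /// the output produced for each batch.
    fn run(&mut self, left: Vec<Batch>, right: Vec<Batch>) -> Vec<Vec<JoinedRow>> {
        let mut left = left.into_iter();
        let mut right = right.into_iter();
        let mut out = Vec::new();
        loop {
            let l = left.next();
            let r = right.next();
            if l.is_none() && r.is_none() {
                break;
            }
            if let Some(batch) = l {
                out.push(self.process_left(batch));
            }
            if let Some(batch) = r {
                out.push(self.process_right(batch));
            }
        }
        out
    }
}

/// Toy symmetric hash join: rows seen so far are buffered per side, keyed by
/// the join key.
#[derive(Default)]
struct SymmetricHashJoin {
    left_rows: HashMap<i32, Vec<&'static str>>,
    right_rows: HashMap<i32, Vec<&'static str>>,
}

impl EagerJoin for SymmetricHashJoin {
    fn process_left(&mut self, batch: Batch) -> Vec<JoinedRow> {
        let mut out = Vec::new();
        for (key, payload) in batch {
            // Probe the rows buffered from the right side.
            if let Some(matches) = self.right_rows.get(&key) {
                for r in matches {
                    out.push((key, payload, *r));
                }
            }
            // Buffer this row so that later right batches can match it.
            self.left_rows.entry(key).or_default().push(payload);
        }
        out
    }

    fn process_right(&mut self, batch: Batch) -> Vec<JoinedRow> {
        let mut out = Vec::new();
        for (key, payload) in batch {
            // Probe the rows buffered from the left side.
            if let Some(matches) = self.left_rows.get(&key) {
                for l in matches {
                    out.push((key, *l, payload));
                }
            }
            // Buffer this row so that later left batches can match it.
            self.right_rows.entry(key).or_default().push(payload);
        }
        out
    }
}
```

In this sketch, calling `run` on a `SymmetricHashJoin::default()` returns one output vector per processed batch, so a match appears in the output for the batch that completes it instead of waiting for either input to finish.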
