metesynnada opened a new pull request, #7366: URL: https://github.com/apache/arrow-datafusion/pull/7366
## Which issue does this PR close?

Continues https://github.com/apache/arrow-datafusion/pull/6679.

## Rationale for this change

The current `JoinHashMap` and `SymmetricJoinHashMap` types could be more generic and flexible. In particular, it would help to support different list data structures for chaining and to handle resizing in a more idiomatic and efficient way. This PR introduces the `JoinHashMapType` trait and implements it for both `JoinHashMap` and `PruningJoinHashMap`, which allows more code reuse and a clearer separation of concerns.

This PR also removes several unused hash join utilities. In addition, it makes it possible to introduce a vectorized implementation of `SymmetricHashJoin` that includes hash collision checks.

## What changes are included in this PR?

- Added a `JoinHashMapType` trait with methods for accessing the mutable map and the mutable list, plus an `as_any_mut` method for dynamic downcasting.
- Implemented the `JoinHashMapType` trait for both `JoinHashMap` and `PruningJoinHashMap`.
- Updated the `update_hash` function to use the `JoinHashMapType` trait and to resize the list only in the case of `PruningJoinHashMap`.
- Updated the `build_equal_condition_join_indices` function to use the `JoinHashMapType` trait and introduced an offset parameter.
- Added inline comments and docstrings to improve readability and documentation.

## Are these changes tested?

Yes, the changes are covered by the existing tests. No new tests were required because the new implementation preserves the existing functionality. All tests pass with the changes applied.

## Are there any user-facing changes?

No. The changes in this PR are internal and do not affect the public API or the functionality of the crate.
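For readers who have not opened the diff, the trait shape described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR. It assumes a `std::collections::HashMap` keyed by hash value in place of the real table, so hash collision checks are omitted. The method names `get_mut` and `extend_zero`, the `matching_rows` helper, and the two-argument-plus-offset `update_hash` signature are hypothetical. Only `JoinHashMapType`, `JoinHashMap`, `PruningJoinHashMap`, `update_hash`, and `as_any_mut` are named in the description.

```rust
use std::any::Any;
use std::collections::{HashMap, VecDeque};
use std::ops::IndexMut;

/// Sketch of the trait: one map plus a pluggable "next" list used for chaining.
/// Chain entries store (row index + 1); 0 marks the end of a chain.
trait JoinHashMapType {
    /// Any list indexable by row position works (Vec, VecDeque, ...).
    type NextType: IndexMut<usize, Output = u64>;
    /// Hypothetical accessor for the mutable map and the mutable list.
    fn get_mut(&mut self) -> (&mut HashMap<u64, u64>, &mut Self::NextType);
    /// Hypothetical hook: grow the list by `len` zeroed slots if the type needs it.
    fn extend_zero(&mut self, len: usize);
    /// Dynamic downcasting, as mentioned in the description.
    fn as_any_mut(&mut self) -> &mut dyn Any;
}

/// Fixed-size variant: the list is allocated once, so resizing is a no-op.
struct JoinHashMap {
    map: HashMap<u64, u64>,
    next: Vec<u64>,
}

impl JoinHashMap {
    fn with_capacity(rows: usize) -> Self {
        Self { map: HashMap::with_capacity(rows), next: vec![0; rows] }
    }
}

impl JoinHashMapType for JoinHashMap {
    type NextType = Vec<u64>;
    fn get_mut(&mut self) -> (&mut HashMap<u64, u64>, &mut Vec<u64>) {
        (&mut self.map, &mut self.next)
    }
    fn extend_zero(&mut self, _len: usize) {}
    fn as_any_mut(&mut self) -> &mut dyn Any {
        self
    }
}

/// Growable variant: the list is a deque that is extended for every batch.
#[derive(Default)]
struct PruningJoinHashMap {
    map: HashMap<u64, u64>,
    next: VecDeque<u64>,
}

impl JoinHashMapType for PruningJoinHashMap {
    type NextType = VecDeque<u64>;
    fn get_mut(&mut self) -> (&mut HashMap<u64, u64>, &mut VecDeque<u64>) {
        (&mut self.map, &mut self.next)
    }
    fn extend_zero(&mut self, len: usize) {
        let new_len = self.next.len() + len;
        self.next.resize(new_len, 0);
    }
    fn as_any_mut(&mut self) -> &mut dyn Any {
        self
    }
}

/// Insert one batch of row hashes; `offset` is the number of rows already stored.
fn update_hash<T: JoinHashMapType>(hash_map: &mut T, hashes: &[u64], offset: usize) {
    // Only the growable variant actually resizes here.
    hash_map.extend_zero(hashes.len());
    let (map, next) = hash_map.get_mut();
    for (row, &hash) in hashes.iter().enumerate() {
        let position = row + offset;
        // The map keeps the newest row; the previous head moves into the chain.
        if let Some(previous) = map.insert(hash, (position + 1) as u64) {
            next[position] = previous;
        }
    }
}

/// Walk the chain for `hash`, newest row first (hypothetical helper).
fn matching_rows<T: JoinHashMapType>(hash_map: &mut T, hash: u64) -> Vec<usize> {
    let (map, next) = hash_map.get_mut();
    let mut rows = Vec::new();
    let mut cursor = map.get(&hash).copied().unwrap_or(0);
    while cursor != 0 {
        let row = (cursor - 1) as usize;
        rows.push(row);
        cursor = next[row];
    }
    rows
}
```

Because `update_hash` is generic over the trait, the same insertion loop serves both map types, and the only type-specific behavior is whether `extend_zero` grows the list.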
