metesynnada opened a new pull request, #7366:
URL: https://github.com/apache/arrow-datafusion/pull/7366

   ## Which issue does this PR close?
   
   Continuation of https://github.com/apache/arrow-datafusion/pull/6679.
   
   ## Rationale for this change
   
   The current implementation of the `JoinHashMap` and `SymmetricJoinHashMap` 
types could benefit from being more generic and flexible. Specifically, the 
ability to support different types of list data structures for chaining, as 
well as handling resizing in a more idiomatic and efficient manner, would be 
advantageous. This PR introduces the `JoinHashMapType` trait and implements it 
for both `JoinHashMap` and `PruningJoinHashMap`, which allows for more code 
reuse and a clearer separation of concerns. 
   
   This PR also removes several unused hash join utilities. In addition, the 
refactoring lets us introduce a vectorized implementation of 
`SymmetricHashJoin` that includes hash collision checks.
   
   ## What changes are included in this PR?
   
   - Added a `JoinHashMapType` trait with methods for handling the mutable map 
and mutable list, as well as a method `as_any_mut` for dynamic downcasting.
   - Implemented the `JoinHashMapType` trait for both `JoinHashMap` and 
`PruningJoinHashMap`.
   - Updated the `update_hash` function to use the `JoinHashMapType` trait and 
only resize the list in the case of `PruningJoinHashMap`.
   - Updated the `build_equal_condition_join_indices` function to use the 
`JoinHashMapType` trait and introduced an `offset` parameter.
   - Added inline comments and docstrings for better readability and 
documentation of the code.
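
To illustrate the idea behind the list above, here is a minimal, self-contained sketch of a trait that abstracts over the chaining list. It is not the code from this PR: the names `ChainedJoinMap`, `FixedMap`, `GrowableMap`, `insert_rows` and `matching_rows` are hypothetical, a `std::collections::HashMap` stands in for the real hash table, and hash collision checks and `as_any_mut` are left out. It only shows how a generic update function can resize the list for one implementation and leave it alone for the other.

```rust
use std::collections::{HashMap, VecDeque};
use std::ops::IndexMut;

/// Hypothetical, simplified stand-in for a join hash map trait.
/// `heads` maps a hash value to (row index + 1) of the most recent row;
/// the list stores, per row, (previous row index + 1), with 0 ending a chain.
trait ChainedJoinMap {
    /// List type used for chaining; anything indexable by `usize`.
    type NextType: IndexMut<usize, Output = u64>;

    /// Grow the list by `len` zeroed slots (a no-op for fixed-size maps).
    fn extend_zero(&mut self, len: usize);
    /// Mutable access to both the map and the list at once.
    fn get_mut(&mut self) -> (&mut HashMap<u64, u64>, &mut Self::NextType);
    fn get_map(&self) -> &HashMap<u64, u64>;
    fn get_list(&self) -> &Self::NextType;
}

/// Build side is known up front, so the list is allocated once.
struct FixedMap {
    heads: HashMap<u64, u64>,
    next: Vec<u64>,
}

impl FixedMap {
    fn with_capacity(rows: usize) -> Self {
        Self { heads: HashMap::new(), next: vec![0; rows] }
    }
}

impl ChainedJoinMap for FixedMap {
    type NextType = Vec<u64>;
    fn extend_zero(&mut self, _len: usize) {} // already sized
    fn get_mut(&mut self) -> (&mut HashMap<u64, u64>, &mut Vec<u64>) {
        (&mut self.heads, &mut self.next)
    }
    fn get_map(&self) -> &HashMap<u64, u64> { &self.heads }
    fn get_list(&self) -> &Vec<u64> { &self.next }
}

/// Streaming side: rows keep arriving, so the list must grow per batch.
#[derive(Default)]
struct GrowableMap {
    heads: HashMap<u64, u64>,
    next: VecDeque<u64>,
}

impl ChainedJoinMap for GrowableMap {
    type NextType = VecDeque<u64>;
    fn extend_zero(&mut self, len: usize) {
        let new_len = self.next.len() + len;
        self.next.resize(new_len, 0);
    }
    fn get_mut(&mut self) -> (&mut HashMap<u64, u64>, &mut VecDeque<u64>) {
        (&mut self.heads, &mut self.next)
    }
    fn get_map(&self) -> &HashMap<u64, u64> { &self.heads }
    fn get_list(&self) -> &VecDeque<u64> { &self.next }
}

/// Insert one batch of row hashes; `offset` is the index of the first row
/// of this batch within the whole build side.
fn insert_rows<T: ChainedJoinMap>(map: &mut T, hashes: &[u64], offset: usize) {
    map.extend_zero(hashes.len());
    let (heads, next) = map.get_mut();
    for (i, &hash) in hashes.iter().enumerate() {
        let row = i + offset;
        // Store row + 1 so that 0 can mean "end of chain".
        let previous = heads.insert(hash, (row + 1) as u64).unwrap_or(0);
        next[row] = previous;
    }
}

/// Walk the chain for `hash`, newest row first.
fn matching_rows<T: ChainedJoinMap>(map: &T, hash: u64) -> Vec<usize> {
    let mut rows = Vec::new();
    let mut current = map.get_map().get(&hash).copied().unwrap_or(0);
    while current != 0 {
        let row = (current - 1) as usize;
        rows.push(row);
        current = map.get_list()[row];
    }
    rows
}
```

With this shape, `insert_rows` is written once and works for both map types. A real implementation would additionally compare the join key values of each candidate row, since equal hashes do not guarantee equal keys.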
   
   ## Are these changes tested?
   
   Yes, the changes are covered by the existing tests. No new tests were 
required as the new implementation preserves the existing functionality. All 
tests passed successfully after the changes were applied.
   
   ## Are there any user-facing changes?
   
   No, the changes made in this PR are internal and do not affect the public 
API or the functionality of the crate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
