korowa opened a new issue, #8130:
URL: https://github.com/apache/arrow-datafusion/issues/8130

   ### Is your feature request related to a problem or challenge?
   
   The problem related to #8020
   
   At this moment, 
[collect_left_input](https://github.com/apache/arrow-datafusion/blob/91c9d6f847eda0b5b1d01257b5c24459651d3926/datafusion/physical-plan/src/joins/hash_join.rs#L612)
 function in hash join implementation produces `JoinHasMap` which is optimized 
for reverse iteration -- for each `hash_value` the actual HashMap stores the 
index of the last row from the build-side, and remaining indices for same has 
are stored in `JoinHashMap.next` vector.
   
   To keep maintaining inputs order in join output, `HashJoinStream` iterates 
over probe-batch in 
[reverse](https://github.com/apache/arrow-datafusion/blob/91c9d6f847eda0b5b1d01257b5c24459651d3926/datafusion/physical-plan/src/joins/hash_join.rs#L906)
 order, and after processing the whole batch, it, again, reverts order of 
matched 
[build](https://github.com/apache/arrow-datafusion/blob/91c9d6f847eda0b5b1d01257b5c24459651d3926/datafusion/physical-plan/src/joins/hash_join.rs#L940)
 and 
[probe](https://github.com/apache/arrow-datafusion/blob/91c9d6f847eda0b5b1d01257b5c24459651d3926/datafusion/physical-plan/src/joins/hash_join.rs#L941)
 indices.
   
   The problem is that reverse iteration doesn't allow `HashJoinStream` to 
produce partial output batch (without having all matched indices for current 
probe batch) -- in this case join output order will be distorted.
   
   ### Describe the solution you'd like
   
   Desired behaviour of `HashJoinStream` -- preserving natural order of probe + 
build side record, without intermediary inversion of indices -- to be ready to 
produce output batch in any moment.
   
   This could be achieved by modifying `update_hash` to save "head" of hash 
chains in Map. It may also require to still track chain tails while building 
`JoinHashMap` in order to keep performance at the same level --  tails might be 
stored in `JoinHashMap` (probably not the best decision in terms of memory 
utilization), or in a separate data structure which will only exist during 
build-side collection and won't be as memory-greedy as one more integer in 
HashMap tuples. 
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to