korowa opened a new issue, #8130: URL: https://github.com/apache/arrow-datafusion/issues/8130
### Is your feature request related to a problem or challenge? The problem related to #8020 At this moment, [collect_left_input](https://github.com/apache/arrow-datafusion/blob/91c9d6f847eda0b5b1d01257b5c24459651d3926/datafusion/physical-plan/src/joins/hash_join.rs#L612) function in hash join implementation produces `JoinHasMap` which is optimized for reverse iteration -- for each `hash_value` the actual HashMap stores the index of the last row from the build-side, and remaining indices for same has are stored in `JoinHashMap.next` vector. To keep maintaining inputs order in join output, `HashJoinStream` iterates over probe-batch in [reverse](https://github.com/apache/arrow-datafusion/blob/91c9d6f847eda0b5b1d01257b5c24459651d3926/datafusion/physical-plan/src/joins/hash_join.rs#L906) order, and after processing the whole batch, it, again, reverts order of matched [build](https://github.com/apache/arrow-datafusion/blob/91c9d6f847eda0b5b1d01257b5c24459651d3926/datafusion/physical-plan/src/joins/hash_join.rs#L940) and [probe](https://github.com/apache/arrow-datafusion/blob/91c9d6f847eda0b5b1d01257b5c24459651d3926/datafusion/physical-plan/src/joins/hash_join.rs#L941) indices. The problem is that reverse iteration doesn't allow `HashJoinStream` to produce partial output batch (without having all matched indices for current probe batch) -- in this case join output order will be distorted. ### Describe the solution you'd like Desired behaviour of `HashJoinStream` -- preserving natural order of probe + build side record, without intermediary inversion of indices -- to be ready to produce output batch in any moment. This could be achieved by modifying `update_hash` to save "head" of hash chains in Map. It may also require to still track chain tails while building `JoinHashMap` in order to keep performance at the same level -- tails might be stored in `JoinHashMap` (probably not the best decision in terms of memory utilization), or in a separate data structure which will only exist during build-side collection and won't be as memory-greedy as one more integer in HashMap tuples. ### Describe alternatives you've considered _No response_ ### Additional context _No response_ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
