Phoenix500526 opened a new pull request, #23023:
URL: https://github.com/apache/datafusion/pull/23023
## Which issue does this PR close?
- Closes #23008.
## Rationale for this change
`HashJoinStream::state_after_build_ready` distinguishes two facts about the
collected build side: whether it physically has rows, and whether its hash map
has any matchable entries. These differ under `NullEquality::NullEqualsNothing`,
where build rows with a NULL join key are omitted from the map. That
distinction was the source of the bug fixed in #22893.
Today that distinction is expressed inline as raw `batch().num_rows() == 0` /
`map().is_empty()` checks. Naming it at the API surface makes the invariant
explicit and harder to misuse at future call sites.
## What changes are included in this PR?
- Add two helper methods on `JoinLeftData`:
  - `has_build_rows()`: whether the collected build side physically has any rows
  - `has_matchable_build_rows()`: whether the hash map has any matchable entries
- Update `state_after_build_ready` to use them instead of the raw
  `batch().num_rows()` / `map().is_empty()` checks.
No behavior change.
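
To illustrate why the two helpers can disagree, here is a minimal, self-contained sketch. It does not use the real DataFusion types: `BuildSide` and its fields are hypothetical stand-ins for `JoinLeftData`, its batch, and its hash map, and the constructor only imitates the `NullEqualsNothing` behavior described above (NULL-key rows stay in the batch but are left out of the map).

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the collected build side.
/// The real `JoinLeftData` holds a record batch and a join hash map.
struct BuildSide {
    /// Join key of each build row; `None` models a NULL join key.
    keys: Vec<Option<i64>>,
    /// Key -> build row indices; contains only rows that can match.
    map: HashMap<i64, Vec<usize>>,
}

impl BuildSide {
    /// Collects the build side, imitating `NullEqualsNothing`:
    /// rows with a NULL key are kept in `keys` but omitted from `map`.
    fn new(keys: Vec<Option<i64>>) -> Self {
        let mut map: HashMap<i64, Vec<usize>> = HashMap::new();
        for (row, key) in keys.iter().enumerate() {
            if let Some(k) = key {
                map.entry(*k).or_default().push(row);
            }
        }
        Self { keys, map }
    }

    /// True if the build side physically has at least one row.
    fn has_build_rows(&self) -> bool {
        !self.keys.is_empty()
    }

    /// True if at least one build row can be matched through the map.
    fn has_matchable_build_rows(&self) -> bool {
        !self.map.is_empty()
    }
}
```

In this sketch, a build side made up only of NULL-key rows reports `has_build_rows() == true` and `has_matchable_build_rows() == false`, which is the case the two names are meant to keep apart.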
## Are these changes tested?
Covered by the existing hash join tests
(`cargo test -p datafusion-physical-plan hash_join`, 804 passing). This is a
readability-only refactor with no behavior change, so no new tests are added.
## Are there any user-facing changes?
No.