mrhhsg opened a new pull request, #64108:
URL: https://github.com/apache/doris/pull/64108
### What problem does this PR solve?
Issue Number: None
Related PR: #63767
Problem Summary:
This is a branch-3.1 cherry-pick of #63767.
Correlated NOT IN subqueries under a disjunction can be rewritten to a mark
null-aware left anti join with additional join conjuncts. On branch-3.1, when
the probe join key is NULL, the hash table lookup advanced the probe index
before the caller could run the null-probe handling path. As a result, the
probe row could be skipped before the outer disjunction evaluated its mark
column, so the query returned incomplete results.
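To illustrate why a NULL probe row still has to reach the outer disjunction, here is a minimal sketch of SQL three-valued logic for `NOT IN` combined with `OR`. It is not Doris code: `Tri`, `not_in_mark`, `tri_or`, and `where_keeps` are hypothetical names, and the build side is modelled as a plain vector rather than a correlated subquery with extra join conjuncts.

```cpp
#include <optional>
#include <vector>

// Three-valued boolean: std::nullopt stands for SQL NULL (unknown).
using Tri = std::optional<bool>;

// Mark value of `probe NOT IN (build)` under SQL semantics (illustrative only).
Tri not_in_mark(const std::optional<int>& probe,
                const std::vector<std::optional<int>>& build) {
    if (build.empty()) return true;               // NOT IN over an empty set is TRUE
    if (!probe.has_value()) return std::nullopt;  // NULL key against a non-empty set is unknown
    bool saw_null = false;
    for (const auto& b : build) {
        if (!b.has_value()) {
            saw_null = true;                      // a NULL build value may hide a match
            continue;
        }
        if (*b == *probe) return false;           // definite match, so NOT IN is FALSE
    }
    if (saw_null) return std::nullopt;
    return true;
}

// SQL OR: TRUE if either side is TRUE, FALSE if both are FALSE, otherwise NULL.
Tri tri_or(Tri a, Tri b) {
    if ((a && *a) || (b && *b)) return true;
    if (a && b) return false;
    return std::nullopt;
}

// A WHERE clause keeps a row only when the predicate is TRUE.
bool where_keeps(Tri predicate) {
    return predicate.has_value() && *predicate;
}
```

In this model a NULL probe key yields a NULL mark, which on its own filters the row out, but `tri_or(mark, true)` is TRUE. A row that is dropped before its mark value is produced can therefore be lost even though the other branch of the disjunction would have kept it.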
This change keeps the probe index on the NULL row so the null-aware join
path can emit the correct mark value. The branch-3.1 implementation encodes the
NULL probe key as `build_idx_map[probe_idx] == bucket_size`, so the cherry-pick
was adapted to preserve that probe row instead of advancing `probe_idx`.
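The effect of keeping the probe index on the NULL row can be sketched with a toy model. This is an assumption-laden illustration, not the code in `join_hash_table.h`: only `build_idx_map`, `probe_idx`, and `bucket_size` come from the description above, while `lookup`, `null_rows_handled`, and the `advance_past_null` switch are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Toy lookup: consumes ordinary probe rows starting at probe_idx and returns
// when it meets a NULL-key row or runs out of rows.
// build_idx_map[i] == bucket_size encodes "the probe key of row i is NULL".
void lookup(const std::vector<uint32_t>& build_idx_map, uint32_t bucket_size,
            uint32_t& probe_idx, bool advance_past_null) {
    const auto rows = static_cast<uint32_t>(build_idx_map.size());
    while (probe_idx < rows) {
        if (build_idx_map[probe_idx] == bucket_size) {
            // Modelled pre-fix behaviour: step over the NULL row before returning.
            // Modelled fixed behaviour: leave probe_idx on the NULL row.
            if (advance_past_null) ++probe_idx;
            return;
        }
        ++probe_idx;  // ordinary row: bucket matching is elided in this sketch
    }
}

// Toy caller: after each lookup, runs the null-probe path if probe_idx points
// at a NULL-key row, and records which rows went through that path.
std::vector<uint32_t> null_rows_handled(const std::vector<uint32_t>& build_idx_map,
                                        uint32_t bucket_size, bool advance_past_null) {
    std::vector<uint32_t> handled;
    const auto rows = static_cast<uint32_t>(build_idx_map.size());
    uint32_t probe_idx = 0;
    while (probe_idx < rows) {
        lookup(build_idx_map, bucket_size, probe_idx, advance_past_null);
        if (probe_idx < rows && build_idx_map[probe_idx] == bucket_size) {
            handled.push_back(probe_idx);  // the mark value would be emitted here
            ++probe_idx;                   // the caller advances once the row is handled
        }
    }
    return handled;
}
```

With `build_idx_map = {0, 4, 1, 4}` and `bucket_size = 4`, the fixed variant (`advance_past_null = false`) sends rows 1 and 3 through the null-probe path, while the pre-fix variant handles neither, which mirrors the skipped probe rows described above.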
### Release note
Fix incorrect results for correlated NOT IN subqueries combined with
disjunctions.
### Check List (For Author)
- Test:
    - [x] Regression test
        - Added/updated `correctness/test_subquery_in_disjunction` cases and expected output from #63767.
    - [x] Manual test (add detailed scripts or steps below)
        - Ran `./build-support/clang-format.sh be/src/vec/common/hash_table/join_hash_table.h` (passed)
        - Ran `./build-support/check-format.sh` (passed)
        - Ran `git diff --check HEAD~1..HEAD` (passed)
        - Attempted `DORIS_HOME=$PWD ninja -C be/ut_build_ASAN src/exec/CMakeFiles/Exec.dir/operator/join/null_aware_left_anti_join_impl.cpp.o src/exec/CMakeFiles/Exec.dir/operator/hashjoin_probe_operator.cpp.o src/exec/CMakeFiles/Exec.dir/operator/hashjoin_build_sink.cpp.o`, but local CMake regeneration failed because the current local thirdparty/CMake environment cannot resolve the existing target `absl::random_internal_pool_urbg`.
        - Attempted `./build.sh --be`, but it was blocked by the same pre-existing local CMake/thirdparty target issue: `Target "doris_be" links to absl::random_internal_pool_urbg but the target was not found`.
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason
- Behavior changed:
    - [x] Yes. Corrects query result semantics for affected null-aware mark anti joins.
    - [ ] No.
- Does this need documentation?
    - [x] No.
    - [ ] Yes.
### Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]