mrhhsg opened a new pull request, #64108:
URL: https://github.com/apache/doris/pull/64108

   ### What problem does this PR solve?
   
   Issue Number: None
   
   Related PR: #63767
   
   Problem Summary:
   
   This is a branch-3.1 cherry-pick of #63767.
   
   Correlated NOT IN subqueries under a disjunction can be rewritten to a mark 
null-aware left anti join with additional join conjuncts. On branch-3.1, when 
the probe join key is NULL, the hash table lookup advanced the probe index 
before the caller could run the null-probe handling path. As a result, the 
probe row could be skipped before its mark column was evaluated by the outer 
disjunction, so the query returned incomplete results.
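
   To illustrate why a skipped NULL probe row loses results, here is a minimal 
sketch of SQL three-valued logic for `NOT IN` feeding an outer `OR`. It is not 
Doris code: `Tri`, `not_in`, `or3`, and `where_keeps_row` are hypothetical 
names, and integer keys are assumed for simplicity.

```cpp
#include <optional>
#include <vector>

// std::nullopt models SQL NULL; this is an illustrative model, not Doris code.
using Tri = std::optional<bool>;

// Three-valued result of `x NOT IN (s)`, i.e. the value a mark column carries.
Tri not_in(std::optional<int> x, const std::vector<std::optional<int>>& s) {
    if (s.empty()) return true;        // empty subquery result: always TRUE
    if (!x) return std::nullopt;       // NULL probe key, non-empty set: NULL
    bool saw_null = false;
    for (const auto& v : s) {
        if (!v) { saw_null = true; continue; }
        if (*v == *x) return false;    // a match makes NOT IN FALSE
    }
    if (saw_null) return std::nullopt; // no match, but a NULL in the set: NULL
    return true;
}

// Three-valued OR, as used by the outer disjunction.
Tri or3(Tri a, Tri b) {
    if ((a && *a) || (b && *b)) return true;
    if (!a || !b) return std::nullopt;
    return false;
}

// A WHERE clause keeps a row only when the predicate is TRUE (not NULL).
bool where_keeps_row(Tri pred) { return pred.has_value() && *pred; }
```

   In this model a NULL probe key yields a NULL mark, yet 
`where_keeps_row(or3(mark, true))` still holds, so the row must reach the outer 
disjunction instead of being dropped during the probe.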
   
   This change keeps the probe index on the NULL row so the null-aware join 
path can emit the correct mark value. The branch-3.1 implementation encodes the 
NULL probe key as `build_idx_map[probe_idx] == bucket_size`, so the cherry-pick 
was adapted to preserve that probe row instead of advancing `probe_idx`.
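
   The index handling described above can be sketched as follows. This is a 
simplified model under stated assumptions, not the code in 
`join_hash_table.h`: `ProbeResult`, `probe_until_null`, and the 
`keep_index_on_null` switch are hypothetical and exist only to contrast the old 
and new behavior. The one detail taken from this PR is that 
`build_idx_map[probe_idx] == bucket_size` marks a NULL probe key.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical result of one lookup pass over the probe rows.
struct ProbeResult {
    uint32_t probe_idx = 0;             // row the caller resumes from
    bool stopped_on_null = false;       // true if a NULL-key row ended the pass
    std::vector<uint32_t> looked_up;    // non-NULL rows handled by the lookup
};

// Walks probe rows from `probe_idx` until a NULL-key row or the end of input.
// keep_index_on_null == false models the old behavior (index moves past the
// NULL row); true models the fix (index stays on the NULL row).
ProbeResult probe_until_null(const std::vector<uint32_t>& build_idx_map,
                             uint32_t bucket_size, uint32_t probe_idx,
                             bool keep_index_on_null) {
    ProbeResult res;
    res.probe_idx = probe_idx;
    const auto probe_rows = static_cast<uint32_t>(build_idx_map.size());
    while (res.probe_idx < probe_rows) {
        if (build_idx_map[res.probe_idx] == bucket_size) {
            res.stopped_on_null = true;
            if (!keep_index_on_null) {
                ++res.probe_idx;        // old behavior: the NULL row is skipped
            }
            return res;                 // caller runs the null-probe path
        }
        res.looked_up.push_back(res.probe_idx);
        ++res.probe_idx;
    }
    return res;
}
```

   With `build_idx_map = {1, 4, 2}` and `bucket_size = 4`, the fixed variant 
returns with `probe_idx == 1`, so the caller still sees the NULL row and can 
emit its mark value. The old variant returns `probe_idx == 2`, and row 1 never 
reaches the null-probe path.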
   
   ### Release note
   
   Fix incorrect results for correlated NOT IN subqueries combined with 
disjunctions.
   
   ### Check List (For Author)
   
   - Test:
       - [x] Regression test
           - Added/updated `correctness/test_subquery_in_disjunction` cases and 
expected output from #63767.
       - [x] Manual test (add detailed scripts or steps below)
           - Ran `./build-support/clang-format.sh 
be/src/vec/common/hash_table/join_hash_table.h` (passed)
           - Ran `./build-support/check-format.sh` (passed)
           - Ran `git diff --check HEAD~1..HEAD` (passed)
           - Attempted `DORIS_HOME=$PWD ninja -C be/ut_build_ASAN 
src/exec/CMakeFiles/Exec.dir/operator/join/null_aware_left_anti_join_impl.cpp.o 
src/exec/CMakeFiles/Exec.dir/operator/hashjoin_probe_operator.cpp.o 
src/exec/CMakeFiles/Exec.dir/operator/hashjoin_build_sink.cpp.o`, but local 
CMake regeneration failed because the current local thirdparty/CMake 
environment cannot resolve the existing target 
`absl::random_internal_pool_urbg`.
           - Attempted `./build.sh --be`, but it was blocked by the same 
pre-existing local CMake/thirdparty target issue: `Target "doris_be" links to 
absl::random_internal_pool_urbg but the target was not found`.
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason
   
   - Behavior changed:
       - [x] Yes. Corrects query result semantics for affected null-aware mark 
anti joins.
       - [ ] No.
   
   - Does this need documentation?
       - [x] No.
       - [ ] Yes.
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

