crm26 commented on PR #21363:
URL: https://github.com/apache/datafusion/pull/21363#issuecomment-4525879360

   Thanks for the review and the conflict-resolution ask. The rebase went 
smoothly except for one file: 
`datafusion/optimizer/src/decorrelate_predicate_subquery.rs`. Main has 
refactored the Filter handling — flattening it out of the `match plan` and 
adding the `original_schema` re-project from #21265. Our PR adds a second arm 
to that same `match` for `LogicalPlan::Projection`, so I needed the `match` 
structure back. I restored it and moved the `original_schema` re-project inside 
the Filter arm, so we still inherit #21265's fix for that path.
   
   With that done, the test suite is mostly green, but one SLT case now fails: 
the two-InSubquery-in-one-SELECT example at `predicates.slt:846`:
   
   ```sql
   WITH empty AS (SELECT 10 WHERE false)
   SELECT
       NULL IN (SELECT * FROM empty),
       NULL NOT IN (SELECT * FROM empty)
   FROM (SELECT 1) t;
   ```
   
   The failure mode is the same one #21265 addressed for the Filter path: each 
`rewrite_inner_subqueries` call builds a join producing a column named `mark`, 
and two of those in a single Projection collide. `optimize_projections` rejects 
the plan with `Schema contains duplicate unqualified field name mark`.
   
   When I last validated this PR in April, that case passed. It no longer does, 
because the duplicate-name check in `optimize_projections` was tightened in May 
by #22389 and #21726. Our PR has not changed; the optimizer contract has.
   
   The fix from #21265 — re-projecting the Filter output back to the original 
schema width to strip the mark columns — does not transfer cleanly to our 
Projection arm. In our case the final Projection's expressions reference those 
mark columns directly to form the rewritten predicates, so stripping the 
columns would invalidate the expressions. Disambiguating the marks (aliasing 
each one uniquely as `rewrite_inner_subqueries` produces it, or rewriting them 
at the Projection-arm boundary before we build the final plan) seems to be the 
right shape of fix.
   
   My question for you: should that disambiguation land in this PR, or would 
you rather merge what's here once I've reverted `predicates.slt:846` to its 
prior `query error` form, and handle the multi-InSubquery case as a follow-up 
that mirrors what #21265 did for Filter?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to