bvolpato opened a new pull request, #22630:
URL: https://github.com/apache/datafusion/pull/22630

   ## Which issue does this PR close?
   
   - Closes #22629.
   
   ## Rationale for this change
   
   The Substrait `ProjectRel` consumer only added a `WindowAggr` relation when 
the root projected expression was a `WindowFunction`. A scalar expression 
wrapping a valid window function therefore left the window directly inside 
`Projection`, which cannot be physically planned.
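   
   The gap can be illustrated with a toy expression tree. This is a simplified sketch, not DataFusion's real `Expr` type: the `Expr` enum and the `is_root_window` helper are hypothetical names used only for illustration. It shows why a check on the root expression alone misses a window function wrapped in arithmetic.
   
   ```rust
   // Toy model of an expression tree (assumption: NOT DataFusion's `Expr`).
   #[allow(dead_code)]
   #[derive(Debug)]
   enum Expr {
       Literal(i64),
       Window(String),
       Add(Box<Expr>, Box<Expr>),
   }
   
   // Root-only check, mirroring the pre-patch behaviour described above:
   // only the outermost node is inspected, children are never visited.
   fn is_root_window(expr: &Expr) -> bool {
       matches!(expr, Expr::Window(_))
   }
   
   fn main() {
       // `count(*) OVER ()` projected on its own: detected.
       let bare = Expr::Window("count(*) OVER ()".to_string());
       // `1 + count(*) OVER ()`: the root is `Add`, so the window is missed.
       let nested = Expr::Add(
           Box::new(Expr::Literal(1)),
           Box::new(Expr::Window("count(*) OVER ()".to_string())),
       );
       println!("bare detected:   {}", is_root_window(&bare));
       println!("nested detected: {}", is_root_window(&nested));
   }
   ```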
   
   Minimal reproducer represented by the added Substrait fixture:
   
   ```sql
   SELECT 1 + count(*) OVER () FROM DATA;
   ```
   
   Before this patch, the regression test produced this plan difference (`-` lines are the expected plan, `+` lines are what the consumer actually produced):
   
   ```diff
    Projection: Int64(1) + count(Int64(1)) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING AS EXPR$0
   -  WindowAggr: windowExpr=[[count(Int64(1)) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING]]
   -    TableScan: DATA
   +  TableScan: DATA
   ```
   
   This is the same class of failure that reaches physical planning as an 
unsupported nested `WindowFunction` expression.
   
   ## What changes are included in this PR?
   
   - Use the existing `find_window_exprs(...)` recursion when consuming Substrait `ProjectRel` expressions, retaining the current `HashSet` deduplication across projections.
   - Add a minimal Substrait JSON fixture with a window function nested inside an arithmetic scalar expression.
   - Add a logical-plan snapshot plus an execution regression test, proving that `WindowAggr` is inserted and that the plan is physically executable.
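   
   The idea behind the first bullet can be sketched as a recursive walk with deduplication. This is a minimal toy model under stated assumptions, not the code in this PR: `Expr`, `collect_windows`, and `windows_for_projection` are hypothetical names, and DataFusion's actual `find_window_exprs` operates on its own `Expr` type.
   
   ```rust
   use std::collections::HashSet;
   
   // Toy expression tree (assumption: NOT DataFusion's `Expr`).
   #[derive(Debug, Clone, PartialEq, Eq, Hash)]
   enum Expr {
       Literal(i64),
       Window(String),
       Add(Box<Expr>, Box<Expr>),
   }
   
   // Walk the whole tree and record every window sub-expression once,
   // in first-seen order. `seen` provides the HashSet deduplication.
   fn collect_windows(expr: &Expr, seen: &mut HashSet<Expr>, out: &mut Vec<Expr>) {
       match expr {
           Expr::Window(_) => {
               if seen.insert(expr.clone()) {
                   out.push(expr.clone());
               }
           }
           Expr::Add(left, right) => {
               collect_windows(left, seen, out);
               collect_windows(right, seen, out);
           }
           Expr::Literal(_) => {}
       }
   }
   
   // Gather the window expressions needed by all projected expressions,
   // sharing one `seen` set so duplicates across projections collapse.
   fn windows_for_projection(exprs: &[Expr]) -> Vec<Expr> {
       let mut seen = HashSet::new();
       let mut out = Vec::new();
       for expr in exprs {
           collect_windows(expr, &mut seen, &mut out);
       }
       out
   }
   
   fn main() {
       let window = Expr::Window("count(*) OVER ()".to_string());
       // Models `SELECT 1 + count(*) OVER (), count(*) OVER ()`.
       let projection = vec![
           Expr::Add(Box::new(Expr::Literal(1)), Box::new(window.clone())),
           window.clone(),
       ];
       let windows = windows_for_projection(&projection);
       println!("{} window expression(s) to plan: {:?}", windows.len(), windows);
   }
   ```
   
   In this sketch the nested and the bare occurrence of the same window collapse to a single entry, which is the property a single `WindowAggr` input under the `Projection` relies on.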
   
   ## Are these changes tested?
   
   Pre-fix evidence:
   
   - `cargo test -p datafusion-substrait --test substrait_integration nested_window_function_in_expression -- --nocapture` failed because the consumed plan omitted the expected `WindowAggr` and put `Projection` directly above `TableScan`.
   
   Final validation on Apache `main` commit `d8c458828`:
   
   - `cargo fmt --all -- --check`
   - `cargo test -p datafusion-substrait` (49 unit passed, 200 integration 
passed, 3 doctests passed; 6 existing ignored)
   - `cargo check --all-targets -p datafusion-substrait`
   - `cargo check --no-default-features -p datafusion-substrait`
   - `cargo check --no-default-features -p datafusion-substrait --features=physical`
   - `cargo check --no-default-features -p datafusion-substrait --features=protoc`
   - `cargo clippy --all-targets --all-features -- -D warnings`
   - `./dev/rust_lint.sh`
   
   ## Are there any user-facing changes?
   
   Substrait producers may now send projected expressions that contain nested 
window functions; DataFusion consumes them into executable logical plans 
instead of leaving unsupported window expressions in projections.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

