geoffreyclaude opened a new issue, #17417:
URL: https://github.com/apache/datafusion/issues/17417

   ### Is your feature request related to a problem or challenge?
   
   - **Problem**: When serializing logical plans to proto, 
`Expr::WindowFunction` drops `null_treatment`, `distinct`, and `filter`. This 
prevents round-tripping plans that use SQL features like `IGNORE NULLS`, 
`DISTINCT` in window aggregates, or `FILTER (WHERE ...)`, breaking remote 
execution and plan persistence.
   - **Evidence**:
   ```308:322:datafusion/proto/src/logical_plan/to_proto.rs
          Expr::WindowFunction(window_fun) => {
              let expr::WindowFunction {
                  ref fun,
                  params:
                      expr::WindowFunctionParams {
                          ref args,
                          ref partition_by,
                          ref order_by,
                          ref window_frame,
                          // TODO: support null treatment in proto
                          null_treatment: _,
                          distinct: _,
                          filter: _,
                      },
              } = window_fun.as_ref();
   ```
   - **Impact**: Plans using queries such as:
     - `last_value(x) IGNORE NULLS OVER (...)`
     - `count(DISTINCT x) OVER (PARTITION BY ...)`
     - `sum(x) FILTER (WHERE y > 0) OVER (ORDER BY ...)`
     cannot be faithfully serialized/deserialized.
   
   ### Describe the solution you'd like
   
   - **Proto schema**: Extend the window function proto message to include:
     - **null_treatment**: enum with values like `UNSPECIFIED` (default), 
`RESPECT_NULLS`, `IGNORE_NULLS`.
     - **distinct**: boolean flag (default `false`).
     - **filter**: optional expression node representing the filter predicate.
   - **Conversions**:
     - Update `to_proto.rs` to populate these fields from 
`expr::WindowFunctionParams`.
     - Update `from_proto.rs` to reconstruct `WindowFunctionParams` from proto.
   - **Defaults & compatibility**:
     - Omit fields should preserve current behavior: 
`null_treatment=UNSPECIFIED` (treat as “respect nulls”), `distinct=false`, 
`filter=None`.
     - Use new field numbers to avoid wire compatibility issues; maintain 
backward compatibility with older binaries that ignore unknown fields.
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   - **Primary locations**:
     - `datafusion/proto/src/logical_plan/to_proto.rs` and corresponding 
`from_proto.rs`.
     - Proto definitions for window expressions (the message that carries 
window function params).
   - **Acceptance criteria**:
     - Round-trip tests for:
       - `IGNORE NULLS` and `RESPECT NULLS` window functions (e.g., 
`last_value`, `first_value`).
       - `DISTINCT` window aggregates (e.g., `count(distinct x)`).
       - `FILTER (WHERE ...)` on window aggregates.
     - Existing plans without these features continue to round-trip unchanged.
     - Wire-compat maintained: older consumers ignore the new fields; newer 
consumers default correctly when fields are absent.
     - Docs updated to state proto supports these window-function features.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to