viirya opened a new issue, #10785:
URL: https://github.com/apache/datafusion/issues/10785

   ### Is your feature request related to a problem or challenge?
   
   I'm trying to update Comet to the latest DataFusion: 
https://github.com/apache/datafusion-comet/pull/403. One issue I found concerns 
`AggregateUDF`.
   
   The implementation of some aggregate expressions, e.g., `FirstValue` and 
`LastValue`, has moved to `AggregateUDF`.
   
   To create an `AggregateFunctionExpr` for them, we need to call 
`create_aggregate_expr` and provide arguments such as `input_phy_exprs`. 
`input_phy_exprs` is used to determine the UDF's return type (i.e., 
`AggregateUDF.return_type`).
   
   To get the input physical expressions for each aggregate expression, we need 
to take a slice of all input expressions, namely the state fields of that 
aggregate expression. However, the current design of `AggregateUDF` doesn't 
provide its state fields on its own; it relies on `AggregateFunctionExpr` to call 
`AggregateUDF.state_fields` with `StateFieldsArgs`.
   
   So there is a circular dependency for anyone who wants to create an 
`AggregateFunctionExpr` for an `AggregateUDF`:
   
   1. Create an `AggregateUDF`
   2. Call `create_aggregate_expr` to create `AggregateFunctionExpr`, by 
providing `input_phy_exprs`
   3. In order to get `input_phy_exprs`, we need to know how to slice the input 
expressions, i.e., how many state fields the UDF has. That means calling 
`AggregateUDF.state_fields`, which requires `StateFieldsArgs` from 
`AggregateFunctionExpr`, so we need to create `AggregateFunctionExpr` first 
(step 2).
   
   I think this is an issue in the current design. `AggregateUDF` should 
determine its state fields by itself instead of relying on 
`AggregateFunctionExpr`, which is a wrapper around it.
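
   To illustrate the direction, here is a minimal, self-contained sketch. It does not use DataFusion's real types: `SimpleAggregateUdf`, `FixedStateUdf`, `num_state_fields`, and `slice_inputs` are hypothetical names made up for this example, and the state field counts are arbitrary. It only shows that if a UDF can report its own state fields, the caller can slice the input expressions before any wrapper expression exists.

```rust
use std::ops::Range;

/// Hypothetical stand-in for an aggregate UDF (NOT DataFusion's `AggregateUDF`).
trait SimpleAggregateUdf {
    fn name(&self) -> &str;
    /// Assumption of this sketch: the UDF knows its own number of state
    /// fields without needing arguments from a wrapper expression.
    fn num_state_fields(&self) -> usize;
}

/// Hypothetical UDF with a fixed number of state fields.
struct FixedStateUdf {
    name: &'static str,
    state_fields: usize,
}

impl SimpleAggregateUdf for FixedStateUdf {
    fn name(&self) -> &str {
        self.name
    }
    fn num_state_fields(&self) -> usize {
        self.state_fields
    }
}

/// Split a flat list of `num_inputs` input columns into one index range per
/// aggregate. Returns `None` if the state field counts do not add up to
/// exactly `num_inputs`.
fn slice_inputs(
    udfs: &[&dyn SimpleAggregateUdf],
    num_inputs: usize,
) -> Option<Vec<Range<usize>>> {
    let mut ranges = Vec::with_capacity(udfs.len());
    let mut start = 0;
    for udf in udfs {
        let end = start + udf.num_state_fields();
        if end > num_inputs {
            return None;
        }
        ranges.push(start..end);
        start = end;
    }
    if start == num_inputs {
        Some(ranges)
    } else {
        None
    }
}

fn main() {
    // The counts below are illustrative only.
    let first = FixedStateUdf { name: "first_value_like", state_fields: 2 };
    let count = FixedStateUdf { name: "count_like", state_fields: 1 };
    let udfs: [&dyn SimpleAggregateUdf; 2] = [&first, &count];

    let ranges = slice_inputs(&udfs, 3).expect("state columns should match");
    for (udf, range) in udfs.iter().zip(&ranges) {
        println!("{} -> input columns {:?}", udf.name(), range);
    }
}
```

   With something like this, step 3 above no longer depends on step 2: the slices are computed from the UDFs alone, and each slice can then be passed as the input expressions when creating the wrapper expression.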
   
   ### Describe the solution you'd like
   
   _No response_
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_

