timsaucer commented on PR #11550:
URL: https://github.com/apache/datafusion/pull/11550#issuecomment-2242694088

   Copying @jayzhan211 's comment from 
   
   > I had tried to directly modify Expr before, but ends up builder API. I 
couldn't remember the reason
   
   My *guess* is that it has to do with all of the other functions returning 
`Expr` rather than `Result<Expr>` and to fit neatly in with the rest of the 
code base, you wanted one point of checking? Aside from that for Aggregate I 
don't see a strong need to have the builder. Actually, window functions have a 
better argument for them with the fact that you need to (sometimes) build the 
window frame if it is not specified.
   
   The reasons I can see sticking with a builder function instead of updating 
the `Expr` directly:
   
   1. Minimal change to the API. It *should* only require updating the import 
since all of the signatures in `AggregateExt` are identical in 
`ExprFunctionExt` and there are only new methods added in. We could even keep 
`AggregateExt` and mark it deprecated for a release or two.
   2. Builder allows checking for errors at one point instead of needing to do 
`?` after every operation. Right now there are two types of errors in 
`AggregateExt`, calling one of the functions on an Expr that is not an 
aggregate function or setting and `order_by` expression that is not a `Sort`.
   3. As mentioned above, it actually does make setting up the window functions 
cleaner. With the window functions, if the user specifies a window frame we 
want to use that. If not, we want to create it based on a few criteria. This is 
easier with a builder function so we can perform this operation once. In my 
alternate approach, I update `WindowsFunction` to make the frame optional. Then 
at the point of consuming the window function data I perform a 
`get_frame_or_default()` that I added in. However these uses I don't expect to 
be something the end user experiences and is mostly upon the DataFusion code to 
maintain.
   
   The reasons I can see preferring to work directly with the `Expr`:
   
   1. I think this does create a better user experience in the long run. We can 
perform our checks along the way, which I think is more idiomatic. Plus it 
relieves the user of needing to understand that we're going from an expression 
into a builder of an expression.
   2. The code to do this is in most ways simpler. We can remove the trait 
entirely, and simply have functions that are implemented on `Expr`. The notable 
exception to the simplification is the window frame on window functions 
mentioned above.
   
   Laying it out like this, I can see a good argument for sticking with the 
builder function if that's the way you two (and others) would like to go.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to