westonpace commented on issue #34911:
URL: https://github.com/apache/arrow/issues/34911#issuecomment-1502372242

   We spoke about this a bit externally but to record the conversation:
   
   `first` and `last` can be written in two different ways.  The unary version 
`first(expr)` is a window aggregate.  It is an aggregate function whose result 
depends on the order of the data.  There is also a binary variant (often called 
`arg_min` and `arg_max`) which returns the value of column X from the row where 
column Y is smallest (or largest).  The advantage of the binary variant is that 
it doesn't depend on the order.  However, it is a little less flexible (e.g. 
there is no way to specify a custom sort function, not that we support that yet 
anyway :)).
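
   To make the difference concrete, here is a rough plain-Python sketch. The 
helpers `first` and `arg_min` below are hypothetical stand-ins written for 
illustration, not Arrow APIs, and the sample rows are made up.

```python
# Hypothetical helpers for illustration only; these are not Arrow functions.

def first(xs):
    """Unary variant: return the first value in arrival order."""
    return xs[0]

def arg_min(xs, ys):
    """Binary variant: return the x from the row whose y is smallest."""
    # Assumes the y values are unique; with ties the chosen row would
    # depend on arrival order again.
    best_x, _ = min(zip(xs, ys), key=lambda pair: pair[1])
    return best_x

def columns(rows):
    """Split a list of (x, y) rows into two column lists."""
    return [r[0] for r in rows], [r[1] for r in rows]

rows = [("a", 3), ("b", 1), ("c", 2)]
reordered = list(reversed(rows))  # same rows, different arrival order

xs, ys = columns(rows)
rxs, rys = columns(reordered)

first_original = first(xs)         # "a"
first_reordered = first(rxs)       # "c": the answer changed with the order
argmin_original = arg_min(xs, ys)  # "b"
argmin_reordered = arg_min(rxs, rys)  # "b": unaffected by the order
```

   The unary form only gives a stable answer if something upstream guarantees 
the row order, while the binary form carries its own ordering column.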
   
   From our conversation, my understanding is that you are interested in the 
unary window-aggregate version.  We don't have any window aggregates yet but I 
do think it would be a good idea to start adding some.  We have pretty much all 
the building blocks we need to support a "window aggregate node" (or an 
extension to the current aggregate nodes).  Furthermore, even if we don't build 
in proper multithreaded support for window functions, it should be possible to 
use window aggregates today as long as your plan is single-threaded.
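
   As a rough illustration of why a single-threaded plan is enough, the sketch 
below computes a grouped `first` by consuming batches strictly in order. This 
is plain Python, not the Arrow implementation; `grouped_first` and the batch 
layout (lists of `(key, value)` pairs) are assumptions made for the example.

```python
# Hypothetical sketch, not Arrow code: an order-dependent grouped "first".

def grouped_first(batches):
    """Return the first value seen for each key, scanning batches in order."""
    result = {}
    for batch in batches:  # one consumer, so arrival order is the data order
        for key, value in batch:
            # Keep only the earliest value for each key.
            result.setdefault(key, value)
    return result

batches = [
    [("k1", 10), ("k2", 20)],
    [("k1", 11), ("k3", 30)],
]

in_order = grouped_first(batches)
# If batches were delivered out of order (as could happen with several
# threads and no sequencing), the result for "k1" would differ.
out_of_order = grouped_first(list(reversed(batches)))
```

   With one thread the batches arrive in a well-defined sequence, so the 
per-key state stays correct without any extra synchronization or reordering.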


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

