westonpace commented on PR #33846: URL: https://github.com/apache/arrow/pull/33846#issuecomment-1402581119
A scalar function produces 1 output row per input row and is stateless (the answer is completely determined by that 1 input row).

An aggregate function produces only 1 output row and is stateful (the answer may depend on more than just a single input row).

A window function produces 1 output row per input row (similar to a scalar function) and is stateful (similar to an aggregate function).

The "window" you use to determine the output can be based on a number of things, but a very common window is "all rows up to this point, ordered by X".

E.g. "rank", in postgres, is expressed as: [`SELECT RANK () OVER ( ORDER BY c ) FROM my_table;`](http://sqlfiddle.com/#!17/261c7/1)

You can also compute the rank within the context of another column (e.g. month): `SELECT val, RANK () OVER ( PARTITION BY month ORDER BY val ) FROM my_table;`

Also note that all aggregate functions can also be used as window functions. In that case you simply broadcast the result of the aggregate to all rows in the window.
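To make the three categories concrete, here is a rough sketch in plain Python. None of this is Arrow or Acero API; the function names (`scalar_double`, `aggregate_sum`, `window_rank`, `window_sum_by_partition`) are made up for illustration, and the rank helper only assumes the usual SQL `RANK()` semantics (ties share a rank and leave a gap after them).

```python
from collections import defaultdict


def scalar_double(values):
    # Scalar: one output per input row; each output depends only on its own row.
    return [v * 2 for v in values]


def aggregate_sum(values):
    # Aggregate: a single output for the whole input.
    return sum(values)


def window_rank(values):
    # Window: one output per input row, but each output depends on other rows.
    # Roughly RANK() OVER (ORDER BY v): ties share the position of their first
    # occurrence in sorted order, so gaps appear after ties.
    first_pos = {}
    for pos, v in enumerate(sorted(values), start=1):
        first_pos.setdefault(v, pos)
    return [first_pos[v] for v in values]


def window_sum_by_partition(keys, values):
    # Aggregate used as a window function: compute the aggregate once per
    # partition, then broadcast that result to every row of the partition.
    totals = defaultdict(int)
    for k, v in zip(keys, values):
        totals[k] += v
    return [totals[k] for k in keys]


if __name__ == "__main__":
    vals = [30, 10, 20, 10]
    months = ["jan", "jan", "feb", "feb"]
    print(scalar_double(vals))
    print(aggregate_sum(vals))
    print(window_rank(vals))
    print(window_sum_by_partition(months, vals))
```

Note that the output length matches the input length for the scalar and window cases, while the aggregate collapses to a single value.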
