steven-aerts opened a new pull request, #43752: URL: https://github.com/apache/spark/pull/43752
In Spark you can define, implement, and use higher-order aggregate functions from the Scala API by implementing a case class that extends `TypedImperativeAggregate` and mixes in the `HigherOrderFunction` trait.

### What changes were proposed in this pull request?
With this commit you can also use them from Spark SQL, because the analyzer is now aware of their existence.

### Why are the changes needed?
To make the analyzer aware of higher-order aggregate functions, so that they can be used from Spark SQL.

### Does this PR introduce _any_ user-facing change?
No. It will, however, allow us to introduce higher-order aggregate functions to the Spark standard library.

### How was this patch tested?
This patch was tested with a custom higher-order function that we developed for internal use and that is not part of this PR. It has the following signature: `map_merge(value, merge_function(v1, v2))`. It allows you to merge maps and resolve conflicts. It is probably too specific to add to the list of standard Spark functions, and if it were added it would need to be extended a bit to be more generic. We considered introducing a custom unit test, but did not see an easy way to do so while keeping things simple.

### Was this patch authored or co-authored using generative AI tooling?
No

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
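The `map_merge(value, merge_function(v1, v2))` behaviour mentioned under "How was this patch tested?" can be illustrated with a minimal sketch. This is plain Python, not the Spark implementation, and it is not part of this PR. It assumes that the aggregate receives one map per input row and that `merge_function` is applied only when a key is already present in the accumulated result. The exact semantics of the internal function are not described in the PR, so both assumptions are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


def map_merge(
    maps: Iterable[Dict[K, V]],
    merge_function: Callable[[V, V], V],
) -> Dict[K, V]:
    """Merge maps into one, resolving key conflicts with merge_function.

    Hypothetical illustration of the aggregate's assumed semantics:
    `maps` stands in for the per-row `value` column of the aggregate.
    """
    merged: Dict[K, V] = {}
    for m in maps:
        for key, value in m.items():
            if key in merged:
                # Conflict: the key was seen before, so let the caller decide.
                merged[key] = merge_function(merged[key], value)
            else:
                merged[key] = value
    return merged


if __name__ == "__main__":
    rows = [{"a": 1, "b": 2}, {"b": 3, "c": 4}]
    # Resolve conflicts by summing the two values.
    print(map_merge(rows, lambda v1, v2: v1 + v2))
```

With the sample rows above, only the key `b` appears in both maps, so it is the only one passed through `merge_function`.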
