[PR] [SPARK-45881][SQL] support Higher Order aggregate functions from SQL [spark]

via GitHub Fri, 10 Nov 2023 01:45:40 -0800


steven-aerts opened a new pull request, #43752:
URL: https://github.com/apache/spark/pull/43752


   In Spark you can define, implement and use higher order aggregate functions 
from the Scala API by implementing a case class which extends 
TypedImperativeAggregate and mixes in the HigherOrderFunction trait.
   
   ### What changes were proposed in this pull request?
   With this commit you can also use them from Spark SQL, as the analyzer is 
now aware of their existence.
   
   ### Why are the changes needed?
   Without this change the analyzer is not aware of higher order aggregate 
functions, so they cannot be used from Spark SQL.
   
   ### Does this PR introduce _any_ user-facing change?
   This change does not introduce any user-facing changes.  It will, however, 
allow us to introduce higher order aggregate functions to the Spark standard 
library.
   
   ### How was this patch tested?
   This patch was tested with a custom higher order function that we developed 
internally and that is not part of this PR.
   
   It has the following signature: `map_merge(value, 
merge_function(v1, v2))`.  It allows you to merge maps and resolve conflicts.  
It is probably too specific to add to the list of standard Spark functions, and 
if it is added it will need to be extended a bit to be more generic.
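
   To illustrate the intended behaviour only, here is a minimal sketch in plain 
Python.  It is not the Spark implementation (which is not part of this PR), and 
it rests on assumptions: that a "conflict" means the same key appearing in more 
than one map of a group, and that `merge_function` receives the value 
accumulated so far as `v1` and the incoming value as `v2`.

```python
from typing import Callable, Dict, Hashable, Iterable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


def map_merge(
    maps: Iterable[Dict[K, V]],
    merge_function: Callable[[V, V], V],
) -> Dict[K, V]:
    """Fold a group of maps into one map.

    Assumption: when a key occurs in more than one map, merge_function
    is called with (accumulated value, incoming value) to resolve it.
    """
    result: Dict[K, V] = {}
    for m in maps:
        for key, value in m.items():
            if key in result:
                # Conflict: the key was already seen in an earlier map.
                result[key] = merge_function(result[key], value)
            else:
                result[key] = value
    return result


# Each dict stands in for the map value of one row in a group.
rows = [{"a": 1, "b": 2}, {"b": 5, "c": 7}, {"a": 10}]
merged = map_merge(rows, lambda v1, v2: v1 + v2)
print(merged)  # {'a': 11, 'b': 7, 'c': 7}
```

   Passing a different lambda changes the conflict policy, for example 
`lambda v1, v2: max(v1, v2)` keeps the largest value per key.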
   
   We considered introducing custom unit tests, but did not see an easy way 
to do so while keeping things simple.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   



