[GitHub] [spark] cloud-fan opened a new pull request #24144: [SPARK-24935][SQL] fix Hive UDAF with two aggregation buffers

GitBox Tue, 19 Mar 2019 05:04:59 -0700

cloud-fan opened a new pull request #24144: [SPARK-24935][SQL] fix Hive UDAF 
with two aggregation buffers
URL: https://github.com/apache/spark/pull/24144
 
 
   ## What changes were proposed in this pull request?
   
   A Hive UDAF knows the aggregation mode when it creates its aggregation 
buffer, so it can create a different kind of buffer depending on the input it 
will receive: the original data or a partial aggregation buffer. For an 
example, see the [sketches 
library](https://github.com/DataSketches/sketches-hive/blob/7f9e76e9e03807277146291beb2c7bec40e8672b/src/main/java/com/yahoo/sketches/hive/cpc/DataToSketchUDAF.java#L107).
   
   However, the Hive UDAF adapter in Spark always creates the buffer in 
partial1 mode, so the resulting buffer can handle only one kind of input: the 
original data. This PR fixes that.
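
   To make the failure mode concrete, here is a minimal, self-contained sketch. It is a toy model, not Hive or Spark code: `ModeAwareBufferSketch`, `RowBuffer`, `MergeBuffer`, `newBuffer`, and `merge` are hypothetical names invented for illustration. Only the four mode names are taken from Hive's `GenericUDAFEvaluator.Mode`. The assumption is that a UDAF picks its buffer type from the mode, in the spirit of the linked sketches example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; none of these types are Hive or Spark classes.
class ModeAwareBufferSketch {
    // Same four names as Hive's GenericUDAFEvaluator.Mode.
    enum Mode { PARTIAL1, PARTIAL2, FINAL, COMPLETE }

    interface Buffer {}

    // Buffer used when the input is the original data.
    static final class RowBuffer implements Buffer {
        long sum = 0;
    }

    // Buffer used when the input is partial aggregation results.
    static final class MergeBuffer implements Buffer {
        final List<Long> partials = new ArrayList<>();
    }

    // The UDAF chooses the buffer type from the aggregation mode.
    static Buffer newBuffer(Mode mode) {
        switch (mode) {
            case PARTIAL1:
            case COMPLETE:
                return new RowBuffer();   // consumes original rows
            default:
                return new MergeBuffer(); // PARTIAL2 / FINAL consume partial results
        }
    }

    // Merging a partial result only works on a buffer created for merging.
    static void merge(Buffer buffer, long partial) {
        if (!(buffer instanceof MergeBuffer)) {
            throw new IllegalStateException(
                "buffer was not created for merging partial results");
        }
        ((MergeBuffer) buffer).partials.add(partial);
    }

    public static void main(String[] args) {
        // Buffer created in the right mode: merging succeeds.
        merge(newBuffer(Mode.FINAL), 42L);

        // Buffer always created in PARTIAL1 mode: merging fails.
        try {
            merge(newBuffer(Mode.PARTIAL1), 42L);
        } catch (IllegalStateException e) {
            System.out.println("merge failed: " + e.getMessage());
        }
    }
}
```

   In this toy model, an adapter that always calls `newBuffer(Mode.PARTIAL1)` gets a `RowBuffer` even on the merge side, which is the kind of mismatch described above.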
   
   ## How was this patch tested?
   
   A new test was added.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

