[GitHub] [spark] cloud-fan commented on issue #24144: [SPARK-24935][SQL] fix Hive UDAF with two aggregation buffers

GitBox Sun, 24 Mar 2019 15:54:55 -0700

cloud-fan commented on issue #24144: [SPARK-24935][SQL] fix Hive UDAF with two 
aggregation buffers
URL: https://github.com/apache/spark/pull/24144#issuecomment-476009701
 
 
   The four Hive modes map exactly onto Spark's, although the names differ 
slightly: partial2 is called partial-merge in Spark.
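
   As a rough illustration of that correspondence, here is a minimal sketch. 
The enums are local stand-ins declared only for this example (they mirror the 
constant names of Hive's `GenericUDAFEvaluator.Mode` and Spark's aggregate 
modes, but are not the real classes), and `ModeMapping` / `toSpark` are 
hypothetical names:

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative only: local stand-in enums, not the real Hive/Spark types.
final class ModeMapping {
    // Mirrors the four constants of Hive's GenericUDAFEvaluator.Mode.
    enum HiveMode { PARTIAL1, PARTIAL2, FINAL, COMPLETE }

    // Mirrors Spark's Partial, PartialMerge, Final and Complete modes.
    enum SparkMode { PARTIAL, PARTIAL_MERGE, FINAL, COMPLETE }

    private static final Map<HiveMode, SparkMode> HIVE_TO_SPARK =
            new EnumMap<>(HiveMode.class);

    static {
        HIVE_TO_SPARK.put(HiveMode.PARTIAL1, SparkMode.PARTIAL);
        // Only this pair differs noticeably in name.
        HIVE_TO_SPARK.put(HiveMode.PARTIAL2, SparkMode.PARTIAL_MERGE);
        HIVE_TO_SPARK.put(HiveMode.FINAL, SparkMode.FINAL);
        HIVE_TO_SPARK.put(HiveMode.COMPLETE, SparkMode.COMPLETE);
    }

    // Hypothetical helper: look up the Spark-side name for a Hive mode.
    static SparkMode toSpark(HiveMode mode) {
        return HIVE_TO_SPARK.get(mode);
    }

    public static void main(String[] args) {
        for (HiveMode mode : HiveMode.values()) {
            System.out.println(mode + " -> " + toSpark(mode));
        }
    }
}
```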
   
   The problem here is that a Hive UDAF knows the mode during initialization, 
while a Spark UDAF does not. Technically, a Hive UDAF can pick a different 
buffer implementation for each mode, and to fully support that we would need 
to refactor the Spark aggregate framework to pass the mode to Spark UDAFs as 
well. That is overkill IMO, and this patch is a best-effort workaround. I 
think a Hive UDAF will only pick a different buffer implementation for 
different kinds of input (original record or aggregation buffer), which is 
the case for the sketches library.
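
   To make the assumption concrete, the sketch below groups the four Hive 
modes by the kind of input they consume: PARTIAL1 and COMPLETE read original 
records, while PARTIAL2 and FINAL read partial aggregation buffers. This is 
not the code in this patch. The enum is a local stand-in, and `BufferChoice`, 
`inputKind`, `bufferFor`, `UpdateBuffer` and `MergeBuffer` are hypothetical 
names used only for illustration:

```java
// Illustrative only: shows why two buffers are enough if a UDAF picks its
// buffer by input kind rather than by the exact mode.
final class BufferChoice {
    // Local stand-in mirroring Hive's GenericUDAFEvaluator.Mode constants.
    enum HiveMode { PARTIAL1, PARTIAL2, FINAL, COMPLETE }

    enum InputKind { ORIGINAL_RECORD, AGG_BUFFER }

    // PARTIAL1 and COMPLETE consume raw rows; PARTIAL2 and FINAL consume
    // partial aggregation results.
    static InputKind inputKind(HiveMode mode) {
        switch (mode) {
            case PARTIAL1:
            case COMPLETE:
                return InputKind.ORIGINAL_RECORD;
            default:
                return InputKind.AGG_BUFFER;
        }
    }

    // Hypothetical buffer names: one buffer per input kind, so four modes
    // collapse to two buffer implementations.
    static String bufferFor(HiveMode mode) {
        return inputKind(mode) == InputKind.ORIGINAL_RECORD
                ? "UpdateBuffer"
                : "MergeBuffer";
    }

    public static void main(String[] args) {
        for (HiveMode mode : HiveMode.values()) {
            System.out.println(mode + " uses " + bufferFor(mode));
        }
    }
}
```

   A UDAF that chose a distinct buffer for each of the four modes would not 
fit this two-way split, which is the case the comment describes as requiring 
a refactor of the aggregate framework.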


