A small update: I was able to find a solution with good performance, using
the Brickhouse collect (a Hive UDAF). It also accepts structs as input, which
is an OK workaround but still not perfect (support for UDTs would be
better). The built-in Hive 'collect_list' seems to have a check for input
par
Hi,
Issue #1:
I'm using the new UDAF interface (UserDefinedAggregateFunction) in the Spark
1.5.0 release. Is it possible to aggregate all values in the
MutableAggregationBuffer into an array in a robust manner? I'm creating an
aggregation function that collects values into an array from all input rows