[GitHub] spark issue #20806: [SPARK-23661][SQL] Implement treeAggregate on Dataset AP...

viirya Wed, 14 Mar 2018 19:39:11 -0700

Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/20806
  
    @WeichenXu123 I feel `groupBy` is more of a SQL-like aggregation, in which we 
can specify a key to group by. `rdd.treeAggregate`, at least, does not support 
key-specified aggregation.
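    
    To illustrate what "no key" means here, below is a minimal sketch of 
tree-style aggregation in plain Python. It is not Spark's implementation; the 
`tree_aggregate` helper and its `fan_in` parameter are hypothetical names used 
only for illustration. Each partition is folded into one partial result, and 
the partials are then merged level by level into a single value for the whole 
dataset, with no grouping key anywhere.

```python
from functools import reduce


def tree_aggregate(partitions, zero, seq_op, comb_op, fan_in=2):
    """Fold each partition, then merge the partial results level by level.

    Hypothetical helper for illustration only; assumes fan_in >= 2.
    """
    # Per-partition fold: one partial result per partition, no key involved.
    partials = [reduce(seq_op, part, zero) for part in partitions]
    if not partials:
        return zero
    # Merge partials in groups of `fan_in` until a single value remains.
    while len(partials) > 1:
        partials = [
            reduce(comb_op, partials[i:i + fan_in])
            for i in range(0, len(partials), fan_in)
        ]
    return partials[0]


# Four "partitions" of integers, summed into one global result.
partitions = [[1, 2], [3, 4], [5], [6, 7, 8]]
total = tree_aggregate(
    partitions,
    0,
    lambda acc, x: acc + x,  # fold one element into a partition's partial sum
    lambda a, b: a + b,      # merge two partial sums
)
```

    The result is one value for the entire input, which is why a keyed, 
`groupBy`-style aggregation is a different kind of operation.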
    
    For typed grouping with `groupByKey`, it constructs a `KeyValueGroupedDataset`, 
which relies on SQL `Aggregate` execution to group the data. Currently that 
doesn't support tree-based aggregation.
    
    This work doesn't intend to overhaul SQL aggregation to support tree-based 
aggregation, so the API will look more or less as it does now.



