GitHub user viirya opened a pull request:

    [SPARK-23661][SQL] Implement treeAggregate on Dataset API

    ## What changes were proposed in this pull request?
    Many MLlib algorithms have not yet migrated their internal computation 
from RDD to DataFrame. `treeAggregate` is one of the obstacles we need to 
address to complete that migration.
    This patch provides `treeAggregate` on the Dataset API. For now, 
it should be a private API used by the ML component.
    The tree aggregation approach imitates RDD's `treeAggregate`.
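To illustrate the general idea of tree aggregation, here is a minimal pure-Python sketch. It is not the code in this patch and not Spark's implementation: the function `tree_aggregate`, its parameters, and the use of plain lists as stand-ins for partitions are all hypothetical, and the grouping rule is simplified. It only shows the pattern of aggregating each partition locally and then merging the partial results level by level instead of all at once.

```python
import math
from functools import reduce


def tree_aggregate(partitions, zero, seq_op, comb_op, depth=2):
    """Hypothetical helper: aggregate lists-of-lists in a multi-level tree.

    Assumes `zero` is immutable (or that seq_op never mutates it) and that
    `comb_op` is associative.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")

    # Level 0: fold every partition into one partial result.
    partials = [reduce(seq_op, part, zero) for part in partitions]
    if not partials:
        return zero

    # Fan-in per level, chosen so that roughly `depth` levels are needed.
    scale = max(math.ceil(len(partials) ** (1.0 / depth)), 2)

    # Intermediate levels: merge contiguous groups of `scale` partials
    # until few enough remain for a single final merge.
    while len(partials) > scale:
        partials = [
            reduce(comb_op, partials[i:i + scale])
            for i in range(0, len(partials), scale)
        ]

    # Final level: combine the remaining partials.
    return reduce(comb_op, partials)


# Usage: compute (sum, count) over five "partitions".
data = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10], []]
total = tree_aggregate(
    data,
    (0, 0),
    lambda acc, x: (acc[0] + x, acc[1] + 1),
    lambda a, b: (a[0] + b[0], a[1] + b[1]),
)
print(total)
```

In a distributed setting the intermediate merges would run on executors, so the final step only has to combine a small number of partial results.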
    ## How was this patch tested?
    Added unit test.

You can merge this pull request into a Git repository by running:

    $ git pull treeAggregate

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20806

commit a254d1501c0119b4881c0443f28c263f0c9dec0e
Author: Liang-Chi Hsieh <viirya@...>
Date:   2018-03-12T08:41:20Z

    Implement treeAggregate on Dataset API.


