GitHub user gengliangwang opened a pull request:

    https://github.com/apache/spark/pull/21532

    [SPARK-24524][SQL]Improve aggregateMetrics: reduce memory usage and number of loops

    ## What changes were proposed in this pull request?
    
    The function `aggregateMetrics` processes metrics from both the executors and the driver. The data can be large.
    
    This PR improves the implementation by using a single loop (before converting to strings) and a single dynamic data structure.
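
The single-loop idea described above can be illustrated with a minimal sketch. This is not the code from the patch: the object name `AggregateMetricsSketch`, the `TaskMetrics` input shape, and the `render` helper are assumptions made up for illustration. It only shows the general pattern of filling one growable structure per metric id in a single pass and deferring string conversion to the end.

```scala
import scala.collection.mutable

object AggregateMetricsSketch {
  // Hypothetical input shape: each source reports (metricId, value) pairs.
  type TaskMetrics = Seq[(Long, Long)]

  /** Groups metric values by id in a single pass over all updates. */
  def collectValues(
      driverUpdates: TaskMetrics,
      taskUpdates: Iterable[TaskMetrics]): Map[Long, Array[Long]] = {
    // One growable buffer per metric id, filled as updates are visited.
    val buffers = mutable.HashMap.empty[Long, mutable.ArrayBuffer[Long]]

    def add(id: Long, value: Long): Unit =
      buffers.getOrElseUpdate(id, mutable.ArrayBuffer.empty[Long]) += value

    // Executor-side updates first, then driver-side updates.
    taskUpdates.foreach(_.foreach { case (id, v) => add(id, v) })
    driverUpdates.foreach { case (id, v) => add(id, v) }

    buffers.map { case (id, buf) => id -> buf.toArray }.toMap
  }

  /** Converts the collected values to display strings only at the end. */
  def render(values: Map[Long, Array[Long]]): Map[Long, String] =
    values.map { case (id, vs) => id -> s"total: ${vs.sum}" }
}
```

Compared with first concatenating every update into one large collection and then grouping it, this approach avoids materializing an intermediate copy of all the metric values.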
    
    
    ## How was this patch tested?
    
    Unit test


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark aggMetrics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21532.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21532
    
----
commit 0ce71c09bf5593c16e0eff5ae6e4aa3bd4c6ca26
Author: Gengliang Wang <gengliang.wang@...>
Date:   2018-06-11T21:32:11Z

    Improve aggregateMetrics with less memory usage and loops

----

