GitHub user vanzin opened a pull request:
https://github.com/apache/spark/pull/19679
[SPARK-20647][core] Port StorageTab to the new UI backend.
This required adding information about StreamBlockId to the store; that
information is not yet exposed via the public API, so an internal type was
added until there is a need to expose it in the API.
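For illustration only, a minimal sketch of what such an internal,
store-only type could look like; the class and field names here are
assumptions, not necessarily what this patch uses:

    package org.apache.spark.status

    // Hypothetical internal representation of a stream block kept in the
    // KV store but not exposed through the public REST API.
    private[spark] case class StreamBlockData(
        name: String,
        executorId: String,
        hostPort: String,
        storageLevel: String,
        useMemory: Boolean,
        useDisk: Boolean,
        deserialized: Boolean,
        memSize: Long,
        diskSize: Long)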
The UI only lists RDDs that have cached partitions, but that information
was not being correctly captured in the listener, so that is also fixed
here, along with some minor (internal) API adjustments so that the UI can
get the correct data.
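As a rough illustration of what capturing that information in the listener
involves, here is a hedged sketch, not the actual code in this change, of a
listener that tracks which partitions of each RDD are currently cached
using the public block-update event:

    import scala.collection.mutable

    import org.apache.spark.scheduler.{SparkListener, SparkListenerBlockUpdated}
    import org.apache.spark.storage.RDDBlockId

    // Illustrative only: track cached partitions per RDD from block
    // updates, so the UI can list just the RDDs that have at least one
    // cached partition.
    class CachedRddTracker extends SparkListener {
      // rddId -> indices of partitions currently cached
      private val cached = mutable.HashMap.empty[Int, mutable.Set[Int]]

      override def onBlockUpdated(event: SparkListenerBlockUpdated): Unit = {
        val info = event.blockUpdatedInfo
        info.blockId match {
          case RDDBlockId(rddId, split) =>
            if (info.storageLevel.isValid) {
              // Block was stored in memory and/or on disk.
              cached.getOrElseUpdate(rddId, mutable.Set.empty) += split
            } else {
              // Block was dropped; forget the RDD once nothing is cached.
              cached.get(rddId).foreach { parts =>
                parts -= split
                if (parts.isEmpty) cached.remove(rddId)
              }
            }
          case _ => // not an RDD block, ignore
        }
      }

      // What the Storage tab would list: only RDDs with cached partitions.
      def cachedRddIds: Seq[Int] = cached.keys.toSeq
    }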
Because of the way partitions are cached, some of the optimizations that
limit how often data is flushed to the store could not be applied to this
code. Instead, the data structures that track RDD blocks were reworked to
avoid expensive copies when many blocks are being updated, as sketched
below.
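To make the copy-avoidance idea concrete, a minimal hypothetical sketch
(names are made up, not the classes in this change): block state is kept
in a mutable map inside the live RDD entity, and an immutable snapshot is
built only when the entity is flushed to the store.

    import scala.collection.mutable

    // Illustrative only: avoid copying the whole partition collection on
    // every block update; pay for an immutable copy only when writing to
    // the store.
    class LiveRDDSketch(val rddId: Int) {
      // block name -> memory size, updated in place as block events arrive
      private val partitions = mutable.HashMap.empty[String, Long]

      def updateBlock(blockName: String, memSize: Long): Unit = {
        if (memSize > 0) partitions(blockName) = memSize
        else partitions.remove(blockName)
      }

      // Called only when flushing this entity to the store.
      def snapshot(): Map[String, Long] = partitions.toMap
    }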
Tested with existing and updated unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vanzin/spark SPARK-20647
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19679.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19679
----
commit 7147bd241b8acd6a944d3bba9170f98f8233cc3b
Author: Marcelo Vanzin <[email protected]>
Date: 2017-01-30T22:48:30Z
[SPARK-20647][core] Port StorageTab to the new UI backend.
----