GitHub user viirya opened a pull request:

    https://github.com/apache/spark/pull/22341

    [SPARK-24889][Core] Update block info when unpersist rdds

    ## What changes were proposed in this pull request?
    
    We update block info reported from executors, for example when an RDD is 
cached. However, when an RDD is removed by unpersisting, we don't ask to update 
its block info, so the block info becomes stale.
    
    We can fix this in a few ways:
    
    1. Ask to update block info when unpersisting an RDD
    
    This is the simplest approach, but it slightly changes driver-executor 
communication.
    
    2. Update block info when processing the unpersist-RDD event
    
    A `SparkListenerUnpersistRDD` event is already sent when an RDD is 
unpersisted. When processing this event, we can update the RDD's block info. 
This only changes event-processing code, so the risk seems lowest.
    
    This patch currently takes option 2 as the lowest-risk choice; a rough 
sketch of the idea follows below. If we agree the first option carries no risk, 
we can switch to it.
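    
    For illustration only (not the actual patch), here is a minimal sketch of a 
driver-side listener reacting to the unpersist event. The `RddStorageInfo` case 
class and the `rddStorage` map are hypothetical bookkeeping for this example, 
not Spark internals:
    
    ```scala
    import scala.collection.mutable
    
    import org.apache.spark.scheduler.{SparkListener, SparkListenerUnpersistRDD}
    
    // Hypothetical per-RDD storage summary; the real status listener tracks richer state.
    case class RddStorageInfo(memoryUsed: Long, diskUsed: Long, numCachedPartitions: Int)
    
    class BlockInfoTrackingListener extends SparkListener {
      // rddId -> last known storage summary (illustrative bookkeeping only)
      private val rddStorage = mutable.Map[Int, RddStorageInfo]()
    
      override def onUnpersistRDD(event: SparkListenerUnpersistRDD): Unit = {
        // When the unpersist event is processed on the driver, zero out the
        // cached-block accounting for that RDD instead of leaving it stale.
        rddStorage.put(event.rddId, RddStorageInfo(0L, 0L, 0))
      }
    }
    ```
    
    Such a listener could be registered with 
`sc.addSparkListener(new BlockInfoTrackingListener())`; the patch itself updates 
the existing event-processing path rather than adding a new listener.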
    
    ## How was this patch tested?
    
    Unit tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-24889

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22341
    
----
commit dd5f766e0f270cfc58ca4298c39179469f021f78
Author: Liang-Chi Hsieh <viirya@...>
Date:   2018-08-30T23:17:46Z

    Update memory and disk info when unpersist rdds.

----


---
