GitHub user uncleGen opened a pull request:

    https://github.com/apache/spark/pull/2131

    [SPARK-3170][CORE][BUG]:RDD info loss in "StorageTab" and "ExecutorTab"

    A completed stage only needs to remove its own partitions that are no 
longer cached. However, "StorageTab" may lose some RDDs that are actually still 
cached. The problem is not limited to "StorageTab": "ExecutorTab" may also lose 
RDD info that has been overwritten by the last RDD in the same task.
    1. "StorageTab": when multiple stages run simultaneously, a completed stage 
will remove RDD info that belongs to other stages that are still running.
    2. "ExecutorTab": the TaskContext may lose some "updatedBlocks" info for RDDs 
in a dependency chain, as in the following example:
             val r1 = sc.parallelize(..).cache()
             val r2 = r1.map(...).cache()
             val n = r2.count()
    
    When r2 is counted, both r1 and r2 end up cached. So in 
CacheManager.getOrCompute, the TaskContext should contain the "updatedBlocks" of 
both r1 and r2. Currently, "updatedBlocks" contains only the info of r2. 
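
The two problems can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the code in this patch: `RddInfoSketch`, `BlockId`, `RddInfo`, and all four method names are hypothetical and do not correspond to Spark classes or APIs. The first pair of methods contrasts overwriting a task's updated blocks with appending them; the second pair contrasts pruning every uncached RDD with pruning only those owned by the completed stage.

```scala
// Toy model only: every name here is hypothetical, not Spark's API.
object RddInfoSketch {
  // (rddId, partitionIndex) stands in for a cached block identifier.
  type BlockId = (Int, Int)

  // "ExecutorTab" problem: each element of `reported` is the list of blocks
  // updated while caching one RDD in the chain (r1 first, then r2).
  // Overwriting keeps only what the last RDD reported.
  def overwrite(reported: Seq[Seq[BlockId]]): Seq[BlockId] =
    reported.lastOption.getOrElse(Seq.empty)

  // Appending keeps the blocks of every RDD in the dependency chain.
  def append(reported: Seq[Seq[BlockId]]): Seq[BlockId] =
    reported.flatten

  // "StorageTab" problem: RDD info tracked by the UI, keyed by RDD id.
  case class RddInfo(id: Int, cachedPartitions: Int)

  // Drops every RDD with no cached partitions, including RDDs that belong
  // to stages that are still running and have not cached anything yet.
  def pruneAll(tracked: Map[Int, RddInfo]): Map[Int, RddInfo] =
    tracked.filter { case (_, info) => info.cachedPartitions > 0 }

  // Drops an RDD only if it belongs to the completed stage and is uncached.
  def pruneForStage(tracked: Map[Int, RddInfo],
                    completedStageRddIds: Set[Int]): Map[Int, RddInfo] =
    tracked.filterNot { case (id, info) =>
      completedStageRddIds.contains(id) && info.cachedPartitions == 0
    }
}
```

In this model, if r1 reports block `(1, 0)` and r2 reports block `(2, 0)`, `overwrite` keeps only r2's block while `append` keeps both. Likewise, `pruneForStage` leaves an uncached RDD in place when it belongs to a stage other than the one that just completed.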

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uncleGen/spark master_ui_fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2131.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2131
    
----
commit c82ba82ae90c92244e63811f30e1aeb05608c57a
Author: uncleGen <[email protected]>
Date:   2014-08-26T06:54:04Z

    Bug Fix: RDD info loss in "StorageTab" and "ExecutorTab"

----

