GitHub user dhruve opened a pull request:

    https://github.com/apache/spark/pull/22015

    [SPARK-20286] Release executors on unpersisting RDD

    ## What changes were proposed in this pull request?
    Currently, the executors acquired through dynamic allocation are not 
released when the cached RDD is unpersisted. This needlessly ties up cluster 
resources. With this change, once the cached RDD is unpersisted, we check 
whether the executor has any running tasks. If not, we do the following 
(sketched in the code below):
    1 - If the executor has cached RDD blocks from other RDDs, we don't make 
any change.
    2 - If the executor has no more cached RDD blocks and no running tasks, we 
update the removal time based on the conf 
`spark.dynamicAllocation.cachedExecutorIdleTimeout` so the idle executor can 
be released. 
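
    As a rough illustration (not the actual patch code), the decision above 
can be expressed as a `SparkListener` reacting to unpersist events. The two 
maps and `rearmIdleTimeout` below are hypothetical stand-ins for the 
allocation manager's internal bookkeeping:
    ```scala
    import scala.collection.mutable
    import org.apache.spark.scheduler.{SparkListener, SparkListenerUnpersistRDD}

    // Hypothetical sketch of the decision logic. The two maps and
    // `rearmIdleTimeout` stand in for ExecutorAllocationManager internals;
    // they are not part of the public Spark API.
    class UnpersistTimeoutSketch(
        runningTasks: mutable.Map[String, Int],  // executorId -> active task count
        cachedBlocks: mutable.Map[String, Int],  // executorId -> cached block count
        rearmIdleTimeout: String => Unit         // schedule removal after the idle timeout
    ) extends SparkListener {

      override def onUnpersistRDD(event: SparkListenerUnpersistRDD): Unit = {
        for ((execId, tasks) <- runningTasks if tasks == 0) {
          if (cachedBlocks.getOrElse(execId, 0) > 0) {
            // Case 1: the executor still caches blocks from other RDDs --
            // leave its removal time unchanged.
          } else {
            // Case 2: no cached blocks and no running tasks -- update the
            // removal time so dynamic allocation can release the executor.
            rearmIdleTimeout(execId)
          }
        }
      }
    }
    ```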
    
    ## How was this patch tested?
    Manually, using the code snippet below.
    ```scala
    val rdd = sc.textFile("smallFile")
    rdd.cache
    
    val rdd2 = sc.textFile("largeFile")
    rdd2.cache
    
    rdd2.count // Cached data on 500+ executors
    Thread.sleep(30000) // sleep for 30s
    rdd.count // Cached data on around 20 executors
    
    // Verify that only the ~20 executors caching rdd remain; the rest will 
    // time out based on the idle timeout, which I set to 60s.
    rdd2.unpersist 
    
    // Eventually all executors will be released, as there are no tasks 
    // running on any executor.
    ```
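
    For reference, the snippet assumes dynamic allocation is enabled with a 
short idle timeout. A configuration along these lines would reproduce the 
behavior (the 60s values come from the comments above; the app name is 
arbitrary):
    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    // Assumed test configuration -- values inferred from the description
    // above, not taken from the patch itself.
    val conf = new SparkConf()
      .setAppName("SPARK-20286-manual-test")
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")  // required by dynamic allocation
      .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
      .set("spark.dynamicAllocation.cachedExecutorIdleTimeout", "60s")
    val sc = new SparkContext(conf)
    ```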


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dhruve/spark bug/SPARK-20286

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22015.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22015
    
----
commit 7e229dc120d0f1542cc8d7dbac1027baac36665e
Author: Dhruve Ashar <dhruveashar@...>
Date:   2018-08-06T20:32:47Z

    [SPARK-20286] Release executors on unpersisting RDD

----

