Rares Mirica created SPARK-12147:
------------------------------------

             Summary: Off heap storage and dynamicAllocation operation
                 Key: SPARK-12147
                 URL: https://issues.apache.org/jira/browse/SPARK-12147
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.5.2
         Environment: Cloudera Hadoop 2.6.0-cdh5.4.8
Tachyon 0.7.1
Yarn
            Reporter: Rares Mirica
To increase computation density and efficiency, I set out to test off-heap storage (using Tachyon) with dynamicAllocation enabled. Following the available documentation (the programming guide for Spark 1.5.2), I expected data cached off-heap to survive in Tachyon for the lifetime of the application (driver instance), or until unpersist() is called. This expectation was supported by the documentation: "Cached data is not lost if individual executors crash.", where I take "crash" to also cover graceful decommission of an executor. Furthermore, the description of graceful decommission in the job-scheduling document also hints at preserving cached data through off-heap storage.

Tachyon is now mature enough that these promises are well within reach, so I consider it a bug that, upon graceful decommission of an executor, the off-heap data is deleted (presumably as part of the cleanup phase).

Needless to say, preserving off-heap persisted data after graceful decommission under dynamic allocation would yield significant improvements in resource utilization, especially on YARN, where executors occupy compute "slots" even when idle. After a long, expensive computation that takes advantage of the dynamically scaled executors, subsequent Spark jobs could keep using the cached data while the compute resources are released for other cluster tasks.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
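To make the scenario concrete, here is a minimal sketch of the kind of setup described above, assuming the Spark 1.5 external block store properties (spark.externalBlockStore.url / spark.externalBlockStore.baseDir) and a reachable Tachyon master; the application name, Tachyon host/port, base directory, and the toy workload are placeholders, not the reporter's actual job:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical configuration; adjust the Tachyon master URL to the deployment.
val conf = new SparkConf()
  .setAppName("offheap-dynalloc-repro")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true") // required for dynamic allocation on YARN
  .set("spark.externalBlockStore.url", "tachyon://tachyon-master:19998")
  .set("spark.externalBlockStore.baseDir", "/spark")

val sc = new SparkContext(conf)

// OFF_HEAP persistence stores blocks in Tachyon rather than executor JVM heaps.
val cached = sc.parallelize(1 to 1000000).map(_ * 2).persist(StorageLevel.OFF_HEAP)
cached.count() // materialize the cache

// After the job goes idle, dynamic allocation gracefully decommissions executors.
// The expectation is that the Tachyon-resident blocks outlive those executors;
// the observed behavior is that the off-heap data is deleted during cleanup.
```

Since the off-heap blocks live outside the executor JVMs, nothing in principle ties their lifetime to any single executor, which is what makes the cleanup-time deletion surprising.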