[ 
https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596297#comment-14596297
 ] 

Bogdan Ghit commented on SPARK-6112:
------------------------------------

I'd rather use tmpfs, because ramdisk needs root. However, I don't think the 
last parameter is the issue. 

More details:

I set LAZY_PERSIST policy on /tmp/spark-dfs and the following parameters for 
Spark:
spark.externalBlockStore.baseDir        /tmp/spark-dfs
spark.externalBlockStore.folderName     external
spark.externalBlockStore.blockManager   org.apache.spark.storage.HDFSBlockManager
spark.externalBlockStore.url            hdfs://master.ib.cluster:54321
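
For anyone trying to reproduce this setup, the four properties above could be passed to an application roughly as sketched below. This is only an illustration: the keys and values are copied from this comment, the helper names to_submit_args and to_defaults_lines are made up for the example, and it assumes the settings are supplied either as spark-submit --conf arguments or as lines in spark-defaults.conf.

```python
# Property names and values as quoted in the comment above.
EXTERNAL_BLOCK_STORE_CONF = {
    "spark.externalBlockStore.baseDir": "/tmp/spark-dfs",
    "spark.externalBlockStore.folderName": "external",
    "spark.externalBlockStore.blockManager": "org.apache.spark.storage.HDFSBlockManager",
    "spark.externalBlockStore.url": "hdfs://master.ib.cluster:54321",
}


def to_submit_args(conf):
    """Hypothetical helper: render a property dict as spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


def to_defaults_lines(conf):
    """Hypothetical helper: render 'key value' lines in the spark-defaults.conf style."""
    return [f"{key} {value}" for key, value in conf.items()]
```

The list returned by to_submit_args can be spliced into a spark-submit command line, while to_defaults_lines gives the same settings in the one-property-per-line form used above.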

When I import data into HDFS with hadoop fs -copyFromLocal data /tmp/spark-dfs/, 
the data goes to tmpfs.
When I write the output of a Spark application with 
saveAsTextFile("/tmp/spark-dfs/output"), the data goes to disk.
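
One way to narrow this down might be to compare the storage policy and the block locations of the imported data with those of the Spark output. The sketch below only builds and runs the relevant HDFS commands. It assumes a Hadoop release that ships the hdfs storagepolicies subcommand (2.7 or later, as far as I know) and that hdfs is on the PATH. The function names are hypothetical.

```python
import subprocess


def storage_policy_cmd(path):
    """Command that reports the storage policy in effect for an HDFS path."""
    return ["hdfs", "storagepolicies", "-getStoragePolicy", "-path", path]


def block_locations_cmd(path):
    """Command that lists the files, blocks and replica locations under a path."""
    return ["hdfs", "fsck", path, "-files", "-blocks", "-locations"]


def run(cmd):
    """Run a command and return its standard output (raises if it fails)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


# Example use on a cluster node (not executed here):
#   for p in ("/tmp/spark-dfs/data", "/tmp/spark-dfs/output"):
#       print(run(storage_policy_cmd(p)))
#       print(run(block_locations_cmd(p)))
```

If the output directory reports a different policy than the imported data, that would point at how the files are created rather than at the Spark parameters. Depending on the Hadoop version, the fsck output may also show the storage type of each replica.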

> Provide external block store support through HDFS RAM_DISK
> ----------------------------------------------------------
>
>                 Key: SPARK-6112
>                 URL: https://issues.apache.org/jira/browse/SPARK-6112
>             Project: Spark
>          Issue Type: New Feature
>          Components: Block Manager
>            Reporter: Zhan Zhang
>         Attachments: SparkOffheapsupportbyHDFS.pdf
>
>
> The HDFS LAZY_PERSIST policy makes it possible to cache RDDs off-heap in 
> HDFS. We may want to provide a capability similar to Tachyon by leveraging 
> the HDFS RAM_DISK feature when the user environment does not have Tachyon 
> deployed. This feature could make it possible to share RDDs in memory 
> across different jobs, and even with jobs other than Spark, and to avoid 
> RDD recomputation if executors crash. 



