[
https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606612#comment-14606612
]
Arpit Agarwal commented on SPARK-6112:
--------------------------------------
This also seems to work fine with Spark 1.4.0 + Hadoop 2.7. Are your storage
types tagged correctly in hdfs-site.xml? You need to tag the RAM disk path in
{{dfs.datanode.data.dir}} with \[RAM_DISK\]. For example, from my test machine:
{code}
<property>
  <name>dfs.datanode.data.dir</name>
  <value>[RAM_DISK]/mnt/tmpfs/hadoop/data,[DISK]/mnt/sdb/hadoop/data</value>
</property>
{code}
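As a quick sanity check on the tagging, here is a minimal sketch that splits a data-dir value into (storage type, path) pairs and reports whether any entry is tagged RAM_DISK. The helper names {{parse_data_dirs}} and {{has_ram_disk}} are hypothetical and not part of Hadoop; the sketch only assumes that an untagged directory defaults to DISK.

```python
import re


def parse_data_dirs(value):
    """Split a data-dir value into (storage_type, path) pairs.

    Hypothetical helper for illustration only; it is not a Hadoop API.
    """
    entries = []
    for raw in value.split(","):
        raw = raw.strip()
        if not raw:
            continue
        # A tagged entry looks like "[RAM_DISK]/mnt/tmpfs/hadoop/data".
        match = re.match(r"^\[(\w+)\](.+)$", raw)
        if match:
            entries.append((match.group(1), match.group(2)))
        else:
            # Assumption: HDFS treats an untagged directory as DISK.
            entries.append(("DISK", raw))
    return entries


def has_ram_disk(value):
    """Return True if at least one directory is tagged RAM_DISK."""
    return any(kind == "RAM_DISK" for kind, _ in parse_data_dirs(value))


value = "[RAM_DISK]/mnt/tmpfs/hadoop/data,[DISK]/mnt/sdb/hadoop/data"
for kind, path in parse_data_dirs(value):
    print(kind, path)
print("RAM_DISK tagged:", has_ram_disk(value))
```

If this reports no RAM_DISK entry for the value in your hdfs-site.xml, the DataNode will likely treat the tmpfs mount as ordinary disk storage.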
You can test without Spark by simply copying a file into a directory that has
the LAZY_PERSIST storage policy set and verifying that its block files go to
the RAM disk.
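The manual check described above could be scripted roughly as follows. This is only a sketch: it assumes the Hadoop 2.7 {{hdfs storagepolicies}} subcommand is available, and the function name and the paths {{/tmp/sample.bin}} and {{/tmp/lazy_persist_test}} are placeholders, not anything from my setup. It only builds and prints the commands; run them yourself on a host with an HDFS client.

```python
def lazy_persist_check_commands(local_file, hdfs_dir):
    """Build the HDFS CLI commands for a manual LAZY_PERSIST check.

    Hypothetical helper: it returns argv lists and does not run anything.
    """
    target = hdfs_dir.rstrip("/") + "/" + local_file.split("/")[-1]
    return [
        # Create the test directory.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Set the storage policy on the directory (Hadoop 2.7 syntax assumed).
        ["hdfs", "storagepolicies", "-setStoragePolicy",
         "-path", hdfs_dir, "-policy", "LAZY_PERSIST"],
        # Copy a local file into the directory.
        ["hdfs", "dfs", "-put", local_file, target],
        # List the file's blocks and the DataNodes that hold them.
        ["hdfs", "fsck", target, "-files", "-blocks", "-locations"],
    ]


commands = lazy_persist_check_commands("/tmp/sample.bin", "/tmp/lazy_persist_test")
for cmd in commands:
    print(" ".join(cmd))
```

After the last command reports which DataNode holds the blocks, look for the corresponding blk_* files under the RAM disk data directory on that node.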
> Provide external block store support through HDFS RAM_DISK
> ----------------------------------------------------------
>
> Key: SPARK-6112
> URL: https://issues.apache.org/jira/browse/SPARK-6112
> Project: Spark
> Issue Type: New Feature
> Components: Block Manager
> Reporter: Zhan Zhang
> Attachments: SparkOffheapsupportbyHDFS.pdf
>
>
> The HDFS Lazy_Persist policy makes it possible to cache RDDs off-heap in
> HDFS. We may want to provide a capability similar to Tachyon's by leveraging
> the HDFS RAM_DISK feature when the user environment does not have Tachyon
> deployed. This feature could make it possible to share RDDs in memory
> across different jobs, and even with jobs other than Spark, and to avoid
> RDD recomputation if executors crash.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)