This folder is created under your spark.local.dir when your Spark application 
starts, with a name prefixed “spark-local-xxx”. It’s quite strange that you 
don’t see this folder; maybe you missed something. Also, if Spark cannot 
create this folder at startup, persisting an RDD to disk will fail.
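A quick way to look for these scratch directories is to list them directly. This is a sketch; replace /tmp with wherever your spark.local.dir actually points. The "spark-local-" prefix matches Spark 1.x behavior, where each application gets a directory like spark-local-&lt;timestamp&gt;-&lt;suffix&gt;:

```shell
# List Spark's per-application scratch directories under the
# configured local dir (default /tmp); prints nothing if none exist.
find /tmp -maxdepth 1 -type d -name 'spark-local-*'
```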

Also, I think there’s no way to persist an RDD to HDFS, even on YARN; only an 
RDD’s checkpoint can save data to HDFS.
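The checkpoint route mentioned above can be sketched as follows. This is a minimal sketch for Spark 1.x, not runnable without a cluster; the application name and HDFS path are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CheckpointDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("checkpoint-demo"))

    // Checkpoint files are written under this directory on HDFS;
    // the path itself is a placeholder.
    sc.setCheckpointDir("hdfs:///user/someuser/checkpoints")

    val rdd = sc.parallelize(1 to 100).map(_ * 2)
    rdd.checkpoint() // marks the RDD for checkpointing (lazy)
    rdd.count()      // an action triggers the actual write to HDFS

    sc.stop()
  }
}
```

Note that checkpoint() itself is lazy: nothing is written until an action materializes the RDD.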

Thanks
Jerry

From: Chitturi Padma [mailto:learnings.chitt...@gmail.com]
Sent: Tuesday, September 23, 2014 8:33 PM
To: u...@spark.incubator.apache.org
Subject: Re: spark.local.dir and spark.worker.dir not used

I couldn't even see the spark-<id> folder in the default /tmp directory of 
spark.local.dir.

On Tue, Sep 23, 2014 at 6:01 PM, Priya Ch <[hidden email]> wrote:
Is it possible to view the persisted RDD blocks?
If I use YARN, will the RDD blocks be persisted to HDFS, and will I be able to 
read those HDFS blocks as I could in Hadoop?

On Tue, Sep 23, 2014 at 5:56 PM, Shao, Saisai [via Apache Spark User List] 
<[hidden email]> wrote:
Hi,

spark.local.dir is the directory used to write map output data and persisted 
RDD blocks, but the file paths are hashed, so you cannot directly identify the 
persisted RDD block files; they will definitely be in these folders on your 
worker nodes.

Thanks
Jerry

From: Priya Ch [mailto:[hidden email]]
Sent: Tuesday, September 23, 2014 6:31 PM
To: [hidden email]; [hidden email]
Subject: spark.local.dir and spark.worker.dir not used

Hi,

I am using Spark 1.0.0. In my Spark code I am trying to persist an RDD to disk 
with rdd.persist(DISK_ONLY). Unfortunately, I couldn't find the location where 
the RDD was written to disk. I set SPARK_LOCAL_DIRS and SPARK_WORKER_DIR to a 
location other than the default /tmp directory, but still couldn't see 
anything in the worker directory or the Spark local directory.

I also tried specifying the local dir and worker dir from the Spark code while 
defining the SparkConf, as conf.set("spark.local.dir", "/home/padma/sparkdir"), 
but the directories are not used.
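For what it's worth, a common pitfall is that spark.local.dir must be set on the SparkConf before the SparkContext is created, and in standalone or YARN deployments the value can be overridden by SPARK_LOCAL_DIRS (or LOCAL_DIRS on YARN) on the worker side. A minimal sketch of the ordering, not runnable without a cluster (the app name is a placeholder, the path is the one from this message):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Set spark.local.dir on the conf *before* constructing the context;
// setting it afterwards has no effect.
val conf = new SparkConf()
  .setAppName("persist-demo")
  .set("spark.local.dir", "/home/padma/sparkdir")
val sc = new SparkContext(conf)

val rdd = sc.parallelize(1 to 1000)
rdd.persist(StorageLevel.DISK_ONLY)
rdd.count() // an action forces materialization, writing blocks to disk
```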


In general, which directories does Spark use for map output files, 
intermediate writes, and persisting RDDs to disk?


Thanks,
Padma Ch
