Hi,

I am trying to find where Spark persists RDDs when persist() is called
and the job runs under YARN. This is purely for my own understanding.

In my driver program, I wait indefinitely after the job finishes, so that
nothing gets cleaned up before I can look at it.

In the actual job, I roughly do the following:

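// context is the JavaSparkContext; args[0] is the input path
JavaRDD<String> lines = context.textFile(args[0]);
lines.persist(StorageLevel.DISK_ONLY());
lines.collect();  // action that forces the RDD to be computed and persisted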

When run in local mode, I can see that the block files (like rdd_1_0) are
persisted under directories like
/var/folders/mt/51srrjc15wl3n829qkgnh2dm0000gp/T/spark-local-20150118201458-6147/15.
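
For reference, this is roughly how I look for those files: a minimal sketch that walks a given root directory and lists files whose names look like rdd_<rddId>_<partition>. The class name RddBlockFinder and the regex are my own assumptions, based only on the file names I saw above; it uses nothing from Spark itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Spark: scans a directory tree for
// files that look like persisted RDD blocks.
public class RddBlockFinder {

    // Assumed naming pattern, inferred from observed names such as rdd_1_0.
    private static final Pattern RDD_BLOCK = Pattern.compile("rdd_\\d+_\\d+");

    // Returns all regular files under root whose name matches the pattern,
    // sorted by path.
    public static List<Path> findRddBlocks(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> RDD_BLOCK.matcher(p.getFileName().toString()).matches())
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Root to scan; defaults to the JVM temp directory if none is given.
        Path root = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        for (Path p : findRddBlocks(root)) {
            System.out.println(p);
        }
    }
}
```

Pointing it at the spark-local-* directory above prints the rdd_* files for me. Note that the walk will fail if it hits a directory it cannot read, so it is best run against a specific directory rather than a whole filesystem.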

Where can I find the corresponding files when running under YARN?

Thanks
hemanth
