Hi, I am trying to find out where Spark persists RDDs when persist() is called and the job is executed under YARN. This is purely for my own understanding...
In my driver program, I wait indefinitely after the job finishes, so that nothing gets cleaned up. The job itself does roughly the following:

    JavaRDD<String> lines = context.textFile(args[0]);
    lines.persist(StorageLevel.DISK_ONLY());
    lines.collect();

When run with the local executor, I can see that the block files (like rdd_1_0) are persisted under directories like /var/folders/mt/51srrjc15wl3n829qkgnh2dm0000gp/T/spark-local-20150118201458-6147/15. Where can I find the equivalent files under YARN?

Thanks,
hemanth
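
For anyone trying to reproduce this, the following is a minimal sketch of one way to search a candidate directory on an executor's node for block files named like rdd_1_0. It uses only the Java standard library. The class name FindRddBlocks and the command-line argument are hypothetical, and the sketch does not know where Spark puts its local directories under YARN: the root to search has to be supplied by hand (for example one of the directories listed in the NodeManager's yarn.nodemanager.local-dirs setting, which is only a guess at where to start looking, not something confirmed in this thread).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Spark: lists files whose names start with "rdd_".
class FindRddBlocks {

    // Walks the tree under root and returns every regular file whose name
    // starts with "rdd_" (for example rdd_1_0), sorted by path.
    // Note: Files.walk may throw UncheckedIOException if a subdirectory
    // cannot be read, so run this as a user that can read the whole tree.
    static List<Path> findRddBlocks(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().startsWith("rdd_"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the directory to search; which directory to pass
        // under YARN is an assumption left to the reader.
        if (args.length != 1) {
            System.err.println("usage: java FindRddBlocks.java <root-dir>");
            System.exit(1);
        }
        for (Path p : findRddBlocks(Paths.get(args[0]))) {
            System.out.println(p);
        }
    }
}
```

Pointed at the spark-local-... directory from the local run, this should list the rdd_1_0 file described above; on a YARN node it only reports what exists under the directory it is given.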