I'm trying to perform operations on a large RDD that ends up being about 1.3
GB in memory when loaded in. It's being cached in memory during the first
operation, but when another task begins that uses the RDD, I'm getting this
error that says the RDD was lost:
14/06/30 09:48:17 INFO
I didn't fix the issue so much as work around it. I was running my cluster
locally, so using HDFS was just a preference. The code worked with the local
file system, so that's what I'm using until I can get some help.
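
For anyone comparing the two setups, here is a minimal sketch of one way to make the choice of file system explicit: pass a fully qualified URI instead of a bare path, so the result does not depend on whatever default file system the Hadoop configuration supplies. The helper name `qualify`, the directory `/user/me/target`, and the NameNode address `localhost:9000` are assumptions for illustration only, not values confirmed in this thread.

```python
def qualify(path, scheme, authority=""):
    """Return an absolute path as a URI with an explicit scheme.

    `qualify` is a hypothetical helper, not part of Spark or Hadoop.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path, got %r" % path)
    return "%s://%s%s" % (scheme, authority, path)


# Local file system: empty authority, so the URI has three slashes.
local_input = qualify("/user/me/target", "file")

# HDFS: the host and port are assumed here; substitute your own NameNode.
hdfs_input = qualify("/user/me/target", "hdfs", "localhost:9000")

print(local_input)
print(hdfs_input)
```

Either string could then be handed to `sc.wholeTextFiles(...)` in place of the bare path, which at least shows which file system the job is actually reading from.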
--
View this message in context:
I've been trying to figure out how to increase the heap space for my Spark
environment in 1.0.0, and everything I've found tells me I have to either
export something in Java Opts, which is deprecated in 1.0.0, or increase
spark.executor.memory, which is at 6G. I'm only trying to process about
I can write one if you'll point me to where I need to write it.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/wholeTextFiles-not-working-with-HDFS-tp7490p7737.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
My exception stack looks about the same.
java.io.FileNotFoundException: File /user/me/target/capacity-scheduler.xml does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
at