(- incubator list, + user list) (Answer copied from the original posting at http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Spark-app-throwing-java-lang-OutOfMemoryError-GC-overhead-limit/m-p/16396#U16396 -- let's follow up in one place. If it's not specific to CDH, this is a good place to record a solution.)
Where does the out-of-memory exception occur: in your driver, or in an executor? I assume it is an executor. Yes, you are using the default of 512MB per executor. You can raise that with properties like spark.executor.memory, or flags like --executor-memory if using spark-shell.

It sounds like your workers allocate 2GB for executors, so you could potentially use up to 2GB per executor, and your one executor per machine would then consume all of your Spark cluster memory. But more memory doesn't necessarily help if you're performing an operation that inherently allocates a great deal of memory; I'm not sure what your operations are. Keep in mind too that if you are caching RDDs in memory, that memory is taken away from what's available for computations.

On Mon, Aug 4, 2014 at 7:03 PM, buntu <buntu...@gmail.com> wrote:
> I got a 40-node CDH 5.1 cluster and am attempting to run a simple Spark app that
> processes about 10-15GB of raw data, but I keep running into this error:
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> Each node has 8 cores and 2GB memory. I notice the heap size on the
> executors is set to 512MB, with total heap size on each executor set to
> 2GB. Wanted to know what the heap size needs to be set to for such data
> sizes, and whether anyone has input on other config changes that would
> help as well.
>
> Thanks for the input!
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-app-throwing-java-lang-OutOfMemoryError-GC-overhead-limit-exceeded-tp11350.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
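For reference, here is a sketch of the ways to raise executor memory mentioned above. The property and flag names (spark.executor.memory, --executor-memory) are the standard Spark ones; the 2g value, the MyApp class name, and my-app.jar are placeholders for illustration -- size the value to what your workers actually have free.

```shell
# Option 1: cluster-wide default in conf/spark-defaults.conf
#   spark.executor.memory  2g

# Option 2: per session, for spark-shell
spark-shell --executor-memory 2g

# Option 3: per job, for spark-submit (flag and --conf forms are equivalent)
spark-submit --executor-memory 2g --class MyApp my-app.jar
spark-submit --conf spark.executor.memory=2g --class MyApp my-app.jar
```

Note that on a 2GB-per-node cluster, asking for 2g per executor leaves nothing for the OS or other daemons, so a somewhat smaller value is usually safer.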
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org