Re: Spark executor OOM issue on YARN
Hi, can you please post the stack trace with the exceptions, and also the command-line arguments you pass to spark-submit?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-executor-OOM-issue-on-YARN-tp24522p24530.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark executor OOM issue on YARN
Hi Ted, thanks. I know spark.sql.shuffle.partitions defaults to 200. It would be great if you could help me solve the OOM issue.

On Mon, Aug 31, 2015 at 11:43 PM, Ted Yu wrote:
> Please see this thread w.r.t. spark.sql.shuffle.partitions:
> http://search-hadoop.com/m/q3RTtE7JOv1bDJtY
>
> FYI
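For what it's worth, the two settings asked about in this thread govern different shuffle paths. A hedged sketch of how they are typically passed (Spark 1.x era, as in this thread; values and jar name are illustrative only, not a verified fix):

```shell
# Illustrative sketch only, not a recommendation.
# spark.default.parallelism: default partition count for plain-RDD shuffles
#   (reduceByKey, join, etc.); it does not affect Spark SQL queries.
# spark.sql.shuffle.partitions: partition count Spark SQL uses for its own
#   shuffles (e.g. the GROUP BY queries issued through hivecontext.sql());
#   it defaults to 200, and each reducer partition becomes one output
#   part file -- which is why setting it to 500 produced 500 part files.
spark-submit \
  --conf spark.default.parallelism=1000 \
  --conf spark.sql.shuffle.partitions=500 \
  your-job.jar   # hypothetical jar name
```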
Re: Spark executor OOM issue on YARN
Please see this thread w.r.t. spark.sql.shuffle.partitions:
http://search-hadoop.com/m/q3RTtE7JOv1bDJtY

FYI
Spark executor OOM issue on YARN
Hi, I have a Spark job whose executors hit OOM after some time; the job then hangs, followed by a couple of IOException, "Rpc client disassociated", "shuffle not found", etc.

I have tried almost everything and don't know how to solve this OOM issue; please guide me. Here is what I tried, but nothing worked:

- 60 executors with 12 GB / 2 cores each
- 30 executors with 20 GB / 2 cores each
- 40 executors with 30 GB / 6 cores each (I also tried 7 and 8 cores)
- setting spark.storage.memoryFraction to 0.2 to leave more room elsewhere (I also tried setting it to 0.0)
- setting spark.shuffle.memoryFraction to 0.4, since I need more shuffle memory
- setting spark.default.parallelism to 500, 1000, and 1500, but it did not help avoid the OOM; what is the ideal value for it?
- setting spark.sql.shuffle.partitions to 500, but that did not help either; it just creates 500 output part files. Please help me understand the difference between spark.default.parallelism and spark.sql.shuffle.partitions.

My data is skewed but not that large, and I don't understand why it is hitting OOM. I don't cache anything; I just have four GROUP BY queries that I run through hivecontext.sql(). I spawn around 1000 threads from the driver, and each thread executes these four queries.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-executor-OOM-issue-on-YARN-tp24522.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
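Taken together, the third attempt above would correspond to a spark-submit invocation roughly like the following (a sketch under the Spark 1.x legacy memory manager; the class and jar names are hypothetical):

```shell
# Sketch only -- mirrors the poster's 40-executor attempt, not a recommendation.
# Under the Spark 1.x legacy memory manager, storage and shuffle draw from
# separate fractions of the executor heap, so lowering
# spark.storage.memoryFraction (default 0.6) while raising
# spark.shuffle.memoryFraction (default 0.2) shifts heap toward shuffles.
spark-submit \
  --master yarn-cluster \
  --num-executors 40 \
  --executor-memory 30g \
  --executor-cores 6 \
  --conf spark.storage.memoryFraction=0.2 \
  --conf spark.shuffle.memoryFraction=0.4 \
  --class com.example.MyJob \
  my-job.jar
```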