Hi, thanks for the response. It looks like the YARN container is getting killed, but I don't know why; I see a shuffle MetadataFetchFailedException as described in the SO link below. I have enough memory: 8 nodes with 8 cores and 30 GB of memory each. Because of this MetadataFetchFailedException, YARN is killing the container running the executor. How can it overrun memory? I tried giving each executor 25 GB and it still isn't sufficient; the job fails. Please guide me, I don't understand what is going on. I am using Spark 1.4.0, with spark.shuffle.memoryFraction set to 0.0 and spark.storage.memoryFraction set to 0.5. I have set almost all of the usually recommended properties: the Kryo serializer, an Akka frame size of 500, and 20 Akka threads. I am stuck; I have been trying to recover from this issue for two days.
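For what it's worth, a MetadataFetchFailedException after executors are killed by YARN usually means the container exceeded its memory limit (heap plus off-heap overhead), so a sketch worth trying is to reserve explicit overhead rather than just growing --executor-memory. This is only a hedged example; the class name, jar path, and exact sizes below are placeholders from the earlier command, not a verified fix:

```shell
# Sketch for Spark 1.4 on YARN: give each container explicit off-heap headroom
# so YARN does not kill executors that exceed heap + default overhead.
./spark-submit --class com.xyz.MySpark \
  --master yarn-client \
  --driver-memory 3g \
  --num-executors 16 --executor-cores 4 --executor-memory 6g \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  --conf spark.shuffle.memoryFraction=0.2 \
  /home/myuser/myspark-1.0.jar
```

Note that spark.shuffle.memoryFraction=0.0 (as in your settings) leaves no memory for shuffle aggregation and forces constant spilling, so restoring it toward the 0.2 default may help as well.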
http://stackoverflow.com/questions/29850784/what-are-the-likely-causes-of-org-apache-spark-shuffle-metadatafetchfailedexcept

On Thu, Jul 30, 2015 at 9:56 PM, Ashwin Giridharan <ashwin.fo...@gmail.com> wrote:

> What is your cluster configuration (size and resources)?
>
> If you do not have enough resources, then your executor will not run.
> Moreover, allocating 8 cores to an executor is too much.
>
> If you have a cluster with four nodes running NodeManagers, each equipped
> with 4 cores and 8 GB of memory, then an optimal configuration would be:
>
> --num-executors 8 --executor-cores 2 --executor-memory 2G
>
> Thanks,
> Ashwin
>
> On Thu, Jul 30, 2015 at 12:08 PM, unk1102 <umesh.ka...@gmail.com> wrote:
>
>> Hi, I have one Spark job which runs fine locally with less data, but when
>> I schedule it on YARN I keep getting the following ERROR, and slowly all
>> executors get removed from the UI and my job fails:
>>
>> 15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 8 on
>> myhost1.com: remote Rpc client disassociated
>> 15/07/30 10:18:13 ERROR cluster.YarnScheduler: Lost executor 6 on
>> myhost2.com: remote Rpc client disassociated
>>
>> I use the following command to submit the Spark job in yarn-client mode:
>>
>> ./spark-submit --class com.xyz.MySpark \
>>   --conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=512M" \
>>   --driver-java-options -XX:MaxPermSize=512m \
>>   --driver-memory 3g --master yarn-client \
>>   --executor-memory 2G --executor-cores 8 --num-executors 12 \
>>   /home/myuser/myspark-1.0.jar
>>
>> I don't know what the problem is; please guide me. I am new to Spark.
>> Thanks in advance.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-control-Spark-Executors-from-getting-Lost-when-using-YARN-client-mode-tp24084.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>
> --
> Thanks & Regards,
> Ashwin Giridharan