Ping!! Has anybody tested graceful shutdown of a Spark Streaming job in yarn-cluster mode? It looks like a defect to me.
On Thu, May 12, 2016 at 12:53 PM Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:

> We are on spark 1.5.1.
> The change above was to add a shutdown hook. I am not adding a shutdown
> hook in code, so the built-in shutdown hook is being called.
> The driver signals that it is going to shut down gracefully, but the
> executor sees that the driver is dead and shuts down abruptly.
> Could this issue be related to yarn? I see correct behavior locally. I
> did "yarn kill ...." to kill the job.
>
> On Thu, May 12, 2016 at 12:28 PM Deepak Sharma <deepakmc...@gmail.com> wrote:
>
>> This is happening because the spark context shuts down without shutting
>> down the ssc first.
>> This was the behavior till spark 1.4 and was addressed in later releases:
>> https://github.com/apache/spark/pull/6307
>>
>> Which version of spark are you on?
>>
>> Thanks
>> Deepak
>>
>> On Thu, May 12, 2016 at 12:14 PM, Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:
>>
>>> Yes, it seems to be the case.
>>> In this case the executors should have continued logging values till
>>> 300, but they are shut down as soon as I do "yarn kill ......"
>>>
>>> On Thu, May 12, 2016 at 12:11 PM Deepak Sharma <deepakmc...@gmail.com> wrote:
>>>
>>>> So in your case, the driver is shutting down gracefully, but the
>>>> executors are not.
>>>> Is this the problem?
>>>>
>>>> Thanks
>>>> Deepak
>>>>
>>>> On Thu, May 12, 2016 at 11:49 AM, Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:
>>>>
>>>>> Yes, it is set to true.
>>>>> Log of driver:
>>>>>
>>>>> 16/05/12 10:18:29 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
>>>>> 16/05/12 10:18:29 INFO streaming.StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook
>>>>> 16/05/12 10:18:29 INFO scheduler.JobGenerator: Stopping JobGenerator gracefully
>>>>> 16/05/12 10:18:29 INFO scheduler.JobGenerator: Waiting for all received blocks to be consumed for job generation
>>>>> 16/05/12 10:18:29 INFO scheduler.JobGenerator: Waited for all received blocks to be consumed for job generation
>>>>>
>>>>> Log of executor:
>>>>>
>>>>> 16/05/12 10:18:29 ERROR executor.CoarseGrainedExecutorBackend: Driver xx.xx.xx.xx:xxxxx disassociated! Shutting down.
>>>>> 16/05/12 10:18:29 WARN remote.ReliableDeliverySupervisor: Association with remote system [xx.xx.xx.xx:xxxxx] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
>>>>> 16/05/12 10:18:29 INFO storage.DiskBlockManager: Shutdown hook called
>>>>> 16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -------------> 204   // this is the value I am logging
>>>>> 16/05/12 10:18:29 INFO util.ShutdownHookManager: Shutdown hook called
>>>>> 16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -------------> 205
>>>>> 16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -------------> 206
>>>>>
>>>>> On Thu, May 12, 2016 at 11:45 AM Deepak Sharma <deepakmc...@gmail.com> wrote:
>>>>>
>>>>>> Hi Rakesh
>>>>>> Did you try setting *spark.streaming.stopGracefullyOnShutdown* to
>>>>>> *true* in your spark configuration instance?
>>>>>> If not, try this and let us know if it helps.
>>>>>>
>>>>>> Thanks
>>>>>> Deepak
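For reference, a minimal sketch of setting that flag when building the streaming context, assuming Spark 1.5.x APIs (the app name and batch interval below are illustrative, not taken from the thread):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Ask Spark Streaming's built-in shutdown hook to stop the context
    // gracefully (finish in-flight batches) instead of immediately.
    val conf = new SparkConf()
      .setAppName("graceful-shutdown-test")                    // illustrative
      .set("spark.streaming.stopGracefullyOnShutdown", "true")

    val ssc = new StreamingContext(conf, Seconds(1))           // illustrative interval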
>>>>>> On Thu, May 12, 2016 at 11:42 AM, Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:
>>>>>>
>>>>>>> The issue I am having is similar to the one mentioned here:
>>>>>>> http://stackoverflow.com/questions/36911442/how-to-stop-gracefully-a-spark-streaming-application-on-yarn
>>>>>>>
>>>>>>> I am creating an RDD from the sequence 1 to 300 and creating a
>>>>>>> streaming DStream out of it:
>>>>>>>
>>>>>>> val rdd = ssc.sparkContext.parallelize(1 to 300)
>>>>>>> val dstream = new ConstantInputDStream(ssc, rdd)
>>>>>>> dstream.foreachRDD { rdd =>
>>>>>>>   rdd.foreach { x =>
>>>>>>>     log(x)
>>>>>>>     Thread.sleep(50)
>>>>>>>   }
>>>>>>> }
>>>>>>>
>>>>>>> When I kill this job, I expect elements 1 to 300 to be logged before
>>>>>>> it shuts down. That is indeed the case when I run it locally: it
>>>>>>> waits for the job to finish before shutting down.
>>>>>>>
>>>>>>> But when I launch the job on the cluster in "yarn-cluster" mode, it
>>>>>>> shuts down abruptly. The executor prints the following log
>>>>>>>
>>>>>>> ERROR executor.CoarseGrainedExecutorBackend:
>>>>>>> Driver xx.xx.xx.xxx:yyyyy disassociated! Shutting down.
>>>>>>>
>>>>>>> and then it shuts down. It is not a graceful shutdown.
>>>>>>>
>>>>>>> Does anybody know how to do this in yarn?
>>>>>>
>>>>>> --
>>>>>> Thanks
>>>>>> Deepak
>>>>>> www.bigdatabig.com
>>>>>> www.keosha.net
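For anyone hitting this on yarn-cluster: one commonly suggested workaround (a sketch only, under the assumption that you control the driver's main loop; the marker-file path is hypothetical) is to avoid "yarn kill" altogether and have the driver itself initiate the graceful stop when it sees an external signal, such as a marker file on HDFS:

    import org.apache.hadoop.fs.{FileSystem, Path}

    // Instead of killing the YARN application, the driver polls for a
    // marker file and stops the StreamingContext gracefully itself, so
    // executors are not torn down before in-flight batches finish.
    val stopMarker = new Path("/tmp/stop-my-streaming-app")  // hypothetical path
    val fs = FileSystem.get(ssc.sparkContext.hadoopConfiguration)

    ssc.start()
    var stopped = false
    while (!stopped) {
      // awaitTerminationOrTimeout returns true if the context terminated.
      stopped = ssc.awaitTerminationOrTimeout(10000)
      if (!stopped && fs.exists(stopMarker)) {
        // Drain pending batches, then stop the SparkContext as well.
        ssc.stop(stopSparkContext = true, stopGracefully = true)
        stopped = true
      }
    }

To stop the job you would then create the marker file (for example with "hadoop fs -touchz /tmp/stop-my-streaming-app") rather than killing the application.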