There are different reasons, but most commonly it happens when the framework asks Mesos to kill the task.
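For example, when the job is submitted through Marathon: if the app definition carries a health check, Marathon will ask Mesos to kill the task once maxConsecutiveFailures checks in a row have failed, and then reschedule it, possibly on another node. From the driver's side that just looks like an unexplained SIGTERM. A minimal sketch of such an app definition, reusing the run-mesos.sh command from your logs; the app id, health-check path, and thresholds are made up for illustration:

    {
      "id": "/spark/streaming-driver",
      "cmd": "sh ./run-mesos.sh application-ts.conf",
      "cpus": 1.0,
      "mem": 2048,
      "instances": 1,
      "healthChecks": [
        {
          "protocol": "HTTP",
          "path": "/health",
          "gracePeriodSeconds": 300,
          "intervalSeconds": 60,
          "timeoutSeconds": 20,
          "maxConsecutiveFailures": 3
        }
      ]
    }

If the driver doesn't actually serve anything on the configured path, or takes longer than gracePeriodSeconds to come up, Marathon will keep killing and restarting it, so checking whether the job's app definition has a health check (and whether it can ever pass) is a good first place to dig.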
Can you provide some easy repro steps/artifacts? I've been working on Spark
on Mesos these days and can help try this out.

Tim

On Mon, Dec 1, 2014 at 2:43 PM, Gerard Maas <[email protected]> wrote:

> Hi,
>
> Sorry if this has been discussed before. I'm new to the list.
>
> We are currently running our Spark + Spark Streaming jobs on Mesos,
> submitting our jobs through Marathon.
>
> We see with some regularity that the Spark Streaming driver gets killed
> by Mesos and then restarted on some other node by Marathon.
>
> I've no clue why Mesos is killing the driver, and looking at both the
> Mesos and Spark logs didn't make me any wiser.
>
> On the Spark Streaming driver logs, I find this entry of Mesos "signing
> off" my driver:
>
>> Shutting down
>> Sending SIGTERM to process tree at pid 17845
>> Killing the following process trees:
>> [
>> -+- 17845 sh -c sh ./run-mesos.sh application-ts.conf
>>  \-+- 17846 sh ./run-mesos.sh application-ts.conf
>>   \--- 17847 java -cp core-compute-job.jar -Dconfig.file=application-ts.conf com.compute.job.FooJob 31326
>> ]
>> Command terminated with signal Terminated (pid: 17845)
>
> What would be the reasons for Mesos to kill an executor?
> Has anybody seen something similar? Any hints on where to start digging?
>
> -kr, Gerard.
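Whatever the cause turns out to be, the log excerpt above shows the teardown is a plain SIGTERM to the driver's process tree, so it can help to register a JVM shutdown hook that stops the StreamingContext gracefully before the process exits. A rough sketch of what that could look like in the driver's main (object name, app name, and batch interval are placeholders, not the actual FooJob):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingDriverSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("streaming-driver-sketch")
        val ssc = new StreamingContext(conf, Seconds(10))

        // ... define input DStreams and output operations here ...

        // Mesos tears the task down with SIGTERM (as in the log above), so
        // stop streaming gracefully from a shutdown hook before the JVM exits.
        sys.addShutdownHook {
          ssc.stop(stopSparkContext = true, stopGracefully = true)
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

Keep in mind that Mesos escalates to SIGKILL after its executor shutdown grace period, so a graceful stop only has that long to finish in-flight batches.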

