FYI, an update has a similar outcome: tasks may move to a different machine when they are restarted in the course of an update.
On Saturday, June 25, 2016, Ziliang Chen <[email protected]> wrote:

> Found the issue in the code: when updating the job, I first did a
> kill. Thanks Bill/Erb!
>
> On Sun, Jun 26, 2016 at 1:09 AM, Bill Farner <[email protected]> wrote:
>
>> Entering the KILLING state suggests that a user issued a kill command for
>> the service. Does that sound plausible?
>>
>> On Saturday, June 25, 2016, Ziliang Chen <[email protected]> wrote:
>>
>>> Instructed KILL.
>>>
>>> 4 minutes ago - KILLED : Instructed to kill task.
>>>
>>> - 06/25 22:32:23 LOCAL • PENDING
>>> - 06/25 22:33:06 LOCAL • ASSIGNED
>>> - 06/25 22:33:07 LOCAL • STARTING • Initializing sandbox.
>>> - 06/25 22:33:09 LOCAL • RUNNING
>>> - 06/25 22:42:15 LOCAL • KILLING • Killed by UNSECURE
>>> - 06/25 22:42:18 LOCAL • KILLED • Instructed to kill task.
>>>
>>> On Sat, Jun 25, 2016 at 9:55 PM, Erb, Stephan <[email protected]> wrote:
>>>
>>>> When you go to the scheduler website, you should be able to expand the
>>>> task event history of a terminated instance (by clicking on the + icon).
>>>> What does it say there?
>>>>
>>>> *From: *Ziliang Chen <[email protected]>
>>>> *Reply-To: *"[email protected]" <[email protected]>
>>>> *Date: *Saturday 25 June 2016 at 15:08
>>>> *To: *"[email protected]" <[email protected]>
>>>> *Subject: *Re: Prevent service Job moved from one machine to another
>>>> periodically
>>>>
>>>> Hi Erb,
>>>>
>>>> As always, I appreciate your quick response!
>>>>
>>>> With your statements, I can understand Aurora's philosophy absolutely.
>>>> But in my case, my service program is up and running in a good state, and
>>>> it seems that the Aurora scheduler kills my service program periodically
>>>> and moves it to another machine. I expect my service program to keep
>>>> running there forever unless there is a restart, crash, etc.
>>>>
>>>> On Sat, Jun 25, 2016 at 8:27 PM, Erb, Stephan <[email protected]> wrote:
>>>>
>>>> Hi Zi-Liang,
>>>>
>>>> by default, services in Aurora are not pinned to a particular machine.
>>>> This is based on the philosophy that services should be stateless and thus
>>>> not dependent on a particular host, if possible.
>>>>
>>>> Whenever an instance/task of your service has terminated, the scheduler
>>>> might pick any other random machine to launch a replacement. There are many
>>>> reasons why this could happen:
>>>>
>>>> · Your instance has crashed, ran out of memory, or simply
>>>> exited normally.
>>>>
>>>> · If enabled, your health checks may have detected that the
>>>> instance is no longer responding.
>>>>
>>>> · The agent machine it was running on failed or lost
>>>> connectivity with Mesos.
>>>>
>>>> · You have used the aurora_admin client to drain a machine.
>>>>
>>>> · You used a client command such as restart or update.
>>>>
>>>> If necessary, you could use constraints [1] to force Aurora to always
>>>> schedule a service on the same host. However, this is not really
>>>> recommended as it can easily lead to situations where your service cannot
>>>> be launched at all, due to missing resources of the selected host in
>>>> question.
>>>>
>>>> [1]
>>>> https://github.com/apache/aurora/blob/master/docs/features/constraints.md
>>>>
>>>> Best regards,
>>>>
>>>> Stephan
>>>>
>>>> *From: *Ziliang Chen <[email protected]>
>>>> *Reply-To: *"[email protected]" <[email protected]>
>>>> *Date: *Saturday 25 June 2016 at 13:08
>>>> *To: *"[email protected]" <[email protected]>
>>>> *Subject: *Prevent service Job moved from one machine to another
>>>> periodically
>>>>
>>>> Hi,
>>>>
>>>> I have a "service" job scheduled by Aurora. I found that periodically, the
>>>> service job is moved from one machine to another (stopped on the previous
>>>> machine and restarted on another one). May I ask if this is expected
>>>> behavior and, if it is, how to make the service job stick to one machine
>>>> unless there is a failure?
>>>>
>>>> Thank you very much!
>>>>
>>>> --
>>>>
>>>> Regards, Zi-Liang
>>>>
>>>> Mail:[email protected]
>>>
>>> --
>>> Regards, Zi-Liang
>>>
>>> Mail:[email protected]
>
> --
> Regards, Zi-Liang
>
> Mail:[email protected]
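
For anyone hitting the same symptom: the check described in this thread (expand the task event history and look for a KILLING entry) can be sketched in a few lines of Python. This is only an illustration, assuming the history has been copied as text from the scheduler page in the format quoted above. The names parse_events and was_kill_requested are made up for this sketch and are not part of Aurora, and treating KILLING as a sign of a requested kill follows Bill's remark rather than any formal rule.

```python
import re

# Matches one line of the task event history as quoted in this thread, e.g.
#   - 06/25 22:42:15 LOCAL • KILLING • Killed by UNSECURE
# The layout is an assumption based on that pasted text only.
EVENT_RE = re.compile(
    r"^\s*-?\s*(?P<ts>\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+LOCAL\s+•\s+"
    r"(?P<state>[A-Z]+)(?:\s+•\s+(?P<msg>.*))?$"
)


def parse_events(text):
    """Return a list of (timestamp, state, message) tuples, in input order."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.match(line)
        if match:
            message = (match.group("msg") or "").strip()
            events.append((match.group("ts"), match.group("state"), message))
    return events


def was_kill_requested(events):
    """True if the history contains a KILLING transition.

    Per the discussion above, this suggests that a kill command was issued
    for the task (here it came from the job update script itself).
    """
    return any(state == "KILLING" for _, state, _ in events)


HISTORY = """\
- 06/25 22:32:23 LOCAL • PENDING
- 06/25 22:33:06 LOCAL • ASSIGNED
- 06/25 22:33:07 LOCAL • STARTING • Initializing sandbox.
- 06/25 22:33:09 LOCAL • RUNNING
- 06/25 22:42:15 LOCAL • KILLING • Killed by UNSECURE
- 06/25 22:42:18 LOCAL • KILLED • Instructed to kill task.
"""

events = parse_events(HISTORY)
print(was_kill_requested(events))
```

If the check comes back true, the first place to look is whatever client commands ran around that timestamp (kill, restart, update), as happened here.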
