FYI, an update has a similar outcome: tasks may move to a different machine
when they are restarted in the course of an update.
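
For anyone finding this thread later, here is a minimal sketch of reading the task event history quoted further down. It assumes the lines look exactly like the ones pasted from the scheduler UI in this thread. The names parse_history and killed_on_request are made up for illustration and are not part of Aurora. The check follows the rule of thumb from this thread that a KILLING state suggests someone issued a kill command.

```python
import re

# Event lines as pasted from the scheduler UI later in this thread.
HISTORY = """\
   - 06/25 22:32:23 LOCAL • PENDING
   - 06/25 22:33:06 LOCAL • ASSIGNED
   - 06/25 22:33:07 LOCAL • STARTING • Initializing sandbox.
   - 06/25 22:33:09 LOCAL • RUNNING
   - 06/25 22:42:15 LOCAL • KILLING • Killed by UNSECURE
   - 06/25 22:42:18 LOCAL • KILLED • Instructed to kill task.
"""

# Assumed line shape: "<MM/DD HH:MM:SS> <zone> • <STATE>[ • <message>]"
EVENT_RE = re.compile(
    r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \w+ • (\w+)(?: • (.*))?$"
)


def parse_history(text):
    """Return (timestamp, state, message) tuples for each recognizable line."""
    events = []
    for raw in text.splitlines():
        # Drop surrounding whitespace and the leading "- " list marker.
        line = raw.strip().lstrip("- ")
        match = EVENT_RE.match(line)
        if match:
            timestamp, state, message = match.groups()
            events.append((timestamp, state, message or ""))
    return events


def killed_on_request(events):
    """Heuristic from this thread: a KILLING state suggests a kill command."""
    return any(state == "KILLING" for _, state, _ in events)


if __name__ == "__main__":
    events = parse_history(HISTORY)
    for timestamp, state, message in events:
        print(timestamp, state, message)
    print("kill requested:", killed_on_request(events))
```

If the check comes back true, it is worth looking at the client commands or scripts that ran around the KILLING timestamp before suspecting the scheduler.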

On Saturday, June 25, 2016, Ziliang Chen <[email protected]> wrote:

> Found the issue in the code: when updating the job, I first issued a
> kill. Thanks Bill/Erb!
>
> On Sun, Jun 26, 2016 at 1:09 AM, Bill Farner <[email protected]> wrote:
>
>> Entering the KILLING state suggests that a user issued a kill command for
>> the service.  Does that sound plausible?
>>
>>
>> On Saturday, June 25, 2016, Ziliang Chen <[email protected]> wrote:
>>
>>> Instructed KILL.
>>>
>>>  4 minutes ago - KILLED : Instructed to kill task.
>>>
>>>    - 06/25 22:32:23 LOCAL • PENDING
>>>    - 06/25 22:33:06 LOCAL • ASSIGNED
>>>    - 06/25 22:33:07 LOCAL • STARTING • Initializing sandbox.
>>>    - 06/25 22:33:09 LOCAL • RUNNING
>>>    - 06/25 22:42:15 LOCAL • KILLING • Killed by UNSECURE
>>>    - 06/25 22:42:18 LOCAL • KILLED • Instructed to kill task.
>>>
>>>
>>> On Sat, Jun 25, 2016 at 9:55 PM, Erb, Stephan <
>>> [email protected]> wrote:
>>>
>>>> When you go to the scheduler website, you should be able to expand the
>>>> task event history of a terminated instance (by clicking on the + icon).
>>>> What does it say there?
>>>>
>>>>
>>>>
>>>> *From: *Ziliang Chen <[email protected]>
>>>> *Reply-To: *"[email protected]" <[email protected]>
>>>> *Date: *Saturday 25 June 2016 at 15:08
>>>> *To: *"[email protected]" <[email protected]>
>>>> *Subject: *Re: Prevent service Job moved from one machine to another
>>>> periodically
>>>>
>>>>
>>>>
>>>> Hi Erb,
>>>>
>>>>
>>>>
>>>> As always, I appreciate your quick response!
>>>>
>>>> From your explanation, I fully understand Aurora's philosophy. But in my
>>>> case, the service program is up and running in a good state, yet the
>>>> Aurora scheduler seems to kill it periodically and move it to another
>>>> machine. I expect the service to keep running on the same machine unless
>>>> there is a restart, crash, etc.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Jun 25, 2016 at 8:27 PM, Erb, Stephan <
>>>> [email protected]> wrote:
>>>>
>>>> Hi Zi-Liang,
>>>>
>>>>
>>>>
>>>> By default, services in Aurora are not pinned to a particular machine.
>>>> This is based on the philosophy that services should be stateless and thus
>>>> not dependent on a particular host, if possible.
>>>>
>>>>
>>>>
>>>> Whenever an instance/task of your service has terminated, the scheduler
>>>> might pick any other random machine to launch a replacement. There are many
>>>> reasons why this could happen:
>>>>
>>>>
>>>>
>>>>    - Your instance has crashed, run out of memory, or simply exited
>>>>      normally.
>>>>    - If enabled, your health checks may have detected that the instance
>>>>      is no longer responding.
>>>>    - The agent machine it was running on failed or lost connectivity
>>>>      with Mesos.
>>>>    - You have used the aurora_admin client to drain a machine.
>>>>    - You used a client command such as restart or update.
>>>>
>>>>
>>>>
>>>> If necessary, you could use constraints [1] to force Aurora to always
>>>> schedule a service on the same host. However, this is not really
>>>> recommended as it can easily lead to situations where your service cannot
>>>> be launched at all, due to missing resources on the selected host.
>>>>
>>>>
>>>>
>>>> [1]
>>>> https://github.com/apache/aurora/blob/master/docs/features/constraints.md
>>>>
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Stephan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From: *Ziliang Chen <[email protected]>
>>>> *Reply-To: *"[email protected]" <[email protected]>
>>>> *Date: *Saturday 25 June 2016 at 13:08
>>>> *To: *"[email protected]" <[email protected]>
>>>> *Subject: *Prevent service Job moved from one machine to another
>>>> periodically
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> I have a "service" job scheduled by Aurora. I found that the service job
>>>> is periodically moved from one machine to another (stopped on the previous
>>>> machine and restarted on another one). May I ask whether this is expected
>>>> behavior and, if it is, how to make the service job stick to one machine
>>>> unless there is a failure?
>>>>
>>>>
>>>>
>>>> Thank you very much!
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Regards, Zi-Liang
>>>>
>>>> Mail:[email protected]
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Regards, Zi-Liang
>>>>
>>>> Mail:[email protected]
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards, Zi-Liang
>>>
>>> Mail:[email protected]
>>>
>>
>
>
> --
> Regards, Zi-Liang
>
> Mail:[email protected]
>
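
As a rough illustration of the constraints option mentioned earlier in the thread, the sketch below shows what the two kinds of constraint values might look like as plain Python dicts, since Aurora job configs are written in Python. The hostname is a placeholder, the variable names are made up, and it assumes the agents expose a host attribute to the scheduler. Check the constraints document linked above for the authoritative syntax before relying on it.

```python
# Hypothetical hostname; replace with the agent the service should stick to.
PINNED_HOST = "agent-01.example.com"

# Value constraint (assumption: agents expose a 'host' attribute):
# only schedule on the agent whose 'host' attribute equals PINNED_HOST.
pinned_constraints = {"host": PINNED_HOST}

# Limit constraint: run at most one instance of the job per host.
spread_constraints = {"host": "limit:1"}

# Either dict would be supplied as the job's constraints setting in the
# .aurora config file; that wiring is omitted here.
```

As noted in the thread, pinning trades flexibility for placement: if the chosen host lacks resources or is down, the service cannot be launched anywhere else.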
