Martin Bydzovsky created MESOS-4279:
---------------------------------------

             Summary: Graceful restart of docker task
                 Key: MESOS-4279
                 URL: https://issues.apache.org/jira/browse/MESOS-4279
             Project: Mesos
          Issue Type: Bug
          Components: containerization, docker
    Affects Versions: 0.25.0
            Reporter: Martin Bydzovsky


I'm implementing a graceful restarts of our mesos-marathon-docker setup and I 
came to a following issue:

(it was already discussed on https://github.com/mesosphere/marathon/issues/2876 
and guys form mesosphere got to a point that its probably docker containerizer 
problem...)
To sum it up:

When i deploy simple python script to all mesos-slaves:
{code}
#!/usr/bin/python

from time import sleep
import signal
import sys
import datetime

def sigterm_handler(_signo, _stack_frame):
    print "got %i" % _signo
    print datetime.datetime.now().time()
    sys.stdout.flush()
    sleep(2)
    print datetime.datetime.now().time()
    print "ending"
    sys.stdout.flush()
    sys.exit(0)

signal.signal(signal.SIGTERM, sigterm_handler)
signal.signal(signal.SIGINT, sigterm_handler)

try:
    print "Hello"
    i = 0
    while True:
        i += 1
        print datetime.datetime.now().time()
        print "Iteration #%i" % i
        sys.stdout.flush()
        sleep(1)
finally:
    print "Goodbye"
{code}

and I run it through Marathon like
{code:javascript}
data = {
        args: ["/tmp/script.py"],
        instances: 1,
        cpus: 0.1,
        mem: 256,
        id: "marathon-test-api"
}
{code}

During app restart I get expected result - task receives sigterm and dies 
peacefully (during my script-specified 2 seconds)

But when i wrap this python script in docker:
{code}
FROM node:4.2

RUN mkdir /app
ADD . /app
WORKDIR /app
ENTRYPOINT []
{code}
and run appropriate application by Marathon:
{code:javascript}
data = {
        args: ["./script.py"],
        container: {
                type: "DOCKER",
                docker: {
                        image: "bydga/marathon-test-api"
                },
                forcePullImage: yes
        },
        cpus: 0.1,
        mem: 256,
        instances: 1,
        id: "marathon-test-api"
}
{code}

The task during restart (issued from marathon) dies immediately without a 
chance to do any cleanup.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to