[
https://issues.apache.org/jira/browse/MESOS-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096389#comment-15096389
]
Martin Bydzovsky commented on MESOS-4279:
-----------------------------------------
Put simply: NO, it doesn't work. :)
I have now created a completely fresh Ubuntu 14.04 server VM in VirtualBox,
installed from
http://releases.ubuntu.com/14.04.3/ubuntu-14.04.3-server-amd64.iso
Then I followed the step-by-step guide to compile and run Mesos from the current git
master branch (http://mesos.apache.org/gettingstarted/):
I cloned the repo and apt-get installed git, openjdk, build-essential,
python, libxxx, and so on (as listed in the guide).
Then I ran:
{code}
$ cd mesos
$ ./bootstrap
$ mkdir build
$ cd build
$ ../configure
$ make
{code}
After everything compiled OK, I started the master and the slave:
{code}
./mesos-master.sh --work_dir=/tmp --advertise_ip=192.168.59.4
--hostname=192.168.59.4
{code}
{code}
./mesos-slave.sh --master=192.168.59.4:5050 --containerizers=mesos,docker
--docker_stop_timeout=10secs --isolation=cgroups/cpu,cgroups/mem
--advertise_ip=192.168.59.4 --no-switch_user --hostname=192.168.59.4
{code}
Then I started the "standalone" app, which runs the previously mentioned Python
script at /tmp/script.py:
{code:title=standalone.coffee}
request = require "request"
data =
  args: ["/tmp/script.py"]
  cpus: 0.1
  mem: 256
  instances: 1
  id: "python-standalone"
request
  method: "post"
  url: "http://localhost:8080/v2/apps"
  json: data
, (e, r, b) ->
  console.log "err", e if e
{code}
Graceful restarts work perfectly with this one!
Then I created the python-docker app:
{code:title=docker.coffee}
request = require "request"
data =
  args: ["./script.py"]
  container:
    type: "DOCKER"
    docker:
      image: "bydga/marathon-test-api"
  cpus: 0.1
  mem: 256
  instances: 1
  id: "python-docker"
request
  method: "post"
  url: "http://localhost:8080/v2/apps"
  json: data
, (e, r, b) ->
  console.log "err", e if e
{code}
And, as expected, this one is *NOT working*.
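For anyone following along without CoffeeScript, here is a minimal Python sketch of the same two app definitions. It only builds the POST requests and does not send them. The helper names `app_definition` and `build_request` are made up for illustration; the endpoint and field values are copied from the snippets above.

```python
import json
import urllib.request

# Same Marathon endpoint as in the CoffeeScript snippets.
MARATHON_APPS_URL = "http://localhost:8080/v2/apps"


def app_definition(app_id, args, docker_image=None):
    """Build an app definition; add a DOCKER container block if an image is given."""
    app = {
        "id": app_id,
        "args": args,
        "cpus": 0.1,
        "mem": 256,
        "instances": 1,
    }
    if docker_image is not None:
        app["container"] = {"type": "DOCKER", "docker": {"image": docker_image}}
    return app


def build_request(app):
    """Prepare (but do not send) the POST request for an app definition."""
    body = json.dumps(app).encode("utf-8")
    return urllib.request.Request(
        MARATHON_APPS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


standalone = app_definition("python-standalone", ["/tmp/script.py"])
docker = app_definition("python-docker", ["./script.py"], "bydga/marathon-test-api")

# Keys whose values differ between the two definitions.
differing = sorted(
    key for key in set(standalone) | set(docker)
    if standalone.get(key) != docker.get(key)
)
print(differing)  # ['args', 'container', 'id']
```

Passing either request to `urllib.request.urlopen` would submit it, assuming Marathon is listening on localhost:8080 as in the setup above. Apart from the id and the script path, the only difference between the two apps is the container block.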
You can see that the final states of the tasks differ: KILLED vs. FINISHED. I would
expect FINISHED in both cases. http://prntscr.com/9pm28x
See the screenshot of the stdout output of both tasks:
http://prntscr.com/9pm1ny
The complete logs of the master, slave and Marathon runs are here:
https://gist.github.com/bydga/f8e907d4c59bbcab726e
I'm getting really desperate now. Can someone confirm that they are able to
reproduce the above?
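One way to rule out the script itself is to reproduce the stop sequence locally, outside Mesos and Docker. The sketch below is a rough Python 3 illustration, not how Mesos or Docker implement it: the child source is a shortened, hypothetical stand-in for /tmp/script.py, and `stop_gracefully` and `run_check` are illustrative names. It sends SIGTERM, waits for a grace period, and only falls back to SIGKILL if the process is still alive. POSIX only.

```python
import signal
import subprocess
import sys
import textwrap

# Hypothetical stand-in for /tmp/script.py: Python 3, shorter cleanup delay.
CHILD_SOURCE = textwrap.dedent("""
    import signal, sys, time

    def sigterm_handler(signo, frame):
        print("got %i" % signo, flush=True)
        time.sleep(0.2)  # simulated cleanup work
        print("ending", flush=True)
        sys.exit(0)

    signal.signal(signal.SIGTERM, sigterm_handler)
    print("Hello", flush=True)
    while True:
        time.sleep(0.1)
""")


def stop_gracefully(proc, grace_seconds):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
        return "killed"


def run_check(grace_seconds=5.0):
    """Start the child, stop it, and report what happened."""
    proc = subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        text=True,
    )
    # "Hello" is printed after the handler is installed, so it is safe to signal.
    first_line = proc.stdout.readline().strip()
    outcome = stop_gracefully(proc, grace_seconds)
    remaining = proc.stdout.read().splitlines()
    proc.stdout.close()
    return first_line, outcome, proc.returncode, remaining


if __name__ == "__main__":
    print(run_check(grace_seconds=5.0))   # enough time for the cleanup
    print(run_check(grace_seconds=0.01))  # too short, ends in SIGKILL
```

With a sufficient grace period the child exits with code 0 after printing "ending", which corresponds to the standalone behaviour described above. With a grace period shorter than the cleanup, the child is killed before it finishes, which resembles what the docker task shows.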
> Graceful restart of docker task
> -------------------------------
>
> Key: MESOS-4279
> URL: https://issues.apache.org/jira/browse/MESOS-4279
> Project: Mesos
> Issue Type: Bug
> Components: containerization, docker
> Affects Versions: 0.25.0
> Reporter: Martin Bydzovsky
> Assignee: Qian Zhang
>
> I'm implementing graceful restarts for our mesos-marathon-docker setup and I
> ran into the following issue:
> (it was already discussed at
> https://github.com/mesosphere/marathon/issues/2876 and the guys from Mesosphere
> concluded that it's probably a docker containerizer problem...)
> To sum it up:
> When I deploy a simple Python script to all mesos-slaves:
> {code}
> #!/usr/bin/python
> from time import sleep
> import signal
> import sys
> import datetime
> def sigterm_handler(_signo, _stack_frame):
>     print "got %i" % _signo
>     print datetime.datetime.now().time()
>     sys.stdout.flush()
>     sleep(2)
>     print datetime.datetime.now().time()
>     print "ending"
>     sys.stdout.flush()
>     sys.exit(0)
> signal.signal(signal.SIGTERM, sigterm_handler)
> signal.signal(signal.SIGINT, sigterm_handler)
> try:
>     print "Hello"
>     i = 0
>     while True:
>         i += 1
>         print datetime.datetime.now().time()
>         print "Iteration #%i" % i
>         sys.stdout.flush()
>         sleep(1)
> finally:
>     print "Goodbye"
> {code}
> and I run it through Marathon like
> {code:javascript}
> data = {
>   args: ["/tmp/script.py"],
>   instances: 1,
>   cpus: 0.1,
>   mem: 256,
>   id: "marathon-test-api"
> }
> {code}
> During the app restart I get the expected result: the task receives SIGTERM and
> dies peacefully (within the 2-second period specified in my script).
> But when I wrap this Python script in a Docker image:
> {code}
> FROM node:4.2
> RUN mkdir /app
> ADD . /app
> WORKDIR /app
> ENTRYPOINT []
> {code}
> and run the corresponding application through Marathon:
> {code:javascript}
> data = {
>   args: ["./script.py"],
>   container: {
>     type: "DOCKER",
>     docker: {
>       image: "bydga/marathon-test-api"
>     },
>     forcePullImage: yes
>   },
>   cpus: 0.1,
>   mem: 256,
>   instances: 1,
>   id: "marathon-test-api"
> }
> {code}
> During a restart (issued from Marathon), the task dies immediately without
> having a chance to do any cleanup.