[jira] [Commented] (MESOS-5893) mesos-executor terminated forked children turn to zombies

JIRA Mon, 25 Jul 2016 09:16:48 -0700

    [ 
https://issues.apache.org/jira/browse/MESOS-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392197#comment-15392197
 ]


Stéphane Cottin commented on MESOS-5893:
----------------------------------------

marathon-lb uses runit.

Dockerfile: https://github.com/mesosphere/marathon-lb/blob/master/Dockerfile

The entrypoint script:
https://github.com/mesosphere/marathon-lb/blob/master/run
launches runsv, which runs
https://github.com/mesosphere/marathon-lb/blob/master/service/haproxy/run

When SIGHUP is trapped, a haproxy hot reconfiguration happens.

From the haproxy documentation:
{quote}
2.4.1) Hot reconfiguration
--------------------------
The '-st' and '-sf' command line options are used to inform previously running
processes that a configuration is being reloaded. They will receive the SIGTTOU
signal to ask them to temporarily stop listening to the ports so that the new
process can grab them. If anything wrong happens, the new process will send
them a SIGTTIN to tell them to re-listen to the ports and continue their normal
work. Otherwise, it will either ask them to finish (-sf) their work then softly
exit, or immediately terminate (-st), breaking existing sessions. A typical use
of this allows a configuration reload without service interruption :

 # haproxy -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid)
{quote}
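
To make the reload step concrete, here is a minimal sketch of how such a command line could be assembled. It is not taken from the marathon-lb run script: the function name build_reload_command and the handling of a missing pid file are assumptions for illustration. Only the haproxy flags (-p, -f, -D, -sf) come from the documentation quoted above and from the process listing in the issue.

```python
from pathlib import Path


def build_reload_command(pidfile, config):
    """Build an argv list for a haproxy hot reconfiguration.

    Hypothetical helper: the PIDs of the running haproxy processes are
    read from `pidfile` and passed to -sf, so the old processes finish
    their work and then exit.
    """
    path = Path(pidfile)
    # Assumption: on the very first start the pid file may not exist yet.
    old_pids = path.read_text().split() if path.exists() else []
    command = ["haproxy", "-p", str(pidfile), "-f", str(config), "-D"]
    if old_pids:
        command += ["-sf", *old_pids]
    return command
```

Each reload therefore starts a new haproxy and leaves the old one running until its last connection closes, which is the point where the old process exits and has to be reaped by its parent.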

> mesos-executor terminated forked children turn to zombies
> ---------------------------------------------------------
>
>                 Key: MESOS-5893
>                 URL: https://issues.apache.org/jira/browse/MESOS-5893
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>    Affects Versions: 1.1.0
>         Environment: mesos compiled from git master ( 1.1.0 ) 
> {{../configure --enable-ssl --enable-libevent --prefix=/usr --enable-optimize 
> --enable-silent-rules --enable-xfs-disk-isolator}}
> isolators : 
> {{namespaces/pid,cgroups/cpu,cgroups/mem,filesystem/linux,docker/runtime,network/cni,docker/volume}}
>            Reporter: Stéphane Cottin
>              Labels: containerizer
>
> The Mesos containerizer does not properly handle the death of child processes.
> This was discovered using marathon-lb: each topology update forks another
> haproxy. The old haproxy process should die cleanly after its last client
> connection is terminated, but it turns into a zombie instead.
> {noformat}
>  7716 ?        Ssl    0:00  |       \_ mesos-executor 
> --launcher_dir=/usr/libexec/mesos --sandbox_directory=/mnt/mesos/sandbox 
> --user=root --working_directory=/marathon-lb 
> --rootfs=/mnt/mesos/provisioner/containers/3b381d5c-7490-4dcd-ab4b-81051226075a/backends/overlay/rootfses/a4beacac-2d7e-445b-80c8-a9b4e480c491
>  7813 ?        Ss     0:00  |       |   \_ sh -c /marathon-lb/run sse 
> --marathon https://marathon:8443 --auth-credentials user:pass --group 
> 'external' --ssl-certs /certs --max-serv-port-ip-per-task 20050
>  7823 ?        S      0:00  |       |   |   \_ /bin/bash /marathon-lb/run sse 
> --marathon https://marathon:8443 --auth-credentials user:pass --group 
> external --ssl-certs /certs --max-serv-port-ip-per-task 20050
>  7827 ?        S      0:00  |       |   |       \_ /usr/bin/runsv 
> /marathon-lb/service/haproxy
>  7829 ?        S      0:00  |       |   |       |   \_ /bin/bash ./run
>  8879 ?        S      0:00  |       |   |       |       \_ sleep 0.5
>  7828 ?        Sl     0:00  |       |   |       \_ python3 
> /marathon-lb/marathon_lb.py --syslog-socket /dev/null --haproxy-config 
> /marathon-lb/haproxy.cfg --ssl-certs /certs --command sv reload 
> /marathon-lb/service/haproxy --sse --marathon https://marathon:8443 
> --auth-credentials user:pass --group external --max-serv-port-ip-per-task 
> 20050
>  7906 ?        Zs     0:00  |       |   \_ [haproxy] <defunct>
>  8628 ?        Zs     0:00  |       |   \_ [haproxy] <defunct>
>  8722 ?        Ss     0:00  |       |   \_ haproxy -p /tmp/haproxy.pid -f 
> /marathon-lb/haproxy.cfg -D -sf 144 52
> {noformat}
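
For reference, a zombie is a child that has exited but whose parent has not yet collected its exit status. In the listing above the defunct haproxy processes appear as direct children of mesos-executor, which suggests they were reparented to it after daemonizing. The following is a minimal sketch of non-blocking reaping, assuming a POSIX system; the function names are hypothetical and this is not the Mesos implementation.

```python
import os
import time


def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG returns immediately.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


def spawn_short_lived_child():
    """Fork a child that exits at once; it stays a zombie until reaped."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit without running parent cleanup code
    return pid


if __name__ == "__main__":
    child = spawn_short_lived_child()
    time.sleep(0.5)  # the child is <defunct> in ps during this window
    print("reaped:", reap_children(), "expected to include", child)
```

A long-running parent would typically call something like reap_children() from its event loop or in response to SIGCHLD, so that exited children do not accumulate.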



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
