Re: [Lxc-users] On clean shutdown of Ubuntu 10.04 containers

Brian K. White Mon, 06 Dec 2010 09:38:44 -0800

On 12/6/2010 2:42 AM, Trent W. Buck wrote:

This post describes my attempts to get "clean" shutdown of Ubuntu 10.04
containers.  The goal here is that a "shutdown -h now" of the dom0
should not result in a potentially inconsistent domU postgres database,
cf. a naive lxc-stop.


As at Ubuntu 10.04 with lxc 0.7.2, lxc-start detects that a container
has halted by 1) seeing a reboot event in <container>/var/run/utmp; or
2) seeing <container>'s PID 1 terminate.

Ubuntu 10.04 simply REQUIRES /var/run to be a tmpfs; this is hard-coded
into mountall's (upstart's) /lib/init/fstab.  Without it, the most
immediate issue is that /var/run/ifstate isn't reaped on reboot, ifup(8)
thinks lo (at least) is already configured, and the boot process hangs
waiting for the network.

Unfortunately, lxc 0.7's utmp detect requires /var/run to NOT be a
tmpfs.  The shipped lxc-ubuntu script works around this by deleting the
ifstate file and not mounting a tmpfs on /var/run, but to me that is
simply waiting for something else to assume /var/run is empty.  It also
doesn't cope with a mountall upgrade rewriting /lib/init/fstab.

More or less by accident, I discovered that I can tell lxc-start that
the container is ready to halt by "crashing" upstart:

     container# kill -SEGV 1

Likewise I can spoof a ctrl-alt-delete event in the container with:

     dom0# pkill -INT lxc-start

I automate the former signalling at the end of shutdowns thusly:

     chroot $template_dir dpkg-divert --quiet --rename /sbin/reboot
     chroot $template_dir tee >/dev/null /sbin/reboot <<-EOF
        #!/bin/bash
        while getopts nwdfiph opt
        do [[ f = \$opt ]] && exec kill -SEGV 1
        done
        exec -a "\$0" "\$0.distrib" "\$@"
        EOF
     chroot $template_dir chmod +x /sbin/reboot
     chroot $template_dir ln -s reboot.distrib /sbin/halt.distrib
     chroot $template_dir ln -s reboot.distrib /sbin/poweroff.distrib

I use the latter in my customized /etc/init.d/lxc stop rule.
Note that the lxc-wait's SHOULD be parallelized, but this is not
possible as at lxc 0.7.2 :-(


Sure it is.

I parallelize the shutdowns (in any version, including 0.7.2) by doing all the lxc-stop calls in parallel without looking or waiting, then in a separate following step running a loop that waits until no containers are running.
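
As a rough illustration of that two-phase idea (not the init script attached below, which signals each container's init instead), here is a minimal bash sketch. The names stop_one, is_running, stop_all_parallel and POLL_INTERVAL are made up for this example; the two hooks assume lxc-stop -n and lxc-info -n behave as on my 0.7.2 hosts, and can be replaced with whatever fits your setup.

```shell
#!/bin/bash
# Sketch only. stop_one and is_running are hypothetical hooks; replace them
# to match your own setup. They assume "lxc-info -n NAME" prints a line
# ending in RUNNING for a live container.
stop_one()   { lxc-stop -n "$1"; }
is_running() { lxc-info -n "$1" | grep -q RUNNING; }

stop_all_parallel() {
    local cn pending=1
    # Phase 1: ask every container to stop, without waiting on any of them.
    for cn in "$@"; do
        stop_one "$cn" &
    done
    # Phase 2: poll until no container reports RUNNING.
    while [ "$pending" -gt 0 ]; do
        pending=0
        for cn in "$@"; do
            is_running "$cn" && pending=$((pending + 1))
        done
        [ "$pending" -gt 0 ] && sleep "${POLL_INTERVAL:-2}"
    done
    # Reap the background stop jobs started in phase 1.
    wait
}
```

Something like "stop_all_parallel vps001 vps002 vps003" would then take about as long as the slowest container rather than the sum of all of them.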


Here is my openSUSE init.d/lxc:
https://build.opensuse.org/package/files?package=lxc&project=home:aljex
And the packages:
http://download.opensuse.org/repositories/home:/aljex/*/lxc-0.7.2*.rpm

It makes assumptions that are wrong for ubuntu and is more limited than you may want in terms of what it even tries to handle. But that's beside the point of parallel shutdowns.

* cgroup handling includes a particular stack of override logic for possible cgroup mount points that makes sense to me.

- start with built-in default /var/run/lxc/cgroup, and name it "lxc" so as not to conflict with any other cgroup setup by default.

- if you defined something in $LXC_CONF, prefer it over default

- if kernel is providing /sys/fs/cgroup automatically, prefer that over either default or $LXC_CONF

- if a cgroup named "lxc" is already mounted, prefer that over all else

* assumes lxc 0.7.2 because the script is part of a lxc-0.7.2 rpm

- removes the shutdown/reboot watchdog functions that were needed in 0.6.5 but are built in to 0.7.2 now.


* only starts containers that are defined by $LXC_ETC/*/config

* only shuts down containers that it started

* the stop function greps for /sbin/init in container inittab instead of trying to allow for any random container pid #1

* no provision for application/service containers, just whole systems started with /sbin/init


* starts containers in screen

- I have not figured out what it would take to get nice behavior out of lxc-console yet and screen is both easy and standard.

The $LXC_CONF (/etc/lxc/lxc.conf) referenced at the top usually does not exist, so everything that happens is visible right in the script.
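
The cgroup mount-point precedence listed above can be sketched as one small bash function. This is an illustration, not the code from the script: pick_cgroup_mount and its three arguments (a mounts file, the kernel-provided directory, and an optional mount point from $LXC_CONF) are hypothetical names introduced here so the logic can be tried against sample files instead of the live /proc/mounts. As in the script, any existing cgroup mount is used, and one named "lxc" wins over the others.

```shell
#!/bin/bash
# Sketch only. pick_cgroup_mount is a hypothetical helper, not part of lxc.
# Usage: pick_cgroup_mount MOUNTS_FILE SYSFS_DIR [CONF_MOUNT_POINT]
# Prints "<mount name> <mount point>".
pick_cgroup_mount() {
    local mounts=$1 sysfs=$2 conf=${3:-}
    local name=lxc point=/var/run/lxc/cgroup   # 1. built-in default
    local found
    [ -n "$conf" ] && point=$conf              # 2. value from $LXC_CONF
    if [ -d "$sysfs" ]; then                   # 3. kernel-provided directory
        point=$sysfs
        name=cgroup
    fi
    # 4. An already-mounted cgroup wins; stop at the first one named "lxc".
    found=$(awk '$3 == "cgroup" { n = $1; p = $2; if ($1 == "lxc") exit }
                 END { if (p != "") print n, p }' "$mounts")
    if [ -n "$found" ]; then
        name=${found%% *}
        point=${found#* }
    fi
    echo "$name $point"
}
```

On a live host the call would be something like "pick_cgroup_mount /proc/mounts /sys/fs/cgroup".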


I'm using this in production. So far so good.

typical usage:

nj10:~ # rclxc status

Checking for LXC containers...running

nj10:~ # rclxc list
Listing LXC containers...
'vps001' is RUNNING
'vps002' is RUNNING
'vps003' is RUNNING
'vps004' is RUNNING
'vps005' is RUNNING
'vps006' is RUNNING
'vps007' is RUNNING
'vps008' is RUNNING
'vps009' is RUNNING
'vps011' is RUNNING
'vps012' is RUNNING
'vps013' is RUNNING
nj10:~ # rclxc stop vps008

Shutting down LXC containers...done

nj10:~ # rclxc list
Listing LXC containers...
'vps001' is RUNNING
'vps002' is RUNNING
'vps003' is RUNNING
'vps004' is RUNNING
'vps005' is RUNNING
'vps006' is RUNNING
'vps007' is RUNNING
'vps008' is STOPPED
'vps009' is RUNNING
'vps011' is RUNNING
'vps012' is RUNNING
'vps013' is RUNNING
nj10:~ # rclxc status

Checking for LXC containers...running

nj10:~ # rclxc stop

Shutting down LXC containers...done

nj10:~ # rclxc status

Checking for LXC containers...unused

nj10:~ # rclxc list
Listing LXC containers...
'vps001' is STOPPED
'vps002' is STOPPED
'vps003' is STOPPED
'vps004' is STOPPED
'vps005' is STOPPED
'vps006' is STOPPED
'vps007' is STOPPED
'vps008' is STOPPED
'vps009' is STOPPED
'vps011' is STOPPED
'vps012' is STOPPED
'vps013' is STOPPED
nj10:~ # time rclxc start

Starting LXC containers...done


real    0m0.242s
user    0m0.012s
sys     0m0.000s
nj10:~ # rclxc list
Listing LXC containers...
'vps001' is RUNNING
'vps002' is RUNNING
'vps003' is RUNNING
'vps004' is RUNNING
'vps005' is RUNNING
'vps006' is RUNNING
'vps007' is RUNNING
'vps008' is RUNNING
'vps009' is RUNNING
'vps011' is RUNNING
'vps012' is RUNNING
'vps013' is RUNNING
nj10:~ # screen -r vps013

INIT: version 2.88 booting
INIT: Entering runlevel: 3
blogd: can not set console device to /dev/pts/34: Device or resource busy
Master Resource Control: previous runlevel: N, switching to runlevel:3
Initializing random number generator                                 done
Starting syslog services                                             done
Starting D-Bus daemon                                                done
No keyboard map to load
Loading compose table winkeys shiftctrl latin1.add                   done
Stop Unicode mode                                                    done
Setting up (localfs) network interfaces:
    lo
    lo        IP address: 127.0.0.1/8
              IP address: 127.0.0.2/8
    lo                                                               done
    eth0
    eth0      IP address: 71.187.206.90/24
    eth0                                                             done
Setting up service (localfs) network  .  .  .  .  .  .  .  .  .  .   done
Starting SSH daemon                                                  done
Loading CPUFreq modules (CPUFreq not supported)
Starting HAL daemon                                                  done
Setting up (remotefs) network interfaces:
Setting up service (remotefs) network  .  .  .  .  .  .  .  .  .  .  done
Re-Starting syslog services                                          done
Starting auditd The audit system is disabled
                                                                     done
Starting incron                                                      done
Starting mail service (Postfix)                                      done
Starting CRON daemon                                                 done
Starting rpcbind                                                     done
Starting rsync daemon                                                done
Starting smartd                                                      unused
Starting vsftpd                                                      done
Starting INET services. (xinetd)                                     done
Master Resource Control: runlevel 3 has been                         reached
Skipped services in runlevel 3:                            splash smartd

Welcome to openSUSE 11.3 "Teal" - Kernel 2.6.37-rc3-3-default (console).


nj10-013 login:

[detached]
nj10:~ # time rclxc stop

Shutting down LXC containers...done


real    0m8.537s
user    0m0.048s
sys     0m0.124s
nj10:~ # rclxc list
Listing LXC containers...
'vps001' is STOPPED
'vps002' is STOPPED
'vps003' is STOPPED
'vps004' is STOPPED
'vps005' is STOPPED
'vps006' is STOPPED
'vps007' is STOPPED
'vps008' is STOPPED
'vps009' is STOPPED
'vps011' is STOPPED
'vps012' is STOPPED
'vps013' is STOPPED
nj10:~ # screen -ls
No Sockets found in /var/run/screens/S-root.
nj10:~ # lxc-ps --lxc auxwww

CONTAINER USER PID %CPU %MEM VSZ RSS TTY STAT STARTTIME COMMAND

nj10:~ #


--
bkw

#!/bin/sh
# /etc/init.d/lxc
#   and its symbolic link
# /usr/sbin/rclxc
#
# System startup script for LXC containers.
# For lxc 0.7.2, which doesn't require an external monitor process to perform
# the lxc-stop when a container's init process requests init 0|1|6.
#
# 20101108 - Brian K. White - br...@aljex.com
#
### BEGIN INIT INFO
# Provides:          lxc
# Required-Start:    $ALL
# Should-Start:
# Required-Stop:     $ALL
# Should-Stop:
# Default-Start:     3 5
# Default-Stop:      0 1 2 6
# Short-Description: LXC Linux Containers
# Description:       Start/Stop LXC containers.

### END INIT INFO

. /etc/rc.status

LXC_ETC=/etc/lxc
LXC_SRV=/srv/lxc
CGROUP_MOUNT_POINT=/var/run/lxc/cgroup
CGROUP_MOUNT_NAME=lxc
CGROUP_MOUNTED=false
CGROUP_RELEASE_AGENT="/usr/sbin/lxc_cgroup_release_agent"
LXC_CONF=${LXC_ETC}/lxc.conf
[[ -s $LXC_CONF ]] && . $LXC_CONF

# Various possible overrides to cgroup mount point.
# If kernel supplies cgroup mount point, prefer it.
[[ -d /sys/fs/cgroup ]] && CGROUP_MOUNT_POINT=/sys/fs/cgroup CGROUP_MOUNT_NAME=cgroup
# If cgroup already mounted, use it no matter where it is.
# If multiple cgroup mounts, prefer the one named lxc if any.
eval `awk 'BEGIN{P="";N=""}END{print("cgmp="P" cgmn="N)}($3=="cgroup"){N=$1;P=$2;if($1=="lxc")exit}' /proc/mounts`
[[ "$cgmn" && "$cgmp" && -d "$cgmp" ]] && CGROUP_MOUNT_POINT=$cgmp CGROUP_MOUNT_NAME=$cgmn CGROUP_MOUNTED=true

lxcstrt () {
        $CGROUP_MOUNTED || {
                [[ -d $CGROUP_MOUNT_POINT ]] || mkdir -p $CGROUP_MOUNT_POINT
                mount -t cgroup $CGROUP_MOUNT_NAME $CGROUP_MOUNT_POINT
        }
        echo "$CGROUP_RELEASE_AGENT" >${CGROUP_MOUNT_POINT}/release_agent
        echo 1 >${CGROUP_MOUNT_POINT}/notify_on_release
        cd $LXC_ETC
        for CF in */config ; do
                CN=${CF%/*}
                [[ "${1:-$CN}" = "$CN" ]] || continue
                screen -dmS $CN lxc-start -f $CF -n $CN
        done
}

lxcstop () {
        typeset -i PID=0
        lxc-ps -C init -opid |while read CN PID ;do
                [[ $PID -gt 1 ]] || continue
                [[ "${1:-$CN}" = "$CN" ]] || continue
                grep -q 'p0::powerfail:/sbin/init 0' ${LXC_SRV}/${CN}/etc/inittab || continue
                kill -PWR $PID
        done
}

lxcstat () {
        typeset -i R=0
        cd $LXC_ETC
        for CF in */config ; do
                CN="${CF%/*}"
                [[ "${1:-$CN}" = "$CN" ]] || continue
                S=`lxc-info -n $CN`
                echo "$S"
                [[ "${S##* }" = "RUNNING" ]] && ((R++))
        done
        [[ $R -gt 0 ]] && return 0 || return 3
}

rc_reset

case "$1" in
        start)
                echo -n "Starting LXC containers..."
                lxcstrt $2
                rc_status -v
                ;;
        stop)
                echo -n "Shutting down LXC containers..."
                lxcstop $2
                while $0 status $2 >/dev/null 2>&1 ; do sleep 2 ; done
                rc_status -v
                ;;
        try-restart)
                $0 status && $0 restart || rc_reset
                rc_status
                ;;
        restart)
                $0 stop $2
                $0 start $2
                rc_status
                ;;
        status)
                echo -n "Checking for LXC containers..."
                lxcstat $2 >/dev/null 2>&1
                rc_status -v
                ;;
        info|list|show)
                echo "Listing LXC containers..."
                lxcstat $2
                ;;
        *)
                echo "Usage: $0 {start|stop|try-restart|restart|status|list} [container_name]"
                exit 1
                ;;
esac

rc_exit


