Package: am-utils
Version: 6.2+rc20110530-3~mi
Severity: important
Tags: patch
Attached is a patch to fix a problem causing the am-utils init script
to get stuck when stopping, which typically means that the machine in
question fails to shut down or reboot.
The root cause of this problem seems to be improper ordering of the
actions required to bring down the automounter in a clean way.
In particular, what does /not/ work is this (from the original init
script):
> amq | awk '{print $1}' | tac | xargs -r -n 1 amq -u
> kill -s TERM "$pid"
> [ # wait 120 seconds for something to become umounted (pointlessly) ]
Here, we request a graceful unmount of all filesystems in question, but
an instant later we send SIGTERM. Evidently amd does not have time to
fulfill those unmount requests: once it has received SIGTERM, it ceases
to act on amq's unmount requests (the init script waits 120 seconds
anyway). Worse, after waiting that out, we finally reach:
> # Attempt to forcibly unmount the remaining filesystems
> [...]
> umount -l -f "$fs"
at which point we (and hence the shutdown) hang forever (at least for
NFS volumes, from what I can tell).
The attached patch fixes that issue by reordering the actions as follows:
1. Request (via amq -u) the umounting of all automounted filesystems
2. Wait (max N seconds) until everything has been umounted
3. /Then/ send SIGTERM
4. Wait (max M seconds) for the amd process to disappear
5. If it didn't disappear, try the last-resort measures, i.e. umount -f
   followed by kill -9, like the original init script does
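
For readers who do not want to read the diff, the five steps above can be
sketched roughly as follows. This is a minimal sketch, not the patch itself:
the helper names (wait_until, pending_mounts, unmount_pass, amd_gone,
stop_amd_sketch) are made up for illustration, and the timeouts are
placeholders for N and M (20 matches maxsecs in the patch, 30 is arbitrary).

```shell
#!/bin/sh
# Sketch only: every function name below is hypothetical and does not
# appear in the am-utils init script.

# Run "$@" once per second until it succeeds or $1 seconds have elapsed.
# Returns 0 on success, 1 on timeout.
wait_until() {
    limit=$1; shift
    elapsed=0
    while ! "$@"; do
        [ "$elapsed" -ge "$limit" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# List automounted filesystems in reverse amq order, skipping amd's own
# "root" and "toplvl" entries (same filter as in the patch).
pending_mounts() {
    amq | tac | awk '$2 != "root" && $2 != "toplvl" {print $1}'
}

# Ask amd to unmount whatever is left; succeed once nothing remains.
unmount_pass() {
    mounts=$(pending_mounts)
    [ -z "$mounts" ] && return 0
    for fs in $mounts; do
        amq -u "$fs"
    done
    return 1
}

# Succeeds once the process in $pid no longer exists.
amd_gone() {
    ! kill -s 0 "$pid" 2>/dev/null
}

stop_amd_sketch() {
    pid=$1
    wait_until 20 unmount_pass              # steps 1 and 2 (N = 20, assumed)
    kill -s TERM "$pid"                     # step 3
    wait_until 30 amd_gone && return 0      # step 4 (M = 30, assumed)
    # Step 5: last resort, forced lazy unmount followed by SIGKILL.
    pending_mounts | while read -r fs; do
        [ -d "$fs" ] && umount -l -f "$fs"
    done
    kill -9 "$pid"
}
```

The point of the sketch is only the ordering: SIGTERM is not sent until the
unmount requests have either completed or timed out.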
The attached patch has reliably fixed those issues for our 150-ish Debian
office computers, where it has been in production for roughly two months.
PS: Besides the main issue, two minor bugs were addressed in one go:
- hide the ``kill: no such process'' message we get on the console
  when probing for amd's existence with kill -s 0
- remove the ``returncode'' variable, which never worked because it
  is assigned in a subshell
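
To illustrate the second point: the snippet below is a stand-alone
demonstration, not code from the init script, and the paths /a and /b are
made up. It shows why an assignment inside a loop fed by a pipe does not
survive the loop, whereas the same loop fed by a here-document keeps it.

```shell
#!/bin/sh
# Illustration only; the paths and result variable names are hypothetical.

# Variant 1: the loop is the last element of a pipeline. dash and bash
# (with default settings) run it in a subshell, so the assignment is lost.
returncode=0
printf '%s\n' /a /b | while read -r fs; do
    returncode=1
done
piped_result=$returncode

# Variant 2: the loop reads from a here-document and therefore runs in
# the current shell, so the assignment is still visible afterwards.
returncode=0
while read -r fs; do
    returncode=1
done <<EOF
/a
/b
EOF
direct_result=$returncode

echo "pipeline: $piped_result, here-document: $direct_result"
```

Under dash (the /bin/sh on the machines in this report) or bash with default
settings, this prints "pipeline: 0, here-document: 1". Shells that run the
last pipeline element in the current shell, such as ksh, behave differently.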
One last note to avoid confusion: we are currently using the wheezy version
of am-utils (6.2) on Debian squeeze. This is due to an unrelated
issue with am-utils 6.1.5. The init scripts of both versions are nearly
identical anyway. (The local version suffix '~mi' in the 'Version' header
is there because we have already applied the proposed fix to the init script.)
Best Regards,
Timo Buhrmester,
Bonn University, Germany
-- System Information:
Debian Release: 6.0.8
APT prefers oldstable
APT policy: (500, 'oldstable')
Architecture: i386 (i686)
Kernel: Linux 3.2.51.wap (SMP w/4 CPU cores)
Locale: LANG=en_US.iso885915, LC_CTYPE=en_US.iso885915 (charmap=ISO-8859-15)
Shell: /bin/sh linked to /bin/dash
Versions of packages am-utils depends on:
ii debconf 1.5.36.1 Debian configuration management sy
ii debianutils 3.4 Miscellaneous utilities specific t
ii libamu4 6.2+rc20110530-3~mi Support library for amd the 4.4BSD
ii libc6 2.11.3-4 Embedded GNU C Library: Shared lib
ii libgdbm3 1.8.3-9 GNU dbm database routines (runtime
ii libhesiod0 3.0.2-20 Project Athena's DNS-based directo
ii libldap-2.4-2 2.4.23-7.3 OpenLDAP libraries
ii libwrap0 7.6.q-19 Wietse Venema's TCP wrappers libra
ii portmap 6.0.0-2 RPC port mapper
ii ucf 3.0025+nmu1 Update Configuration File: preserv
am-utils recommends no packages.
Versions of packages am-utils suggests:
pn am-utils-doc <none> (no description available)
pn nis <none> (no description available)
-- Configuration Files:
/etc/am-utils/amd.conf changed:
[global]
auto_dir = /amd
log_file = syslog
log_options = all,noinfo,nostats,nomap
restart_mounts = yes
selectors_in_defaults = yes
unmount_on_exit = yes
vendor = Debian
map_type = ldap
ldap_base = "ou=amd maps,dc=math,dc=uni-bonn,dc=de"
ldap_hostports = ldap.math.uni-bonn.de:389
ldap_proto_version = 3
[/home]
map_name = home
[/import/www]
map_name = www
[/import/home.ag]
map_name = home.ag
[/import/home.sek]
map_name = home.sek
-- debconf information:
am-utils/import-amd-failed:
am-utils/nis-master-map: amd.master
am-utils/clustername:
am-utils/nis-key: default
am-utils/rpc-localhost:
am-utils/map-others:
am-utils/nis-master-map-key-style: onekey
* am-utils/map-net: false
am-utils/use-nis: false
am-utils/nis-custom: echo "/amd-is-misconfigured /usr/share/am-utils/amd.net"
am-utils/import-amd-conf-done: false
am-utils/import-amd-conf: false
am-utils/map-home: false
am-utils/log-to-file:
--- debian/am-utils.init 2013-11-04 10:18:51.000000000 +0100
+++ /etc/init.d/am-utils 2013-09-13 15:18:13.000000000 +0200
@@ -121,33 +121,24 @@
# This function tries to kill an amd which hasn't exited nicely.
# This happens really easily, especially if using mount_type autofs
- # Get the currently mounted filesystems
- filesystems=`/bin/tempfile -s .amd`
- if [ $? -ne 0 ]; then
- return 1
- fi
-
- amq | awk '$2 !~ "root" {print $1}' > $filesystems
+ # Attempt to forcibly unmount the remaining filesystems
+ amq | awk '$2 !~ "root" {print $1}' | tac | while read fs; do
+ if [ -d "$fs" ]; then
+ sleep 1
+ umount -l -f "$fs"
+ fi
+ done
# Kill the daemon
kill -9 "$pid"
sleep 10
- if kill -s 0 $pid; then
- rm -f $filesystems
+ if kill -s 0 $pid 2>/dev/null; then
return 1
fi
- # Attempt to forcibly unmount the remaining filesystems
- returncode=0
- tac $filesystems | while read fs; do
- umount -l -f "$fs" || returncode=1
- sleep 1
- done
-
- rm -f $filesystems
- return $returncode
+ return 0
}
stop_amd() {
@@ -161,9 +152,33 @@
return
fi
fi
- echo -n "Requesting amd unmount filesystems: "
- amq | awk '{print $1}' | tac | xargs -r -n 1 amq -u
- echo " done."
+ echo -n "Requesting and waiting for amd unmount filesystems "
+
+ i=0
+ maxsecs=20
+
+ while [ $i -le $maxsecs ]; do
+ echo -n "."
+
+ empty=true
+ for f in $(amq | tac | awk '$2 != "root" && $2 != "toplvl" {print $1}'); do
+ empty=false
+ amq -u $f
+ done
+
+ if $empty; then break; fi
+
+ sleep 1
+ i="`expr $i + 1`"
+ [ $i -eq 15 ] && echo -n " [will wait `expr $maxsecs - $i` secs more] "
+ done
+
+ if $empty; then
+ echo " done."
+ else
+ echo " failed."
+ fi
+
echo -n "Stopping automounter: amd "
kill -s TERM "$pid"
# wait until amd has finished; this may take a little bit, and amd can't