Re: [systemd-devel] removing dm-crypt mapping hangs on shutdown
On Tue, 12.06.12 20:50, Dave Reisner (d...@falconindy.com) wrote:

> Hi,

Heya,

> I'm having some problems with a few of my VMs which all share in common
> an encrypted root. They pivot into a shutdown ramfs and walk through the
> blockdevs in sysfs, breaking down and unmounting each device.
> Universally, these VMs all hang when calling 'cryptsetup remove', and
> only under systemd. The relevant bits of stack trace look something like
> this:
>
>   ioctl(3, DM_DEV_REMOVE, 0xe1b8b0) = 0
>   semget(0xd4d255f, 1, 0) = 229376
>   semctl(229376, 0, GETVAL, 0x) = 2
>   semop(229376, {{0, -1, IPC_NOWAIT}}, 1) = 0
>   semop(229376, {{0, 0, 0}}, 1

Well, the DM libraries for some weird reason find it funny to use SysV
IPC for intra-thread locking. I think it's quite a bad choice to use SysV
IPC these days. (And it also uses threads in the first place, where it
really shouldn't.)

Anyway, most likely you have more than one thread running there and your
fg thread is waiting for the other one.

> I've found 2 solutions so far to avoiding this hang:
>
> 1) Not letting systemd get its paws on my block devices to umount them.
> Specifically, avoiding the dm_detach_all() call seems to alleviate this.

Well, devices needed for the root fs should stay unaffected by
dm_detach_all() [as DM_DEV_REMOVE should return EBUSY in that case] and
the other ones should have been shut down before we actually enter this
code. If that doesn't happen for you, there is something wrong with the
destruction of LUKS media.

> 2) Removing the mlockall() call in src/core/shutdown.c on line 351
> (according to git master as of this posting).

Uh? Not sure what that should change.

> So this, to me, raises 2 questions:
>
> 1) Can systemd just leave block devices which it didn't assemble alone?
> I'd expect that if the initramfs assembled and mounted a device, it
> should be responsible for tearing it down. In theory, this could just be
> something as simple as just skipping over / and /usr.

We generally don't try to work around problems, we fix them where they
are. The shutdown tool is mostly just intended as a last fallback if the
normal code didn't have the desired effect. It should leave the root fs
untouched...

> 2) What's the goal of calling mlockall() here? The original patch that
> added this (b1b2a107d15a) gives no indication of why it's wanted/needed.

It's merely an optimization to keep the shutdown process from being
swapped out.

Lennart

-- 
Lennart Poettering - Red Hat, Inc.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
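For readers who want to see why the last semop() in the trace never returns, here is a toy in-process model of the three semaphore calls. It is only a sketch: the class and function names are made up for illustration, it is not libdevmapper code, and the thread does not say who is expected to perform the second decrement.

```python
import threading


class CountingSemaphore:
    """Toy stand-in for one SysV semaphore (hypothetical, in-process only)."""

    def __init__(self, value):
        self._value = value
        self._cond = threading.Condition()

    def getval(self):
        # Models semctl(id, 0, GETVAL).
        with self._cond:
            return self._value

    def decrement_nowait(self):
        # Models semop(id, {{0, -1, IPC_NOWAIT}}, 1): fail instead of blocking.
        with self._cond:
            if self._value == 0:
                raise BlockingIOError("semaphore is already zero")
            self._value -= 1
            self._cond.notify_all()

    def wait_for_zero(self, timeout=None):
        # Models semop(id, {{0, 0, 0}}, 1): block until the value reaches 0.
        # Returns False if the timeout expires first.
        with self._cond:
            return self._cond.wait_for(lambda: self._value == 0, timeout)


def replay_trace(other_party_decrements):
    """Replay the traced calls; return True if the final wait completes."""
    sem = CountingSemaphore(2)      # GETVAL returned 2 in the trace
    sem.decrement_nowait()          # our own decrement succeeds, value is 1
    if other_party_decrements:
        # Someone else drops the value to 0 shortly afterwards.
        threading.Timer(0.05, sem.decrement_nowait).start()
    return sem.wait_for_zero(timeout=0.5)
```

With `replay_trace(True)` the wait completes; with `replay_trace(False)` it only ends because the model has a timeout, which the traced semop() call does not.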
Re: [systemd-devel] removing dm-crypt mapping hangs on shutdown
On Wed, 13.06.12 14:58, Colin Guthrie (gm...@colin.guthr.ie) wrote:

> I guess the initramfs should really write something in /run/initramfs
> folder that indicates which devices it's managing such that systemd can
> only deal with the remainder.

Well, the idea is that the normal cryptsetup logic detaches all devices
as part of normal shutdown, i.e. via cryptsetup@...service, way before
systemd-shutdown becomes PID 1. At that point systemd-shutdown will just
clean up what was forgotten before, and jump back into the initrd, which
is then responsible for the rootfs. The kernel reports EBUSY when a DM
device is still needed for the root fs, so this is all the flag we need.

To me it appears as if there are two things to fix:

a) For some reason the cryptsetup@...service shutdown logic didn't work
for you for non-root-fs disks.

b) libdevmapper sometimes hangs if somebody else invoked the kernel DM
ioctls directly.

a) of course doesn't apply if you are talking about an encrypted root.
b) is something to fix between the kernel and libdevmapper, not systemd.

Lennart

-- 
Lennart Poettering - Red Hat, Inc.
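The "EBUSY is all the flag we need" approach described above can be sketched as a loop that tries to remove every device and simply skips the ones the kernel refuses to let go. This is a minimal illustration, not systemd's code: `detach_all`, the `remove` callable, and `fake_remove` are hypothetical names, and real removal would go through the DM ioctls rather than a Python function.

```python
import errno


def detach_all(devices, remove):
    """Try to remove every device; treat EBUSY as "still needed, skip".

    `remove` is a hypothetical callable that raises OSError on failure.
    Returns a (detached, busy) pair of lists.
    """
    detached, busy = [], []
    for dev in devices:
        try:
            remove(dev)
        except OSError as exc:
            if exc.errno != errno.EBUSY:
                raise               # any other error is a real problem
            busy.append(dev)        # e.g. the device backing the root fs
        else:
            detached.append(dev)
    return detached, busy


def fake_remove(dev):
    """Stand-in for the kernel: only the device named "root" is in use."""
    if dev == "root":
        raise OSError(errno.EBUSY, "Device or resource busy")
```

No list of "owned" devices is kept anywhere in this scheme; whether a device may go away is decided by the kernel at the moment of the call.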
Re: [systemd-devel] removing dm-crypt mapping hangs on shutdown
On Wed, 13.06.12 10:36, Dave Reisner (d...@falconindy.com) wrote:

> > I don't think a simple solution would work here as there are other
> > cases that may trigger the initramfs to enable certain devices (e.g.
> > resume from an encrypted swap partition for example).
>
> Sure, there's already code to parse /proc/self/mountinfo and
> /proc/swaps. I don't know offhand how early this runs, but there's an
> opportunity to do some accounting here and mark already active devices
> as off limits for disassembly/unmount on shutdown.

Nope, we shouldn't design our systems to be that fragile. If some
resource is still needed, it should make sure to return EBUSY and refuse
destruction.

> Alternatively, rather than using a brute force approach, use something
> a little smarter which involves the holders attribute of any given
> block device in /sys/class/block. You can easily recurse down the chain
> of child devices and disassemble as the stack unwinds. In shell script,
> it looks something like this:

Nope. systemd-shutdown is just the last resort for stuff that hasn't
been shut down cleanly otherwise. It is supposed to be brute force.

Lennart

-- 
Lennart Poettering - Red Hat, Inc.
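For reference, the accounting idea quoted above (record which block devices are already active according to /proc/self/mountinfo and /proc/swaps) could look roughly like this. It is a sketch of the proposal only, not something systemd does; `devices_in_use` is a hypothetical name, and the function takes the file contents as strings so it can be tried without a live system.

```python
def devices_in_use(mountinfo_text, swaps_text):
    """Collect /dev/... paths that back a mount or an active swap area."""
    in_use = set()

    for line in mountinfo_text.splitlines():
        fields = line.split()
        # In mountinfo, a lone "-" ends the optional fields; the filesystem
        # type and the mount source follow it.
        if "-" not in fields:
            continue
        sep = fields.index("-")
        if len(fields) > sep + 2:
            source = fields[sep + 2]
            if source.startswith("/dev/"):
                in_use.add(source)

    # /proc/swaps starts with a header line; the first column is the path.
    for line in swaps_text.splitlines()[1:]:
        columns = line.split()
        if columns and columns[0].startswith("/dev/"):
            in_use.add(columns[0])

    return in_use
```

On a running system it would be called as `devices_in_use(open("/proc/self/mountinfo").read(), open("/proc/swaps").read())`, early enough that only the devices set up before that point are recorded.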
Re: [systemd-devel] removing dm-crypt mapping hangs on shutdown
On Tue, Jun 12, 2012 at 08:50:46PM -0400, Dave Reisner wrote:

> Hi,
>
> I'm having some problems with a few of my VMs which all share in common
> an encrypted root. They pivot into a shutdown ramfs and walk through the
> blockdevs in sysfs, breaking down and unmounting each device.
> Universally, these VMs all hang when calling 'cryptsetup remove', and
> only under systemd. The relevant bits of stack trace look something like
> this:
>
>   ioctl(3, DM_DEV_REMOVE, 0xe1b8b0) = 0
>   semget(0xd4d255f, 1, 0) = 229376
>   semctl(229376, 0, GETVAL, 0x) = 2
>   semop(229376, {{0, -1, IPC_NOWAIT}}, 1) = 0
>   semop(229376, {{0, 0, 0}}, 1
>
> I've found 2 solutions so far to avoiding this hang:
>
> 1) Not letting systemd get its paws on my block devices to umount them.
> Specifically, avoiding the dm_detach_all() call seems to alleviate this.
>
> 2) Removing the mlockall() call in src/core/shutdown.c on line 351
> (according to git master as of this posting).

Hmm, so as mysteriously as I thought this fixed my problem, it doesn't.
It does significantly lower the occurrence of the hangs, though. It's
more like a 1:10 ratio rather than a 2:3.

Regardless, I'd be more interested in finding a solution that involves
systemd not touching devices that it didn't assemble/mount.

> So this, to me, raises 2 questions:
>
> 1) Can systemd just leave block devices which it didn't assemble alone?
> I'd expect that if the initramfs assembled and mounted a device, it
> should be responsible for tearing it down. In theory, this could just be
> something as simple as just skipping over / and /usr.
>
> 2) What's the goal of calling mlockall() here? The original patch that
> added this (b1b2a107d15a) gives no indication of why it's wanted/needed.
>
> Cheers,
> Dave
Re: [systemd-devel] removing dm-crypt mapping hangs on shutdown
'Twas brillig, and Dave Reisner at 13/06/12 01:50 did gyre and gimble:

> Hi,
>
> I'm having some problems with a few of my VMs which all share in common
> an encrypted root. They pivot into a shutdown ramfs and walk through the
> blockdevs in sysfs, breaking down and unmounting each device.
> Universally, these VMs all hang when calling 'cryptsetup remove', and
> only under systemd. The relevant bits of stack trace look something like
> this:
>
>   ioctl(3, DM_DEV_REMOVE, 0xe1b8b0) = 0
>   semget(0xd4d255f, 1, 0) = 229376
>   semctl(229376, 0, GETVAL, 0x) = 2
>   semop(229376, {{0, -1, IPC_NOWAIT}}, 1) = 0
>   semop(229376, {{0, 0, 0}}, 1

I've seen symptoms similar to yours in bug reports: basically, hanging
after detaching dm devices.

> I've found 2 solutions so far to avoiding this hang:
>
> 1) Not letting systemd get its paws on my block devices to umount them.
> Specifically, avoiding the dm_detach_all() call seems to alleviate this.

Yeah, it wouldn't be right to avoid it completely, but I agree that it
does need to skip things a bit more robustly.

> So this, to me, raises 2 questions:
>
> 1) Can systemd just leave block devices which it didn't assemble alone?
> I'd expect that if the initramfs assembled and mounted a device, it
> should be responsible for tearing it down. In theory, this could just be
> something as simple as just skipping over / and /usr.

I don't think a simple solution would work here as there are other cases
that may trigger the initramfs to enable certain devices (e.g. resume
from an encrypted swap partition).

I guess the initramfs should really write something in the
/run/initramfs folder that indicates which devices it's managing, such
that systemd only deals with the remainder.

I'm not sure what the current status is, but I wonder what would happen
when systemd tries to unmount /usr just now... :s

> 2) What's the goal of calling mlockall() here? The original patch that
> added this (b1b2a107d15a) gives no indication of why it's wanted/needed.

No idea on this one.

Col

-- 
Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/
Re: [systemd-devel] removing dm-crypt mapping hangs on shutdown
On Wed, Jun 13, 2012 at 02:58:10PM +0100, Colin Guthrie wrote:

> 'Twas brillig, and Dave Reisner at 13/06/12 01:50 did gyre and gimble:
> > Hi,
> >
> > I'm having some problems with a few of my VMs which all share in
> > common an encrypted root. They pivot into a shutdown ramfs and walk
> > through the blockdevs in sysfs, breaking down and unmounting each
> > device. Universally, these VMs all hang when calling 'cryptsetup
> > remove', and only under systemd. The relevant bits of stack trace
> > look something like this:
> >
> >   ioctl(3, DM_DEV_REMOVE, 0xe1b8b0) = 0
> >   semget(0xd4d255f, 1, 0) = 229376
> >   semctl(229376, 0, GETVAL, 0x) = 2
> >   semop(229376, {{0, -1, IPC_NOWAIT}}, 1) = 0
> >   semop(229376, {{0, 0, 0}}, 1
>
> I've seen similar symptoms in bug reports to your experience. Basically
> hanging after detaching dm devices.

For whatever reason, I've only started seeing this recently, but I'm not
surprised that others are seeing it as well.

> > I've found 2 solutions so far to avoiding this hang:
> >
> > 1) Not letting systemd get its paws on my block devices to umount
> > them. Specifically, avoiding the dm_detach_all() call seems to
> > alleviate this.
>
> Yeah, it wouldn't be right to avoid it completely, but I agree that it
> does need to skip things a bit more robustly.
>
> > So this, to me, raises 2 questions:
> >
> > 1) Can systemd just leave block devices which it didn't assemble
> > alone? I'd expect that if the initramfs assembled and mounted a
> > device, it should be responsible for tearing it down. In theory, this
> > could just be something as simple as just skipping over / and /usr.
>
> I don't think a simple solution would work here as there are other
> cases that may trigger the initramfs to enable certain devices (e.g.
> resume from an encrypted swap partition for example).

Sure, there's already code to parse /proc/self/mountinfo and
/proc/swaps. I don't know offhand how early this runs, but there's an
opportunity to do some accounting here and mark already active devices
as off limits for disassembly/unmount on shutdown.

Alternatively, rather than using a brute force approach, use something a
little smarter which involves the holders attribute of any given block
device in /sys/class/block. You can easily recurse down the chain of
child devices and disassemble as the stack unwinds. In shell script, it
looks something like this:

http://projects.archlinux.org/mkinitcpio.git/tree/shutdown#n44

> I guess the initramfs should really write something in /run/initramfs
> folder that indicates which devices it's managing such that systemd can
> only deal with the remainder.
>
> I'm not sure what the current status is, but I wonder what would happen
> when systemd tries to unmount /usr just now... :s

Nothing interesting, just an EBUSY.

  Sending SIGTERM to remaining processes...
  Sending SIGKILL to remaining processes...
  Unmounting file systems.
  Unmounted /dev/mqueue.
  Unmounted /dev/hugepages.
  Unmounted /sys/kernel/debug.
  Could not unmount /usr: Device or resource busy
  Disabling swaps.
  Detaching loop devices.
  Detaching DM devices.
  Successfully changed into root pivot.
  Detaching loop devices.
  Unmounting all devices.
  Disassembling stacked devices.
  [ 28.580318] Power down.

> > 2) What's the goal of calling mlockall() here? The original patch
> > that added this (b1b2a107d15a) gives no indication of why it's
> > wanted/needed.
>
> No idea on this one.
>
> Col
>
> --
> Colin Guthrie
> gmane(at)colin.guthr.ie
> http://colin.guthr.ie/
>
> Day Job:
>   Tribalogic Limited http://www.tribalogic.net/
> Open Source:
>   Mageia Contributor http://www.mageia.org/
>   PulseAudio Hacker http://www.pulseaudio.org/
>   Trac Hacker http://trac.edgewall.org/
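The holders-based walk described in this message can also be sketched outside of shell. The function below only computes the order in which devices would be taken apart (topmost holder first); it does not remove anything. It is an illustration under assumptions, not a port of the linked mkinitcpio script: `teardown_order` is a made-up name, and the sysfs root is a parameter so the logic can be tried against a scratch directory.

```python
import os


def teardown_order(sysfs_block, dev, seen=None):
    """Return `dev` and everything stacked on it, topmost device first.

    Each <sysfs_block>/<dev>/holders/ directory lists the devices that
    sit directly on top of <dev>. Recursing into the holders before
    emitting <dev> itself gives "disassemble as the stack unwinds".
    """
    if seen is None:
        seen = set()
    order = []
    holders_dir = os.path.join(sysfs_block, dev, "holders")
    holders = sorted(os.listdir(holders_dir)) if os.path.isdir(holders_dir) else []
    for holder in holders:
        if holder not in seen:      # a holder may sit on several devices
            order.extend(teardown_order(sysfs_block, holder, seen))
    seen.add(dev)
    order.append(dev)
    return order
```

On a live system the call would look like `teardown_order("/sys/class/block", "sda2")`; the device name is only an example.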
[systemd-devel] removing dm-crypt mapping hangs on shutdown
Hi,

I'm having some problems with a few of my VMs which all share in common
an encrypted root. They pivot into a shutdown ramfs and walk through the
blockdevs in sysfs, breaking down and unmounting each device.
Universally, these VMs all hang when calling 'cryptsetup remove', and
only under systemd. The relevant bits of stack trace look something like
this:

  ioctl(3, DM_DEV_REMOVE, 0xe1b8b0) = 0
  semget(0xd4d255f, 1, 0) = 229376
  semctl(229376, 0, GETVAL, 0x) = 2
  semop(229376, {{0, -1, IPC_NOWAIT}}, 1) = 0
  semop(229376, {{0, 0, 0}}, 1

I've found 2 solutions so far to avoiding this hang:

1) Not letting systemd get its paws on my block devices to umount them.
Specifically, avoiding the dm_detach_all() call seems to alleviate this.

2) Removing the mlockall() call in src/core/shutdown.c on line 351
(according to git master as of this posting).

So this, to me, raises 2 questions:

1) Can systemd just leave block devices which it didn't assemble alone?
I'd expect that if the initramfs assembled and mounted a device, it
should be responsible for tearing it down. In theory, this could just be
something as simple as just skipping over / and /usr.

2) What's the goal of calling mlockall() here? The original patch that
added this (b1b2a107d15a) gives no indication of why it's wanted/needed.

Cheers,
Dave