Re: [systemd-devel] Delaying VM startup until block devices are available
On 27.01.2024 00:40, Orion Poplawski wrote:
> On 1/26/24 01:21, Lennart Poettering wrote:
>> On Do, 25.01.24 16:28, Orion Poplawski (or...@nwra.com) wrote:
>>
>>> We have various VMs that are backed by LUKS-encrypted LVs. At boot the
>>> volumes are decrypted by clevis. The problem we are seeing at the moment
>>> is that the VMs are started before the block devices are decrypted. Our
>>> current solution is:
>>
>> We generally wait for all devices listed in /etc/crypttab, unless you
>> set noauto or nofail.
>
> We are setting 'nofail', because I don't think I want to fail the boot in
> general. They are not required for the system itself to function, just
> certain VMs. e.g:
>
> luks-backup /dev/vg_root/backup-raw none discard,_netdev,nofail
>
> See below for more though.
>
>>> # cat /etc/systemd/system/virtqemud.service.d/override.conf
>>> [Unit]
>>> After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
>>>
>>> Where we list each of the volumes to be decrypted as blocking the
>>> virtqemud service.
>>>
>>> Does anyone have any better alternatives? My main issue is that it feels
>>> somewhere in between fine-grained and coarse-grained control.
>>>
>>> Ideally I think one would be able to have each individual VM startup
>>> automatically delayed until the devices each used became available, but I
>>> don't see how to do this.
>>
>> I am not sure how libvirt works, but if it runs every VM in a systemd
>> unit, then you could just order the device before that unit, or the
>> unit after the device.
>>
>> Really depends on how libvirt splits things up.
>
> I'm honestly not sure how libvirt works here either. But there seems to be
> this:
>
> # rpm -qf /usr/lib/systemd/system/virtqemud.service
> libvirt-daemon-driver-qemu-9.5.0-7.el9_3.alma.2.x86_64
>
> which gets started:
>
> Jan 25 14:42:58 systemd[1]: Starting Virtualization qemu daemon...
> Jan 25 14:42:58 systemd[1]: Started Virtualization qemu daemon.
>
> Then the qemu-kvm processes end up in their own scope:
>
> ● machine-qemu\x2d1\x2dsrv\x2dmry01.scope - Virtual Machine qemu-1-srv-mry01
>      Loaded: loaded (/run/systemd/transient/machine-qemu\x2d1\x2dsrv\x2dmry01.scope; transient)
>   Transient: yes
>      Active: active (running) since Thu 2024-01-25 14:42:58 PST; 22h ago
>       Tasks: 6 (limit: 16384)
>      Memory: 15.6G
>         CPU: 1h 15min 44.863s
>      CGroup: /machine.slice/machine-qemu\x2d1\x2dsrv\x2dmry01.scope
>              └─libvirt
>                └─9086 /usr/libexec/qemu-kvm -name guest=...
>
>>> Alternatively it seems like one should be able to delay all VM startup
>>> until all volumes in /etc/crypttab were unlocked, rather than having to
>>> specify each one. But I don't see a target for that.
>>
>> This is default behaviour. Anything listed in /etc/crypttab is ordered
>> before cryptsetup.target, which is ordered before sysinit.target,
>> which is ordered before basic.target, which is ordered before regular
>> services.
>
> We are specifying _netdev because they require the network to unlock. This
> I think puts them under remote-cryptsetup.target, and I used to depend on
> that. But with EL9 I'm seeing:
>
> # j -b -u remote-cryptsetup.target -u 'blockdev@dev-mapper-luks\x2dbackup.target' -u clevis-luks-askpass.service --no-hostname
> Jan 25 14:42:12 systemd[1]: Reached target Remote Encrypted Volumes.
> Jan 25 14:42:12 systemd[1]: Started Forward Password Requests to Clevis.
> Jan 25 14:42:48 clevis-luks-askpass[1706]: Unlocked /dev/vg_root/backup-raw (UUID=d6d25a85-2d43-4780-a312-e0e9b2383807) successfully
> Jan 25 14:42:54 systemd[1]: Reached target Block Device Preparation for /dev/mapper/luks-backup.
> Jan 25 14:42:59 systemd[1]: clevis-luks-askpass.service: Deactivated successfully.
>
> # systemctl list-dependencies remote-cryptsetup.target
> remote-cryptsetup.target
> ● ├─systemd-cryptsetup@luks\x2dbackup.service
>
> # j --no-hostname -b -u 'systemd-cryptsetup@luks\x2dbackup.service'
> Jan 25 14:42:12 systemd[1]: Starting Cryptography Setup for luks-backup...
> Jan 25 14:42:42 systemd-cryptsetup[1697]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/vg_root/backup-raw.
> Jan 25 14:42:47 systemd-cryptsetup[1697]: Failed to activate with specified passphrase. (Passphrase incorrect?)
> Jan 25 14:42:48 systemd-cryptsetup[1697]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/vg_root/backup-raw.
> Jan 25 14:42:54 systemd[1]: Finished Cryptography Setup for luks-backup.
>
> # systemctl show 'systemd-cryptsetup@luks\x2dbackup.service' | grep Type
> Type=oneshot
>
> So, if I'm following things correctly, this doesn't seem right.
> remote-cryptsetup.target depends on
> systemd-cryptsetup@luks\x2dbackup.service. This is a oneshot that is
> considered started after the main process exits, and above is shown as
> 14:42:54. But we are seeing 'Reached target Remote Encrypted Volumes' at
> 14:42:12. What am I missing?
>
> systemd-252-18.el9.x86_64

"nofail" encrypted devices are not ordered before
(remote-)cryptsetup.target, so as not to delay startup. The reasoning is, if
you do not care whether this device exists or not, there is no reason to
Re: [systemd-devel] Delaying VM startup until block devices are available
On 1/26/24 01:21, Lennart Poettering wrote:
> On Do, 25.01.24 16:28, Orion Poplawski (or...@nwra.com) wrote:
>
>> We have various VMs that are backed by LUKS-encrypted LVs. At boot the
>> volumes are decrypted by clevis. The problem we are seeing at the moment
>> is that the VMs are started before the block devices are decrypted. Our
>> current solution is:
>
> We generally wait for all devices listed in /etc/crypttab, unless you
> set noauto or nofail.

We are setting 'nofail', because I don't think I want to fail the boot in
general. They are not required for the system itself to function, just
certain VMs. e.g:

luks-backup /dev/vg_root/backup-raw none discard,_netdev,nofail

See below for more though.

>> # cat /etc/systemd/system/virtqemud.service.d/override.conf
>> [Unit]
>> After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
>>
>> Where we list each of the volumes to be decrypted as blocking the
>> virtqemud service.
>>
>> Does anyone have any better alternatives? My main issue is that it feels
>> somewhere in between fine-grained and coarse-grained control.
>>
>> Ideally I think one would be able to have each individual VM startup
>> automatically delayed until the devices each used became available, but I
>> don't see how to do this.
>
> I am not sure how libvirt works, but if it runs every VM in a systemd
> unit, then you could just order the device before that unit, or the
> unit after the device.
>
> Really depends on how libvirt splits things up.

I'm honestly not sure how libvirt works here either. But there seems to be
this:

# rpm -qf /usr/lib/systemd/system/virtqemud.service
libvirt-daemon-driver-qemu-9.5.0-7.el9_3.alma.2.x86_64

which gets started:

Jan 25 14:42:58 systemd[1]: Starting Virtualization qemu daemon...
Jan 25 14:42:58 systemd[1]: Started Virtualization qemu daemon.

Then the qemu-kvm processes end up in their own scope:

● machine-qemu\x2d1\x2dsrv\x2dmry01.scope - Virtual Machine qemu-1-srv-mry01
     Loaded: loaded (/run/systemd/transient/machine-qemu\x2d1\x2dsrv\x2dmry01.scope; transient)
  Transient: yes
     Active: active (running) since Thu 2024-01-25 14:42:58 PST; 22h ago
      Tasks: 6 (limit: 16384)
     Memory: 15.6G
        CPU: 1h 15min 44.863s
     CGroup: /machine.slice/machine-qemu\x2d1\x2dsrv\x2dmry01.scope
             └─libvirt
               └─9086 /usr/libexec/qemu-kvm -name guest=...

>> Alternatively it seems like one should be able to delay all VM startup
>> until all volumes in /etc/crypttab were unlocked, rather than having to
>> specify each one. But I don't see a target for that.
>
> This is default behaviour. Anything listed in /etc/crypttab is ordered
> before cryptsetup.target, which is ordered before sysinit.target,
> which is ordered before basic.target, which is ordered before regular
> services.

We are specifying _netdev because they require the network to unlock. This I
think puts them under remote-cryptsetup.target, and I used to depend on
that. But with EL9 I'm seeing:

# j -b -u remote-cryptsetup.target -u 'blockdev@dev-mapper-luks\x2dbackup.target' -u clevis-luks-askpass.service --no-hostname
Jan 25 14:42:12 systemd[1]: Reached target Remote Encrypted Volumes.
Jan 25 14:42:12 systemd[1]: Started Forward Password Requests to Clevis.
Jan 25 14:42:48 clevis-luks-askpass[1706]: Unlocked /dev/vg_root/backup-raw (UUID=d6d25a85-2d43-4780-a312-e0e9b2383807) successfully
Jan 25 14:42:54 systemd[1]: Reached target Block Device Preparation for /dev/mapper/luks-backup.
Jan 25 14:42:59 systemd[1]: clevis-luks-askpass.service: Deactivated successfully.

# systemctl list-dependencies remote-cryptsetup.target
remote-cryptsetup.target
● ├─systemd-cryptsetup@luks\x2dbackup.service

# j --no-hostname -b -u 'systemd-cryptsetup@luks\x2dbackup.service'
Jan 25 14:42:12 systemd[1]: Starting Cryptography Setup for luks-backup...
Jan 25 14:42:42 systemd-cryptsetup[1697]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/vg_root/backup-raw.
Jan 25 14:42:47 systemd-cryptsetup[1697]: Failed to activate with specified passphrase. (Passphrase incorrect?)
Jan 25 14:42:48 systemd-cryptsetup[1697]: Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/vg_root/backup-raw.
Jan 25 14:42:54 systemd[1]: Finished Cryptography Setup for luks-backup.

# systemctl show 'systemd-cryptsetup@luks\x2dbackup.service' | grep Type
Type=oneshot

So, if I'm following things correctly, this doesn't seem right.
remote-cryptsetup.target depends on systemd-cryptsetup@luks\x2dbackup.service.
This is a oneshot that is considered started after the main process exits,
and above is shown as 14:42:54. But we are seeing 'Reached target Remote
Encrypted Volumes' at 14:42:12. What am I missing?

systemd-252-18.el9.x86_64

--
Orion Poplawski
he/him/his - surely the least important thing about me
Manager of IT Systems                      720-772-5637
NWRA, Boulder/CoRA Office
Re: [systemd-devel] Delaying VM startup until block devices are available
On Do, 25.01.24 16:28, Orion Poplawski (or...@nwra.com) wrote:

> We have various VMs that are backed by LUKS-encrypted LVs. At boot the
> volumes are decrypted by clevis. The problem we are seeing at the moment
> is that the VMs are started before the block devices are decrypted. Our
> current solution is:

We generally wait for all devices listed in /etc/crypttab, unless you
set noauto or nofail.

> # cat /etc/systemd/system/virtqemud.service.d/override.conf
> [Unit]
> After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
>
> Where we list each of the volumes to be decrypted as blocking the
> virtqemud service.
>
> Does anyone have any better alternatives? My main issue is that it feels
> somewhere in between fine-grained and coarse-grained control.
>
> Ideally I think one would be able to have each individual VM startup
> automatically delayed until the devices each used became available, but I
> don't see how to do this.

I am not sure how libvirt works, but if it runs every VM in a systemd
unit, then you could just order the device before that unit, or the
unit after the device.

Really depends on how libvirt splits things up.

> Alternatively it seems like one should be able to delay all VM startup
> until all volumes in /etc/crypttab were unlocked, rather than having to
> specify each one. But I don't see a target for that.

This is default behaviour. Anything listed in /etc/crypttab is ordered
before cryptsetup.target, which is ordered before sysinit.target,
which is ordered before basic.target, which is ordered before regular
services.

Lennart

--
Lennart Poettering, Berlin
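[Editorial sketch, not part of the original message.] The rule stated above (entries are waited for unless they carry noauto or nofail) can be checked against a crypttab with a few lines of Python. This is a minimal sketch, not systemd's own logic: the function names `parse_crypttab` and `boot_ordering` are made up for illustration, the sample line is the crypttab entry quoted elsewhere in this thread, and the mapping of `_netdev` to remote-cryptsetup.target follows what the thread itself reports.

```python
def parse_crypttab(text):
    """Yield (name, device, options) for each entry in crypttab-formatted text."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed entry: a name and a device are required
        # Field 3 is the key file, field 4 the comma-separated option list.
        options = set(fields[3].split(",")) if len(fields) > 3 else set()
        yield fields[0], fields[1], options


def boot_ordering(options):
    """Return (target, waited_for) for one entry's option set.

    Simplified model of the behaviour described in this thread: _netdev
    selects remote-cryptsetup.target, and noauto or nofail mean that boot
    does not wait for the device.
    """
    target = "remote-cryptsetup.target" if "_netdev" in options else "cryptsetup.target"
    waited_for = not ({"noauto", "nofail"} & options)
    return target, waited_for


example = "luks-backup /dev/vg_root/backup-raw none discard,_netdev,nofail\n"
for name, device, options in parse_crypttab(example):
    target, waited = boot_ordering(options)
    print(f"{name}: {target}, boot waits: {waited}")
```

For the sample entry this reports remote-cryptsetup.target with no waiting, which matches the journal timestamps shown in the thread: the target is reached long before the volume is unlocked.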
Re: [systemd-devel] Delaying VM startup until block devices are available
On Fri, Jan 26, 2024 at 2:29 AM Orion Poplawski wrote:
>
> We have various VMs that are backed by LUKS-encrypted LVs. At boot the
> volumes are decrypted by clevis. The problem we are seeing at the moment
> is that the VMs are started before the block devices are decrypted. Our
> current solution is:
>
> # cat /etc/systemd/system/virtqemud.service.d/override.conf
> [Unit]
> After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
>

This only works if it is guaranteed that the start job for
blockdev@xxx.target is already queued when the start of virtqemud.service is
requested. In practice, systemd-cryptsetup is invoked early, before any
"normal" service, so it appears to work. But to be on the safe side you
probably need

Wants=systemd-cryptsetup@backup.service

or whatever service is used to decrypt the device.

> Where we list each of the volumes to be decrypted as blocking the
> virtqemud service.
>
> Does anyone have any better alternatives? My main issue is that it feels
> somewhere in between fine-grained and coarse-grained control.
>
> Ideally I think one would be able to have each individual VM startup
> automatically delayed until the devices each used became available, but I
> don't see how to do this.
>

Create a systemd generator that parses the VM configuration(s) and adds
those requirements on startup.

> Alternatively it seems like one should be able to delay all VM startup
> until all volumes in /etc/crypttab were unlocked, rather than having to
> specify each one. But I don't see a target for that.
>

As long as all entries in /etc/crypttab are auto and are not nofail, they
are ordered before cryptsetup.target, which itself is ordered before
sysinit.target. So any normal service should start only after all
systemd-cryptsetup@xxx.service units have completed.

After=blockdev@... is more relevant for shutdown, to ensure that
applications requiring this block device are shut down before
systemd-cryptsetup@.service is stopped.

I do not know how clevis hooks into all of this. Does it use
systemd-cryptsetup@.service at all?
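[Editorial sketch, not part of the original message.] The generator idea above, combined with the Wants= suggestion, might look roughly like the following. Everything here is an assumption made for illustration: that persistent domain definitions live as XML under /etc/libvirt/qemu, that block disks appear as `<disk type='block'><source dev=.../>`, that the /dev/mapper name equals the crypttab name, the drop-in file name `50-crypt-devices.conf`, and that a Python interpreter is usable at the early point where generators run. The `escape` helper is a simplified reimplementation of what `systemd-escape` does, not a call into systemd.

```python
#!/usr/bin/env python3
import glob
import os
import sys
import xml.etree.ElementTree as ET


def escape(name):
    """Simplified unit-name escaping: '/' becomes '-', unsafe bytes become \\xXX."""
    out = []
    for i, byte in enumerate(name.encode()):
        ch = chr(byte)
        if ch == "/":
            out.append("-")
        elif (ch.isascii() and ch.isalnum()) or ch in ":_" or (ch == "." and i > 0):
            out.append(ch)
        else:
            out.append(f"\\x{byte:02x}")
    return "".join(out)


def read_text(path):
    """Return the file's content, or an empty string if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


def crypttab_names(text):
    """Collect the volume names (first field) from crypttab-formatted text."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields and not fields[0].startswith("#"):
            names.add(fields[0])
    return names


def block_devices(domain_xml):
    """List the host block devices referenced by one libvirt domain XML."""
    try:
        root = ET.fromstring(domain_xml)
    except ET.ParseError:
        return []
    path = "./devices/disk[@type='block']/source[@dev]"
    return [src.get("dev") for src in root.iterfind(path)]


def dropin_for(devices, names):
    """Build drop-in text ordering a unit after each crypttab-backed device."""
    lines = ["[Unit]"]
    for dev in sorted(set(devices)):
        name = os.path.basename(dev)
        if os.path.dirname(dev) != "/dev/mapper" or name not in names:
            continue  # not a volume that crypttab sets up
        unit = f"systemd-cryptsetup@{escape(name)}.service"
        target = f"blockdev@{escape(dev.strip('/'))}.target"
        lines.append(f"Wants={unit}")
        lines.append(f"After={unit} {target}")
    return "\n".join(lines) + "\n"


def main(argv):
    normal_dir = argv[1]  # generators are passed three output directories
    names = crypttab_names(read_text("/etc/crypttab"))
    devices = []
    for xml_path in glob.glob("/etc/libvirt/qemu/*.xml"):  # assumed location
        devices += block_devices(read_text(xml_path))
    dropin_dir = os.path.join(normal_dir, "virtqemud.service.d")
    os.makedirs(dropin_dir, exist_ok=True)
    with open(os.path.join(dropin_dir, "50-crypt-devices.conf"), "w") as f:
        f.write(dropin_for(devices, names))


if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv)
```

Note that this still orders virtqemud.service as a whole, not individual VMs, so it automates the hand-written override rather than giving per-VM control.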
Re: [systemd-devel] Delaying VM startup until block devices are available
On Fri, Jan 26, 2024 at 1:29 AM Orion Poplawski wrote:

> We have various VMs that are backed by LUKS-encrypted LVs. At boot the
> volumes are decrypted by clevis. The problem we are seeing at the moment
> is that the VMs are started before the block devices are decrypted. Our
> current solution is:
>
> # cat /etc/systemd/system/virtqemud.service.d/override.conf
> [Unit]
> After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target
>
> Where we list each of the volumes to be decrypted as blocking the
> virtqemud service.
>
> Does anyone have any better alternatives? My main issue is that it feels
> somewhere in between fine-grained and coarse-grained control.
>
> Ideally I think one would be able to have each individual VM startup
> automatically delayed until the devices each used became available, but I
> don't see how to do this.
>

You can't really do this with systemd if it's not systemd that does the
startup... The libvirt daemons need to be patched to watch udev events and
wait for the devices they require.

> Alternatively it seems like one should be able to delay all VM startup
> until all volumes in /etc/crypttab were unlocked, rather than having to
> specify each one. But I don't see a target for that.
>

If this were plain systemd-cryptsetup, you could add a drop-in for
"systemd-cryptsetup@.service" that adds Before=foo.target. I'm not sure if
clevis integrates with that.

(Although honestly I don't see much point in using clevis for data volumes
at all – just use it for the rootfs, and regular keyfiles in /etc/private
for everything else...)

--
Mantas Mikulėnas
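[Editorial sketch, not part of the original message.] The "wait for the devices they require" idea can be illustrated outside libvirt with a small helper. This is only a sketch under loud assumptions: it polls for device nodes instead of subscribing to udev events, which is a deliberate simplification of what is proposed above, and the function name, the timeout values and the injectable `exists`, `sleep` and `clock` parameters are all hypothetical. Nothing in this thread says libvirt offers such a hook.

```python
import os
import time


def wait_for_devices(paths, timeout=120.0, interval=1.0,
                     exists=os.path.exists, sleep=time.sleep,
                     clock=time.monotonic):
    """Poll until every path exists or the timeout expires.

    Returns the list of paths that are still missing (empty on success).
    """
    deadline = clock() + timeout
    missing = [p for p in paths if not exists(p)]
    while missing and clock() < deadline:
        sleep(interval)
        # Re-check only the paths that had not appeared yet.
        missing = [p for p in missing if not exists(p)]
    return missing


if __name__ == "__main__":
    # Device paths taken from the unit names in this thread.
    wanted = ["/dev/mapper/luks-backup", "/dev/mapper/luks-vm-01-disk0"]
    still_missing = wait_for_devices(wanted, timeout=0)
    print("missing:", still_missing)
```

A caller would start a VM only once the returned list is empty. Where such a check would run is left open here, since per-VM startup is handled by libvirt rather than by systemd.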
[systemd-devel] Delaying VM startup until block devices are available
We have various VMs that are backed by LUKS-encrypted LVs. At boot the
volumes are decrypted by clevis. The problem we are seeing at the moment is
that the VMs are started before the block devices are decrypted. Our current
solution is:

# cat /etc/systemd/system/virtqemud.service.d/override.conf
[Unit]
After=blockdev@dev-mapper-luks\x2dbackup.target blockdev@dev-mapper-luks\x2dvm\x2d01\x2ddisk0.target

Where we list each of the volumes to be decrypted as blocking the virtqemud
service.

Does anyone have any better alternatives? My main issue is that it feels
somewhere in between fine-grained and coarse-grained control.

Ideally I think one would be able to have each individual VM startup
automatically delayed until the devices each used became available, but I
don't see how to do this.

Alternatively it seems like one should be able to delay all VM startup until
all volumes in /etc/crypttab were unlocked, rather than having to specify
each one. But I don't see a target for that.

Thank you for your consideration,
Orion

--
Orion Poplawski
he/him/his - surely the least important thing about me
Manager of IT Systems                      720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       or...@nwra.com
Boulder, CO 80301                 https://www.nwra.com/