On 04. 06. 25 at 5:33, Chengen Du wrote:
Hi DM developers,

On Tue, May 13, 2025 at 10:35 PM Lennart Poettering
<lenn...@poettering.net> wrote:

On Di, 13.05.25 16:08, Chengen Du (chengen...@canonical.com) wrote:

Hi,

Apologies for including everyone in this message, but I’d like to bring
your attention to a fix [1], which may require your input.

As mentioned in my comments there: we can certainly enable the locking
stuff again for DM block devices too, but only if DM maintainers sign
off that this is OK. Hence ping the DM people about this, otherwise we
won't move on this.

To mitigate such issues, systemd-udevd normally acquires a LOCK_SH|LOCK_NB
using flock on the main block device before processing.
However, commit #e918a1b5a94f (udev: exclude device-mapper from block
device ownership event locking) disabled this behavior for device-mapper
devices, which appears to be the root cause of the boot hang with encrypted
swap.

iirc dm for some reason is allergic to us taking a bsd lock, because
they don't want us to hold an fd open while the udev rules run
(because bsd locking implies holding an fd open as long as the lock is
kept).

But only the DM people can shed some light on this. if they are fine
these days if we relax this then we can certainly cover their stuff
via the locking, too.

Apologies for reaching out again, but may I kindly ask for your input
on this issue?
Your assistance would be greatly appreciated to help move things forward.


Hi

We have looked over the issue, which most likely originates in uevents lost during the switch from initramfs to rootfs, and which could possibly be addressed by a new socket flag.

But anyway, let's look at the current locking mechanism.

For lvm2 to be able to 'deactivate' a DM device, the device must NOT be open, so taking a lock on an open descriptor in order to deactivate a DM device is likely not going to work.
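
To illustrate why the lock and the open descriptor cannot be separated, here is a minimal sketch. It is not taken from udev or lvm2: it uses Python's fcntl.flock, and an ordinary file is assumed to stand in for the block device. The helper name try_flock is hypothetical.

```python
import fcntl
import os


def try_flock(path, operation):
    """Open `path` and try a non-blocking BSD lock on it.

    Returns the open file descriptor on success, or None if a
    conflicting lock is held elsewhere. The lock lives exactly as
    long as the returned descriptor stays open.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # LOCK_NB makes flock() fail immediately instead of waiting.
        fcntl.flock(fd, operation | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

A holder that calls try_flock(path, fcntl.LOCK_SH) has to keep the returned descriptor open for as long as it wants the lock. Closing that last descriptor is also what drops the lock, so there is no way to keep the lock while the device is closed.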

lvm2 could, however, possibly be enhanced to at least grab these BSD locks when processing a PV; that does not look like a problematic part.

But adding BSD locks when processing DM devices (active LVs) does not look like a trivial task. There are DM devices which are 'private' to the DM stack itself: for a cached raid LV, for example, a single public DM device may have tens of 'private' DM devices associated with it in the device tree. lvm2 does not expect anyone to be using any of these devices, so any manipulation of the 'device stack tree' basically aborts when an unexpected user is there. (The public availability of these 'private' devices is, however, useful for various recovery and debugging purposes, so there is a very good reason all devices are present in the user's /dev/ directory, but an administrator should not blindly open them.)

To protect these private devices against udev access, we originally used some uevent flags. Those, however, were not 'permanent': if udev was restarted with a cleared database, all this info was lost (this was one of the reasons we asked for this DM exception in the past). Later on we added the UUID '-suffix' solution, but this does not yet 'decorate' all device types, and although we are now trying to add the suffixes, it's not a simple task. So some near-future version of lvm2 will likely be better. In that case, if this newer version of lvm2 is in the system and there is no access to any device with a UUID '-suffix' from the udev tool chain, we can possibly reconsider this DM exception and see whether we can make it work somehow.

Yet for the locking itself, I'd probably see a separate locking dir in /run as the more usable approach, since the case where a device needs to be 'removed/instantiated/....' cannot be 'lock protected' if the device itself must be held open.
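
As a rough sketch of what such a scheme could look like (purely an illustration, not an existing udev or lvm2 interface): the lock is taken on a small file in a dedicated directory instead of on the device node. The helper name device_lock, the major:minor file naming, and the lock_dir argument are all assumptions; any real directory under /run would have to be agreed on by every tool involved.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def device_lock(major, minor, lock_dir, exclusive=False):
    """Hold a BSD lock on a per-device lock file, not on the device.

    `lock_dir` is a hypothetical directory shared by all cooperating
    tools. The device node itself is never opened, so it can still be
    removed or re-created while the lock is held.
    """
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, f"{major}:{minor}")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
        # Raises BlockingIOError if a conflicting lock is already held.
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
        yield path
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

With something like this, a tool deactivating a device could hold device_lock(major, minor, lock_dir, exclusive=True) around the operation without keeping the DM device open. The obvious cost is that it only protects against tools that cooperate and use the same directory.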

But as a short-term solution, we would rather need to see the actual exact problem which seems to be caused by the missing locking, as it could possibly be something unrelated to this locking...


Regards

Zdenek
