On 10/06/2025 17:03, Michael Köppl wrote:
> Tested this by setting up an LVM volume group on a shared LUN (via
> iSCSI) according to the Proxmox multipath docs [0]. Replicated the steps
> to reproduce given in Friedrich's comment in the bug ticket on a 2-node
> cluster and was able to reproduce the original problems. With the
> patches applied, the following scenario worked as I would expect it to:
> - Created VM with ID 100 on shared LVM on node 1
> - Restarted node 2
> - Deleted VM 100
> - Created VM 100 on node 2
> 
> Checked that:
> - Device mapper device is not created on node 2 during the reboot
> (`dmsetup ls | grep 100` shows no results)
> - Creation of new VM 100 on node 2 is successful
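[Annotation: the device-mapper check above can be scripted roughly as follows.
This is a read-only sketch; the naming pattern is only an assumption mirroring
the `dmsetup ls | grep 100` check from the report.]

```shell
# Read-only sketch of the check above: after rebooting node 2, no
# device-mapper node for VM 100's LVs should exist on it.
check_no_dm_device() {
    vmid="$1"
    # Active guest LVs typically show up in `dmsetup ls` as
    # <vg>-vm--<vmid>--disk--<n>; no grep match means the LV was
    # not autoactivated during boot.
    if dmsetup ls 2>/dev/null | grep -q -- "$vmid"; then
        echo "dm device matching $vmid present (LV was activated)"
        return 1
    fi
    echo "no dm device matching $vmid (LV was not autoactivated)"
}

check_no_dm_device 100
```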
> 
> Additionally checked the migration scenario on the same 2-node cluster:
> - Created VM with ID 100 on shared LVM on node 1
> - Rebooted node 2
> - Created backup of VM 100
> - Destroyed VM 100
> - Restored VM 100 from backup on node 1
> - Started VM 100
> - Live migrated VM 100 from node 1 to node 2
> 
> Checked that:
> - Again, device mapper device is not created on reboot of node 2
> - After restoring VM 100 on node 1 and starting it, the inactive LV
> exists on node 2
> - VM 100 is successfully live-migrated to node 2, the previously
> inactive LV is set to active
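[Annotation: the activation-state check can be done with a plain `lvs` call;
`lv_active` is a standard report field. The VG name `shared_vg` below is just
a placeholder for the shared VG used in the test.]

```shell
# Read-only sketch: report the activation state of the guest LVs in the
# shared VG. An inactive LV shows an empty lv_active column until the
# live migration activates it on the target node.
vg="shared_vg"   # placeholder VG name
lvs --noheadings -o lv_name,lv_active "$vg" 2>/dev/null \
    || echo "VG $vg not found on this node"
```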
> 
> Also ran the pve8to9 script:
> - The script correctly detected that the LVM storage contained guest
> volumes with autoactivation enabled
> - After running `pve8to9 updatelvm`, lvs reported autoactivation
> disabled for all volumes on the node
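[Annotation: a sketch of how the post-`updatelvm` state can be verified. The
`autoactivation` report field name is assumed from recent LVM versions; it is
empty once the flag is disabled, so the command prints nothing when all LVs
are updated.]

```shell
# Sketch: list LVs that still have autoactivation enabled after running
# `pve8to9 updatelvm`. With the flag disabled everywhere, the third
# column is empty and nothing is printed.
lvs --noheadings -o vg_name,lv_name,autoactivation 2>/dev/null \
    | awk 'NF == 3 {print $1 "/" $2}'
```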
> 
> Please consider this:
> Tested-by: Michael Köppl <m.koe...@proxmox.com>
> 
> Also had a look at the changes to the best of my abilities, taking the
> previous discussions from v1 and v2 into account. Apart from one
> suggestion I added to the pve-manager 3/3 patch, the changes look good
> to me. So please also consider this:
> 
> Reviewed-by: Michael Köppl <m.koe...@proxmox.com>

Thanks for the test and the review!

Seeing that current pve-manager master now has a pve8to9 script, I'll
send a rebased v4, incorporating your comments on pve-manager 3/3.

One remaining question is where the `updatelvm` command should live. I
think pve8to9 sounds like a good place for such commands (that do
upgrade-related things that are too involved for a silent postinst). But
having to move the checklist to a `checklist` subcommand for that is a
bit weird. What do others think?

> 
> [0] https://pve.proxmox.com/wiki/Multipath
> 
> On 4/29/25 13:36, Friedrich Weber wrote:
>> # Summary
>>
>> With default settings, LVM autoactivates LVs when it sees a new VG, e.g. 
>> after
>> boot or iSCSI login. In a cluster with guest disks on a shared LVM VG (e.g. 
>> on
>> top of iSCSI/Fibre Channel (FC)/direct-attached SAS), this can indirectly 
>> cause
>> guest creation or migration to fail. See bug #4997 [1] and patch #2 for
>> details.
>>
>> The primary goal of this series is to avoid autoactivating thick LVs that 
>> hold
>> guest disks in order to fix #4997. For this, it patches the LVM storage 
>> plugin
>> to create new LVs with autoactivation disabled, and implements a pve8to9 
>> check
>> and subcommand to disable autoactivation on existing LVs (see below for
>> details).
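[Annotation: the plugin-side change boils down to one extra flag on the
`lvcreate` call. A dry-run sketch follows; all names and sizes are
placeholders, and `DRY_RUN=echo` prints the command instead of executing it.]

```shell
# Dry-run sketch of an lvcreate call with autoactivation disabled, as the
# patched LVM plugin does for new guest volumes. Names/sizes are
# placeholders; drop DRY_RUN=echo to actually run it (requires root and
# an existing VG).
DRY_RUN=echo
$DRY_RUN lvcreate --setautoactivation n \
    --name vm-100-disk-0 --size 4G shared_vg
```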
>>
>> The series does the same for LVM-thin storages. While LVM-thin storages are
>> inherently local and cannot run into #4997, it can still make sense to avoid
>> unnecessarily activating thin LVs at boot.
>>
>> This series should only be applied for PVE 9, see below.
>>
>> Marked as RFC to get feedback on the general approach and some details, see
>> patches #3 and #6. In any case, this series shouldn't be merged as-is, as it
>> adds an incomplete stub implementation of pve8to9 (see below).
>>
>> # Mixed 7/8 cluster
>>
>> Unfortunately, we need to consider the mixed-version cluster between PVE 7 
>> and
>> PVE 8 because PVE 7/Bullseye's LVM does not know `--setautoactivation`. A 
>> user
>> upgrading from PVE 7 will temporarily have a mixed 7/8 cluster. Once this
>> series is applied, the PVE 8 nodes will create new LVs with
>> `--setautoactivation n`, which the PVE 7 nodes do not know. In my tests, the
>> PVE 7 nodes can read/interact with such LVs just fine, *but*: As soon as a 
>> PVE
>> 7 node creates a new (unrelated) LV, the `--setautoactivation n` flag is 
>> reset
>> to default `y` on *all* LVs of the VG. I presume this is because creating a 
>> new
>> LV rewrites metadata, and the PVE 7 LVM doesn't write out the
>> `--setautoactivation n` flag. I imagine (have not tested) this will cause
>> problems on a mixed cluster.
>>
>> Hence, as also discussed in v2, we should only apply this series for PVE 9, 
>> as
>> we can be sure all nodes are running at least PVE 8 then.
>>
>> # pve8to9 script
>>
>> As discussed in v2, this series implements
>>
>> (a) a pve8to9 check to detect thick and thin LVs with autoactivation enabled
>> (b) a script to disable autoactivation on LVs when needed, intended to be run
>>     manually by the user during 8->9 upgrade
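[Annotation: per LV, (b) comes down to `lvchange --setautoactivation n`. A
dry-run sketch of the loop; the LV names are placeholders and `DRY_RUN=echo`
prints the commands instead of executing them.]

```shell
# Dry-run sketch of what (b) does per existing guest volume: clear the
# autoactivation flag. LV names are placeholders; drop DRY_RUN=echo to
# actually apply the change (requires root).
DRY_RUN=echo
for lv in shared_vg/vm-100-disk-0 shared_vg/vm-101-disk-0; do
    $DRY_RUN lvchange --setautoactivation n "$lv"
done
```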
>>
>> pve8to9 doesn't exist yet, so patch #4 adds a stub implementation to have a
>> basis for (a) and (b). We naturally don't have to go with this
>> implementation; I'm happy to rebase once pve8to9 exists.
>>
>> Patch #5 moves the existing checks from `pve8to9` to `pve8to9 checklist`, to 
>> be
>> able to implement (b) as a new subcommand. I realize this is a huge 
>> user-facing
>> change, and we don't have to go with this approach. It is also incomplete, as
>> patch #5 doesn't update the manpage. I included this to have a basis for the
>> next patch.
>>
>> Patch #6 implements a pve8to9 subcommand for (b), but this can be moved to a
>> dedicated script of course. Documentation for the new subcommand is missing.
>>
>> # Bonus fix for FC/SAS multipath+LVM issue
>>
>> As it turns out, this series seems to additionally fix an issue on hosts with
>> LVM on FC/SAS-attached LUNs *with multipath* where LVM would report "Device
>> mismatch detected" warnings because the LVs are activated too early in the 
>> boot
>> process before multipath is available. Our current suggested workaround is to
>> install multipath-tools-boot [2]. With this series applied and when users 
>> have
>> upgraded to 9, this shouldn't be necessary anymore, as LVs are not
>> auto-activated after boot.
>>
>> # Interaction with zfs-initramfs
>>
>> zfs-initramfs used to ship an initramfs-tools script that unconditionally
>> activates *all* VGs that are visible at boot time, ignoring the 
>> autoactivation
>> flag. A fix was already applied in v2 [3].
>>
>> # Patch summary
>>
>> - Patch #1 is preparation
>> - Patch #2 makes the LVM plugin create new LVs with `--setautoactivation n`
>> - Patch #3 makes the LVM-thin plugin disable autoactivation for new LVs
>> - Patch #4 creates a stub pve8to9 script (see pve8to9 section above)
>> - Patch #5 moves pve8to9 checks to a subcommand (see pve8to9 section above)
>> - Patch #6 adds a pve8to9 subcommand to disable autoactivation
>>   (see pve8to9 section above)
>>
>> # Changes since v2
>>
>> - drop zfsonlinux patch that was since applied
>> - add patches for LVM-thin
>> - add pve8to9 patches
>>
>> v2: 
>> https://lore.proxmox.com/pve-devel/20250307095245.65698-1-f.we...@proxmox.com/
>> v1: 
>> https://lore.proxmox.com/pve-devel/20240111150332.733635-1-f.we...@proxmox.com/
>>
>> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=4997
>> [2] 
>> https://pve.proxmox.com/mediawiki/index.php?title=Multipath&oldid=12039#%22Device_mismatch_detected%22_warnings
>> [3] 
>> https://lore.proxmox.com/pve-devel/ad4c806c-234a-4949-885d-8bb369860...@proxmox.com/
>>
>> pve-storage:
>>
>> Friedrich Weber (3):
>>   lvm: create: use multiple lines for lvcreate command line
>>   fix #4997: lvm: create: disable autoactivation for new logical volumes
>>   lvmthin: disable autoactivation for new logical volumes
>>
>>  src/PVE/Storage/LVMPlugin.pm     | 10 +++++++++-
>>  src/PVE/Storage/LvmThinPlugin.pm | 18 +++++++++++++++++-
>>  2 files changed, 26 insertions(+), 2 deletions(-)
>>
>>
>> pve-manager stable-8:
>>
>> Friedrich Weber (3):
>>   cli: create pve8to9 script as a copy of pve7to8
>>   pve8to9: move checklist to dedicated subcommand
>>   pve8to9: detect and (if requested) disable LVM autoactivation
>>
>>  PVE/CLI/Makefile   |    1 +
>>  PVE/CLI/pve8to9.pm | 1695 ++++++++++++++++++++++++++++++++++++++++++++
>>  bin/Makefile       |   12 +-
>>  bin/pve8to9        |    8 +
>>  4 files changed, 1715 insertions(+), 1 deletion(-)
>>  create mode 100644 PVE/CLI/pve8to9.pm
>>  create mode 100755 bin/pve8to9
>>
>>
>> Summary over all repositories:
>>   6 files changed, 1741 insertions(+), 3 deletions(-)
>>
> 



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
