On Wed Aug 13, 2025 at 11:02 AM CEST, Fabian Grünbichler wrote:
> On August 13, 2025 10:50 am, Max R. Carrara wrote:
> > On Wed Aug 13, 2025 at 9:52 AM CEST, Fabian Grünbichler wrote:
> >> On August 12, 2025 6:46 pm, Max R. Carrara wrote:
> >> > Introduce a new helper command pve-osd-lvm-enable-autoactivation,
> >> > which gracefully tries to enable autoactivation for all logical
> >> > volumes used by Ceph OSDs while also activating any LVs that aren't
> >> > active yet. Afterwards, the helper attempts to bring all OSDs online.
> >>
> >> I think this is probably overkill - this only affects a specific
> >> non-standard setup, the breakage is really obvious, and the fix is easy:
> >> either run lvchange on all those LVs, or recreate the OSDs after the fix
> >> for creation is rolled out..
> >>
> >> i.e., the fallout from some edge cases not being handled correctly in
> >> the 200-line helper script here is probably worse than the original
> >> issue, which only hits the few setups that we can easily help along
> >> manually..
> > 
> > I mean, this script doesn't really do much, and the LVs themselves are
> > fetched via `ceph-volume` ... But then again, you're probably right that
> > it might just break somebody else's arcane setup somewhere.
> > 
> > As an alternative, I wouldn't mind writing something for the release
> > notes' known issues section (or some other place). Assuming a standard
> > setup*, all that the user would have to do is identical to what the
> > script does, so nothing too complicated.
>
> but the known issue will be gone, except for the handful of users that
> ran into it before the fix was rolled out.. this is not something you'd
> only notice 5 months later?

Are you sure, though? There might very well be setups out there that are
only rebooted every couple of months; not everybody is as diligent with
their maintenance, unfortunately. We also don't really know how common or
rare it is to set up a DB/WAL disk for OSDs.
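
For reference, the manual fix boils down to just a few commands. A rough
sketch, assuming a standard setup, bash, and jq being available (the
`lv_path` field is what `ceph-volume lvm list --format json` reports, if
I'm not mistaken):

    # enable autoactivation for all LVs used by Ceph OSDs and
    # activate any that aren't active yet ...
    ceph-volume lvm list --format json \
        | jq -r '.[][].lv_path' \
        | while read -r lv; do
            lvchange --setautoactivation y "$lv"
            lvchange -ay "$lv"
        done

    # ... then try to bring all OSDs back online
    ceph-volume lvm activate --all

That's also pretty much what I'd put into the release notes' known
issues section.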

>
> > (*OSDs with WAL + DB on disks / partitions without anything else in between)
>
> I am not worried about the script breaking things, but about it printing
> spurious errors/warnings for unaffected setups.

Well, unaffected here would mean that all OSD LVs have autoactivation
enabled (and are also activated). Additionally, if the OSDs are already
up, `ceph-volume` doesn't do anything either.
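
To illustrate: whether a setup is affected at all should be visible at a
glance via the autoactivation report field (added in lvm2 2.03.12
together with --setautoactivation, if I remember the field names
correctly):

    # show activation state and autoactivation flag for all LVs;
    # OSD LVs without "enabled" in the last column would be affected
    lvs -o lv_path,lv_active,autoactivation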

FWIW, we could suppress the "No medium found" warnings for the LVM
commands the script runs, like we do in pve-storage [0]. We could also
short-circuit and silently exit early if no changes are necessary: if
autoactivation is enabled for all OSD LVs, we can assume that they're
activated as well, and that the OSDs themselves are therefore running,
too.
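
A rough sketch of both combined, with the stderr filter being the shell
equivalent of what [0] does in Perl (helper name hypothetical, and the
LV paths collected via ceph-volume just like the script already does):

    # drop LVM's "No medium found" noise from stderr, keep the rest
    # (requires bash for the process substitution)
    lvs_filtered() {
        lvs "$@" 2> >(grep -v 'No medium found' >&2)
    }

    # collect all OSD LV paths via ceph-volume
    mapfile -t osd_lvs < <(
        ceph-volume lvm list --format json | jq -r '.[][].lv_path'
    )

    # short-circuit: exit silently if every OSD LV already has
    # autoactivation enabled
    if ! lvs_filtered -o autoactivation --noheadings "${osd_lvs[@]}" \
        | grep -qv enabled
    then
        exit 0
    fi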

So, I would really prefer having either (an improved version of) the
script, or some documentation regarding this *somewhere*, just to not
leave any users in the dark.

[0]: https://git.proxmox.com/?p=pve-storage.git;a=blob;f=src/PVE/Storage/LVMPlugin.pm;h=0416c9e02a1d8255c940d2cd9f5e0111b784fe7c;hb=refs/heads/master#l21



