Fix #6652: LVM Autoactivation Missing for Ceph OSD LVs - v2
===========================================================

In short: When creating an OSD via the API, the logical volumes backing
the OSD's DB and WAL do not have autoactivation enabled. Ceph requires
autoactivation on these LVs, as it never activates them directly itself.
Fix this by setting autoactivation when creating those LVs, and by
providing a helper script that enables autoactivation for them during an
upgrade.

Notable Changes
---------------

- Add missing replacement for `LVMPlugin::lvcreate()` helper in OSD.pm
  (thanks Fabian!)

- Drastically limit the scope of the helper script that runs during
  upgrades, most of which was discussed off-list with Fabian as well
  (thanks!). In particular:
  - No longer activate matched LVs
  - No longer try to bring up OSDs on the node
  - Limit enabling autoactivation to LVs which back an OSD WAL or OSD DB
    * previously, LVs that back OSD devices with type "block" were also
      matched

- Limit the output of the helper script, most of which was also
  discussed off-list with Fabian (thanks again!)
  - The script is now completely silent unless an error is encountered
  - If an invoked command (ceph-volume / lvs / lvchange) fails, the
    captured stderr of that command is dumped before exiting
  - Otherwise, stdout is processed (or swallowed) and stderr is
    suppressed in order to not worry any users with spurious /
    unwarranted LVM errors

- Since pve-manager was bumped in the meantime, run the helper script
  when upgrading from a version < 9.0.6 instead of < 9.0.5
  - Additionally, only run the script when upgrading from a
    version >= 9.0~~
  - In other words, the script is only called in postinst when upgrading
    from versions where 9.0~~ <= version < 9.0.6

Additional Notes
----------------

The changes to the helper script and to when it is executed are made to
reduce any potential side-effects that calling into LVM might have. If
calling `lvs` or `lvchange --setautoactivation y` fails, then
something's *really* wrong with the device anyway--and in that case, we
also dump the captured stderr.

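As a rough illustration of the output handling described above (this is
not the actual helper script from bin/pve-osd-lvm-enable-autoactivation),
a wrapper along the following lines stays silent on success and dumps the
captured stderr only if the command fails. The name `run_quiet` is made
up for this sketch, and stdout is simply discarded here, whereas the real
script processes it where needed.

```shell
#!/bin/sh
# Hypothetical sketch: run a command silently; on failure, dump the
# stderr it produced and pass its exit status on to the caller.
run_quiet() {
    # Temporary file that holds the command's stderr.
    rq_err=$(mktemp) || return 1
    rq_status=0

    # Discard stdout and capture stderr; remember a non-zero exit status.
    "$@" >/dev/null 2>"$rq_err" || rq_status=$?

    # Only show the captured stderr if the command actually failed.
    if [ "$rq_status" -ne 0 ]; then
        cat "$rq_err" >&2
    fi

    rm -f "$rq_err"
    return "$rq_status"
}
```

A caller could then write, for example,
`run_quiet lvchange --setautoactivation y "$vg/$lv" || exit 1`, where
`$vg` and `$lv` are placeholders for the volume group and LV name.
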
Furthermore, if a node wasn't rebooted since new OSDs with DB/WAL were
set up, enabling autoactivation is all that needs to be done. Even if
the user rebooted the node before updating it, the update should enable
autoactivation for the OSD LVs--after that, another reboot is sufficient
to bring the OSDs back up.

Alternatively, if one wants to avoid a reboot for whatever reason, the
affected OSDs can be brought up again as follows:

1. Check for affected OSD LVs:

    # lvs --options lv_name,vg_name,autoactivation,active

   The names of the affected LVs should begin with "osd-wal" or "osd-db"
   followed by a UUID4.

   Example output:

    # lvs --options lv_name,vg_name,autoactivation,active
      LV                                             VG                                        AutoAct Active
      osd-db-2947e348-fe1b-4c38-b9d0-d24f3b8de70f    ceph-1dce1129-a411-4a14-8508-edcc8626c594
      osd-wal-cc31c0cf-2b40-4ea6-afc7-eea8b767f7f5   ceph-31a0a43c-990a-40dc-9027-6412b0f6673c
      osd-wal-20db00cf-b3c2-491a-bf17-4fd7c29aba6a   ceph-534964a2-b764-4ed5-a3c2-fd21aeda116a
      osd-block-dd00fa96-695e-442c-99bd-ba09c2d3bd03 ceph-6bdd82ad-09bb-4c36-9382-1c302b462d7a enabled active
      osd-db-42648fa6-eb1f-43db-91d3-16ef6605c62b    ceph-778869db-311b-475c-ba40-a61e531cc127
      osd-block-3a178277-9ccc-488e-bfc5-4089a347195c ceph-e7e5ac3e-56d4-4fc2-b853-7ed2465d6a69 enabled active
      data                                           pve                                       enabled active
      root                                           pve                                       enabled active
      swap                                           pve                                       enabled active

2. For OSD LVs for which the "AutoAct" column is empty, run the
   following command:

    # lvchange --setautoactivation y <vg_name>/<lv_name>

3. For OSD LVs which aren't active, run the following command:

    # lvchange --activate y <vg_name>/<lv_name>

4. Double-check that autoactivation is set and that the affected LVs are
   activated:

    # lvs --options lv_name,vg_name,autoactivation,active

   Example output:

    # lvs --options lv_name,vg_name,autoactivation,active
      LV                                             VG                                        AutoAct Active
      osd-db-2947e348-fe1b-4c38-b9d0-d24f3b8de70f    ceph-1dce1129-a411-4a14-8508-edcc8626c594 enabled active
      osd-wal-cc31c0cf-2b40-4ea6-afc7-eea8b767f7f5   ceph-31a0a43c-990a-40dc-9027-6412b0f6673c enabled active
      osd-wal-20db00cf-b3c2-491a-bf17-4fd7c29aba6a   ceph-534964a2-b764-4ed5-a3c2-fd21aeda116a enabled active
      osd-block-dd00fa96-695e-442c-99bd-ba09c2d3bd03 ceph-6bdd82ad-09bb-4c36-9382-1c302b462d7a enabled active
      osd-db-42648fa6-eb1f-43db-91d3-16ef6605c62b    ceph-778869db-311b-475c-ba40-a61e531cc127 enabled active
      osd-block-3a178277-9ccc-488e-bfc5-4089a347195c ceph-e7e5ac3e-56d4-4fc2-b853-7ed2465d6a69 enabled active
      data                                           pve                                       enabled active
      root                                           pve                                       enabled active
      swap                                           pve                                       enabled active

5. Finally, bring up all the OSDs again:

    # ceph-volume lvm activate --all

Previous Versions
-----------------

v1: https://lore.proxmox.com/pve-devel/20250812164631.428424-1-m.carr...@proxmox.com/T/

Summary of Changes
------------------

Max R. Carrara (2):
  fix #6652: ceph: osd: enable autoactivation for OSD LVs on creation
  fix #6652: d/postinst: enable autoactivation for Ceph OSD LVs

 PVE/API2/Ceph/OSD.pm                  |  27 +++-
 bin/Makefile                          |   3 +-
 bin/pve-osd-lvm-enable-autoactivation | 176 ++++++++++++++++++++++++++
 debian/postinst                       |  16 +++
 4 files changed, 219 insertions(+), 3 deletions(-)
 create mode 100644 bin/pve-osd-lvm-enable-autoactivation

-- 
2.47.2

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel