URL: <https://savannah.gnu.org/bugs/?60385>
Summary: GRUB fails to finish parsing LVM metadata when there's a dm-cache metadata LV with policy settings
Project: GNU GRUB
Submitted by: taoky
Submitted on: Mon 12 Apr 2021 06:48:39 PM UTC
Category: Booting
Severity: Major
Priority: 5 - Normal
Item Group: Software Error
Status: None
Privacy: Public
Assigned to: None
Originator Name:
Originator Email:
Open/Closed: Open
Release:
Release: Git master
Discussion Lock: Any
Reproducibility: None
Planned Release: None

_______________________________________________________

Details:

When a volume group contains a dm-cache metadata LV (described in lvmcache(7), "dm-cache with separate data and metadata LVs") and the cache is configured with a cache policy (described in lvmcache(7), "dm-cache cache policy"), GRUB cannot parse any LV that comes after the metadata LV.

In grub-core/disk/lvm.c, grub_lvm_detect() stops parsing LV metadata when the pointer `p` meets a '}' at the beginning of the loop (the `while (1)` at #L436). When it meets a segment with an unknown type, it does nothing other than set `skip_lv = 1`. At the end of the loop, `p` is moved to the next '}' character and then advanced by 3 (to reach the next LV's metadata). This assumes that an unknown segment can never contain a '}' of its own.

However, the cache policy metadata breaks this assumption. Here is an example segment taken from `vgcfgbackup`:

```
segment1 {
	start_extent = 0
	extent_count = 10	# 40 Megabytes

	type = "cache-pool"
	data = "mcache_cdata"
	metadata = "mcache_cmeta"
	chunk_size = 2048
	cache_mode = "writethrough"
	policy = "mq"
	policy_settings {
		migration_threshold=2048
		random_threshold=4
	}
}
```

The nested `policy_settings` block stops GRUB from parsing any further. At the end of the loop, `grub_strchr(p, '}')` takes `p` to the closing curly bracket of `policy_settings` rather than that of `segment1`. After `p` is increased by 3, it points to the closing curly bracket of `segment1`. At the next iteration `p` therefore points to a '}', and the loop ends.
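To make the failure mode concrete, here is a minimal standalone sketch in plain C. It is not GRUB code and not the attached patch: `sample_segment`, `first_closing_brace()` and `find_matching_brace()` are hypothetical names used only for illustration, and standard `strchr()` stands in for `grub_strchr()`. The first helper models only the "jump to the next '}'" step described above (the subsequent `+ 3` is left out); the second shows one possible depth-aware alternative that counts nested braces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened stand-in for an unknown-type segment that
   contains a nested block, as in the cache-pool example above. */
static const char sample_segment[] =
  "segment1 {\n"
  "type = \"cache-pool\"\n"
  "policy = \"mq\"\n"
  "policy_settings {\n"
  "migration_threshold=2048\n"
  "random_threshold=4\n"
  "}\n"
  "}\n";

/* Models the skipping step described in the report: find the next '}'
   regardless of nesting.  Returns NULL if there is none. */
static const char *
first_closing_brace (const char *p)
{
  return strchr (p, '}');
}

/* Depth-aware alternative (an illustration, not the submitted fix):
   `open` must point at a '{'.  Returns a pointer to the '}' that closes
   that block, or NULL if the braces are unbalanced. */
static const char *
find_matching_brace (const char *open)
{
  int depth = 0;
  const char *p;

  assert (open != NULL && *open == '{');

  for (p = open; *p != '\0'; p++)
    {
      if (*p == '{')
        depth++;
      else if (*p == '}' && --depth == 0)
        return p;
    }

  return NULL;
}
```

Calling both helpers with a pointer to the opening brace of `segment1` in `sample_segment` shows the difference: `first_closing_brace()` stops at the brace that closes `policy_settings`, while `find_matching_brace()` continues to the brace that closes `segment1` itself.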

In the worst case this bug can prevent the system from booting. Imagine a mirror LV whose two image LVs have their metadata after the cache pool; this is exactly what my production server ran into. In my case, the error message is "unknown node 'root_mimage_0'", and it appears when installing GRUB, when generating configuration files, and when booting.

My patch, the reproduction scripts (run.sh and clean.sh) and the sample metadata are attached. I have tested Debian's stable and unstable GRUB packages and the latest git master of GRUB.

This bug was originally reported at <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=985974>. I think it is better to report it here as well, since I'm sure the bug exists upstream.

_______________________________________________________

File Attachments:

-------------------------------------------------------
Date: Mon 12 Apr 2021 06:48:39 PM UTC  Name: 0001-lvm-fix-LVM-unknown-type-handling-bug.patch  Size: 2KiB  By: taoky
<http://savannah.gnu.org/bugs/download.php?file_id=51253>
-------------------------------------------------------
Date: Mon 12 Apr 2021 06:48:39 PM UTC  Name: run.sh  Size: 914B  By: taoky
<http://savannah.gnu.org/bugs/download.php?file_id=51254>
-------------------------------------------------------
Date: Mon 12 Apr 2021 06:48:39 PM UTC  Name: metadata  Size: 5KiB  By: taoky
<http://savannah.gnu.org/bugs/download.php?file_id=51255>
-------------------------------------------------------
Date: Mon 12 Apr 2021 06:48:39 PM UTC  Name: clean.sh  Size: 121B  By: taoky
<http://savannah.gnu.org/bugs/download.php?file_id=51256>