Bug#996915: raspi-firmware: hook breaks label-based booting

2024-06-05 Thread lymkwi

Hi,

    This issue is still ongoing on systems that rely on
`raspi-firmware`. It appeared around a year ago on one particular
Raspberry Pi 4 board I own, on which the microSD card containing the
rootfs partition competes with a USB device during enumeration, leading
to inconsistent device names. I solved the resulting boot-time hang
several months ago, but a new kernel update brought it back.


    While I personally see no inherent problem with the
`z50-raspi-firmware` post-install hook modifying kernel arguments (the
way `grub-mkconfig` would on a system with the grub bootloader, for
example), and especially because you can override the `$ROOTPART`
portion of Linux arguments, the hook should absolutely *not* use the
block device's path. Even the Raspberry Pi images for Debian [0] use
labels, as mentioned by Cyril two and a half years ago. What's even more
baffling is that the post-install hook involved correctly derives the
`$ROOTPART` variable for ZFS and btrfs-type root filesystems.


    `findmnt` is perfectly capable of returning the UUID of an ext4
partition (which I assume a lot of root partitions are, and ext4
partitions will always have a UUID). Why not add a new choice in the
case determined from `$(root_info fstype)`? It'd simply go like:


```
case $fstype in
    ext4)
        # Prefer the filesystem UUID over the device path.
        uuid="$(root_info uuid)" &&
            ROOTPART="UUID=$uuid" ||
            echo "raspi-firmware: warning: unable to determine ext4 UUID for root partition." >&2
        ;;
```

    And such. More generally, the best solution I have to propose is 
for the hook to not query `source` first in most cases. Instead, it 
should try to find the UUID of the root partition, then fall back on 
`source` if one cannot be found. ZFS and Btrfs can still be handled as 
special cases too. Something like:


```
# Prefer the filesystem UUID; leave ROOTPART empty if it cannot be determined.
uuid="$(root_info uuid)" && [ -n "$uuid" ] && ROOTPART="UUID=$uuid" || ROOTPART=""
if [ -z "$ROOTPART" ] ; then
    # Fall back on `source` but warn that it could lead to inconsistent reboot.
    ROOTPART=$(root_info source) || echo "raspi-firmware: warning: unable to determine root fs mount point." >&2
    echo "raspi-firmware: warning: fell back on root device path as kernel argument. Inconsistent device order may lead to non-booting device." >&2
fi
```

    While going through the open issues for `raspi-firmware`, I also
looked for possibly related ones: #1055084 is either caused by this bug
or at least unsolvable because of it.


    Until this issue is solved at the source, my personal workaround
will be something like this, written in `/etc/default/raspi-firmware`:


```
ROOTPART=UUID=
```
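
    For reference, the UUID to put there can be read with `findmnt`. A
minimal sketch, assuming the root filesystem is mounted at `/` and
exposes a UUID; the helper name `rootpart_line` is made up for
illustration and is not part of the hook:

```shell
# Hypothetical helper: format a ROOTPART assignment from a UUID string.
# Returns non-zero when the UUID is empty.
rootpart_line() {
    [ -n "$1" ] || return 1
    printf 'ROOTPART=UUID=%s\n' "$1"
}

# Ask findmnt (util-linux) for the UUID of the filesystem mounted at /.
uuid="$(findmnt -n -o UUID / 2>/dev/null)" || uuid=""

# Print the line to paste into /etc/default/raspi-firmware, or warn.
rootpart_line "$uuid" || echo "no UUID found for /" >&2
```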

---

Links:
[0]: https://wiki.debian.org/RaspberryPiImages



Bug#996915: raspi-firmware: hook breaks label-based booting

2021-10-20 Thread Cyril Brulebois
Package: raspi-firmware
Version: 1.20210303+ds-2
Severity: important

Hi,

Building bullseye images using raspi-team/image-specs[1] for the Pi 4 B
or Pi CM4 devices (`make raspi_4_bullseye.img`), I've been getting weird
things…

 1. https://salsa.debian.org/raspi-team/image-specs

Sometimes the boot would stop after a few seconds (last kernel log with
a timestamp around 6 seconds shown on HDMI), sometimes one would get a
prompt about a missing root filesystem, and be dropped inside an
initramfs prompt. (I've no certain answer at this point, but it seems
that the order of console= parameters in cmdline.txt might explain that
the prompt goes to the screen or the serial console; namely going to the
serial console during the first boot, and on screen for further boots,
due to the switched order after initial reconfiguration…)

Anyway, let's focus on the raspi-firmware issue:
 - image-specs sets up label-based booting, using this cmdline
   parameter: root=LABEL=RASPIROOT (similar to root=UUID=…).
 - The first boot ensures the raspi-firmware hook is called:
   /etc/kernel/postinst.d/z50-raspi-firmware
 - Running this hook results in the root= parameter's being replaced
   with the /dev/mmcblkNp2 du jour (it might be /dev/mmcblk0p2, it might
   be /dev/mmcblk1p2).
 - Further boots might see the same device come up with a different ID,
   leading to the root= vs. actual /dev/mmcblkNp2 mismatch that prevents
   booting.
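
The failure mode above comes down to whether root= names the filesystem
itself or an enumeration-dependent device node. A rough sketch of that
distinction, where the function name `root_kind` and the list of
patterns are assumptions made for illustration only:

```shell
# Hypothetical helper: classify a root= value by how it identifies the device.
root_kind() {
    case "$1" in
        # Identifiers tied to the filesystem or partition, not to probe order.
        LABEL=*|UUID=*|PARTUUID=*|/dev/disk/by-*) echo stable ;;
        # Kernel-assigned names that may change between boots.
        /dev/mmcblk*|/dev/sd*) echo unstable ;;
        *) echo unknown ;;
    esac
}

root_kind LABEL=RASPIROOT    # what image-specs writes
root_kind /dev/mmcblk1p2     # what the hook may write instead
```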

In passing, the workaround in such cases is quite simple (when the
actual error messages are accessible, and one has access to a keyboard
or a serial console depending on cases):
 - symlink /dev/mmcblkXp2 to /dev/mmcblkYp2
 - exit the initramfs shell

and booting continues.


I think I'd prefer it if raspi-firmware didn't touch an existing root=
parameter. This would probably fix the very similar #984691 (about btrfs
booting).
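
To illustrate the idea, here is a minimal sketch of picking up an
existing root= value so that it could be reused as-is. It assumes the
current command line is already available as a string; the helper name
`existing_root` and the sample command line are hypothetical, and this
is not how the hook is currently written:

```shell
# Hypothetical helper: print the value of the first root= argument found
# in a kernel command line; return non-zero if there is none.
existing_root() {
    for arg in $1; do
        case "$arg" in
            root=*) printf '%s\n' "${arg#root=}"; return 0 ;;
        esac
    done
    return 1
}

# Sample command line, made up for the example.
cmdline="console=tty0 root=LABEL=RASPIROOT rw fsck.repair=yes rootwait"

# Reuse the existing value; leave ROOTPART unset when there is none.
ROOTPART="$(existing_root "$cmdline")" || unset ROOTPART
echo "${ROOTPART:-<none>}"
```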


Cheers,
-- 
Cyril Brulebois -- Debian Consultant @ DEBAMAX -- https://debamax.com/