Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Hi,

On 16/02/2023 18:18, Samuel Thibault wrote:
> Andy Smith wrote on Thu, 16 Feb 2023 15:44:21 +:
>> - The PV part of grub is quite old and from what I understand
>>   implemented in a strange way
>
> Ah, uh :/
>
>> that no one wants to maintain any more, so this part of grub is
>> stuck without ability to understand the newer kernel compressions.

I've been using the attached script to get around the problem. It tries
to decompress the cloud kernel and then recompress it with something
that grub-xen can handle. If recompression fails, it leaves an
uncompressed kernel in place, which also works.

The script needs to be installed in /etc/kernel/postinst.d/.

--
        Aleksi Suhonen

        () ascii ribbon campaign
        /\ support plain text e-mail

#!/bin/sh
# SPDX-License-Identifier: GPL-2.0-only
# --
# kernel-recompress-hook - recompress cloud kernels for grub-xen
# Place this script in /etc/kernel/postinst.d/
# Requires: binutils, lz4, xz-utils
# (c) 2021-2023 Aleksi Suhonen
#
# Distilled from extract-vmlinux by
# (c) 2009,2010 Dick Streefland
# (c) 2011 Corentin Chary
#
# --

KERNEL_VERSION="$1"
KERNEL_PATH="$2"

# The KERNEL_PATH must be valid
if [ ! -f "${KERNEL_PATH}" ]; then
    echo >&2 "Kernel file '${KERNEL_PATH}' not found. Aborting"
    exit 1
fi

# Prepare temp files (GNU mktemp needs at least three X's in the template):
tmp=$(mktemp /boot/vmlinux-XXXXXX)
trap "rm -f $tmp ${tmp}.xz" 0

# Try to find the LZ4 header and decompress from here
LZ4_HEADER="$(printf '\002!L\030')"
for pos in `grep -F -abo "$LZ4_HEADER" "$KERNEL_PATH"`
do
    # grep counts bytes from 0, while tail counts from 1...
    pos=$((1+${pos%%:*}))
    rm -f $tmp
    tail -c+$pos "$KERNEL_PATH" | lz4 -d >$tmp
    readelf -h $tmp >/dev/null 2>&1 && break
    echo "False LZ4 header at ${pos}, looking for more..."
done

# Exit 0 even on failure so that the kernel installation itself is not aborted
if ! readelf -h $tmp > /dev/null; then
    echo >&2 "Decompression of kernel file '${KERNEL_PATH}' failed!, not a valid ELF image"
    exit 0
fi

echo "Decompression of kernel file '${KERNEL_PATH}' successful"

if xz --check=crc32 --x86 --lzma2=dict=32MiB $tmp; then
    echo "Recompression also successful"
    mv -fv ${tmp}.xz ${KERNEL_PATH}
    echo "Replacement of kernel file successful"
else
    # Fall back to the uncompressed image, which grub-xen can also load
    echo "Recompression unsuccessful"
    mv -fv ${tmp} ${KERNEL_PATH}
    echo "Replacement of kernel file successful"
fi

exit 0
#EOF
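The offset search at the heart of the hook can be tried on its own, without lz4 or readelf installed. The sketch below wraps that step in a function. The name find_lz4_offsets is made up for this note and is not part of the attached script, and it assumes a grep that supports -a, -b and -o (as GNU grep does). It prints the 1-based offset, in the form tail -c+N expects, of every candidate LZ4 magic in a file.

```shell
#!/bin/sh
# Illustrative sketch only: find_lz4_offsets is a hypothetical helper
# name, not something defined by the hook script in this thread.

# Print, one per line, the 1-based byte offset of each occurrence of the
# LZ4 magic used by the hook (bytes 02 21 4c 18) in the file given as $1.
find_lz4_offsets() {
    magic=$(printf '\002!L\030')
    # LC_ALL=C makes grep compare raw bytes. -a treats the binary file as
    # text, -b prints the 0-based byte offset, -o prints only the match.
    LC_ALL=C grep -abo -F -- "$magic" "$1" | cut -d: -f1 |
    while read -r off; do
        # grep counts from 0, while tail -c+N counts from 1
        echo $((off + 1))
    done
}
```

Each printed offset is only a candidate: as in the hook, the data starting there still has to decompress to a valid ELF image before it can be trusted.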
Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Andy Smith wrote on Thu, 16 Feb 2023 15:44:21 +:
> - The PV part of grub is quite old and from what I understand
>   implemented in a strange way

Ah, uh :/

> that no one wants to maintain any more, so this part of grub is
> stuck without ability to understand the newer kernel compressions.

Ok :/

Samuel
Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Hi Samuel,

On Thu, Feb 16, 2023 at 03:59:09PM +0100, Samuel Thibault wrote:
> Andy Smith wrote on Thu, 09 Jun 2022 15:32:38 +:
> > If you're using pvgrub2 to boot PV mode then the bad news is that it
> > seems to be largely abandoned as nobody wants to alter it to support
> > different kernel compression methods.
>
> Uh... I wonder how it is that it's not just orthogonal to whether
> booting in native/PV/PVH...

It's because:

- "native" booting is the Xen hypervisor itself booting your Debian
  kernel from out of its own filesystem, so the hypervisor needs to
  understand all kernel compression methods, and historically it has
  lagged behind upstream kernel compression types. Grub is not
  involved here.

- PV grub and PVH grub are both grub binaries that are booted by the
  hypervisor, so the hypervisor just needs to know how to boot that
  grub image, which is (a) a slower-moving target and (b) something
  that the admin of the bare metal host can recompile easily without
  interfering with guests.

HOWEVER

- The PV part of grub is quite old and from what I understand
  implemented in a strange way that no one wants to maintain any
  more, so this part of grub is stuck without ability to understand
  the newer kernel compressions.

- The PVH part of grub is more modern and uses grub's own facilities
  for loading the kernel file, so as long as grub understands a
  compression for normal Linux, it works with the PVH part of grub
  too, which is obviously a lot more maintainable.

So in summary:

For Xen there are two different places where loading a Linux kernel
can be implemented: the hypervisor or the grub bootloader.

The hypervisor is slower upstream to support new kernel compressions,
so there have been times when for example Ubuntu or Fedora has by
default been unbootable directly without getting a pre-release version
of the Xen hypervisor (or un/repacking the guest kernel compression).

The grub method is preferred so that guests can manage their own
kernels, and that has two different code paths in grub depending upon
PV mode or PVH mode.

The PV mode part doesn't seem to be maintained.

The PVH part uses more of core grub functionality so is easier to
maintain.

I would recommend defaulting to PVH mode for guests these days, unless
you are doing HVM.

There are still people who want to use old kernels that don't support
PVH mode, though. I don't think any of those old kernels are supported
by Debian at this stage, but still…

Cheers,
Andy
Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Hello,

Getting the same issue :)

Andy Smith wrote on Thu, 09 Jun 2022 15:32:38 +:
> On Thu, Jun 09, 2022 at 02:00:30PM +0300, Aleksi Suhonen wrote:
> > The underlying problem is that the cloud kernel is compressed with an
> > algorithm that grub can't uncompress. What I've been doing as a workaround
> > is that I decompress the kernel myself in a kernel install hook.
>
> Can you show us your xen domu config file? I'm interested in what
> method you are using to boot these.

Using /usr/lib/grub-xen/grub-x86_64-xen.bin here.

> If you're using pvgrub2 to boot PV mode then the bad news is that it
> seems to be largely abandoned as nobody wants to alter it to support
> different kernel compression methods.

Uh... I wonder how it is that it's not just orthogonal to whether
booting in native/PV/PVH...

> The good news is that you should be able to easily switch to PVH
> mode with pvhgrub which uses grub's core routines to decompress the
> kernel and therefore supports whatever compression methods that grub
> normally does.

Ok, so why not switch to PVH indeed. I just replaced

kernel = '/usr/lib/grub-xen/grub-x86_64-xen.bin'

with

type="pvh"
kernel = '/usr/lib/grub-xen/grub-i386-xen_pvh.bin'

and it went fine with the cloud image indeed!

Though now I have to fix the default console, to get kernel messages on
hvc0:

console=hvc0

Samuel
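For anyone wanting to make the same PV-to-PVH change on several guests, the edit described in the message above could be scripted roughly as follows. This is only a sketch: switch_to_pvh is a hypothetical helper name, the two grub paths are the ones quoted in the message, and the function assumes the config has a single kernel = line naming the PV grub image and no existing type setting. It prints the modified config to standard output and leaves the original file alone.

```shell
#!/bin/sh
# Illustrative sketch only: switch_to_pvh is a hypothetical helper name.
# The two image paths are the ones quoted in the message above.

PV_GRUB=/usr/lib/grub-xen/grub-x86_64-xen.bin
PVH_GRUB=/usr/lib/grub-xen/grub-i386-xen_pvh.bin

# Print the domU config given as $1 with the PV grub kernel line replaced
# by a type="pvh" setting plus the PVH grub image. The file is not modified.
switch_to_pvh() {
    awk -v old="$PV_GRUB" -v new="$PVH_GRUB" '
        # A kernel = ... line that names the PV grub image
        /^[ \t]*kernel[ \t]*=/ && index($0, old) {
            print "type = \"pvh\""
            print "kernel = \047" new "\047"
            next
        }
        # Every other line passes through unchanged
        { print }
    ' "$1"
}
```

Reviewing the output before replacing the real config is advisable, since a config that already sets type would need hand editing. The console=hvc0 setting mentioned above is a kernel command-line parameter and is not touched by this sketch.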
Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Hi,

On Thu, Jun 09, 2022 at 02:00:30PM +0300, Aleksi Suhonen wrote:
> The underlying problem is that the cloud kernel is compressed with an
> algorithm that grub can't uncompress. What I've been doing as a workaround
> is that I decompress the kernel myself in a kernel install hook.

Can you show us your xen domu config file? I'm interested in what
method you are using to boot these.

If you're using pvgrub2 to boot PV mode then the bad news is that it
seems to be largely abandoned as nobody wants to alter it to support
different kernel compression methods.

The good news is that you should be able to easily switch to PVH
mode with pvhgrub which uses grub's core routines to decompress the
kernel and therefore supports whatever compression methods that grub
normally does.

Cheers,
Andy
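A quick way to answer the "what method are you using" question for an existing guest is to look at the kernel = line of its config. The sketch below does that for the two grub-xen image names that appear in this thread. boot_method is a hypothetical helper name, the labels it prints are informal, and anything it does not recognise (direct kernel boot, HVM, other grub builds) is simply reported as "other" or as having no kernel line.

```shell
#!/bin/sh
# Illustrative sketch only: boot_method is a hypothetical helper name, and
# it only knows the two grub-xen image names mentioned in this thread.

# Print a rough label for how the domU config given as $1 boots.
boot_method() {
    # Last kernel = ... assignment in the file, if any
    line=$(grep -E '^[[:space:]]*kernel[[:space:]]*=' "$1" | tail -n 1)
    case $line in
        *grub-x86_64-xen.bin*)   echo "pvgrub2 (PV)" ;;
        *grub-i386-xen_pvh.bin*) echo "pvhgrub (PVH)" ;;
        "")                      echo "no kernel= line" ;;
        *)                       echo "other" ;;
    esac
}
```

A guest reported as "pvgrub2 (PV)" is the case described above where newer kernel compressions are not understood.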
Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Hi,

On Thursday, 9 June 2022 13:00:30 CEST Aleksi Suhonen wrote:
> On 07/06/2022 16:36, Diederik de Haas wrote:
> > Can you still reproduce this issue on a more recent kernel (5.10 or
> > later)?
>
> 5.10 and many later kernels still have this issue. I haven't tried 5.17
> or 5.18, but I suspect they still do.

I think that's a reasonable assumption.

> The underlying problem is that the cloud kernel is compressed with an
> algorithm that grub can't uncompress. What I've been doing as a
> workaround is that I decompress the kernel myself in a kernel install hook.

That's quite a bit of important new info!
This could mean that the problem is actually in grub?

> Dom0 package versions on the test machine:
>
> ii grub-xen-host               2.02+dfsg1-9
> ii linux-image-4.18.0-3-amd64  4.18.20-2
> ii xen-hypervisor-4.11-amd64   4.11.1~pre.20180911.5acdd26fdc+dfsg-5

Have you upgraded your Dom0 to Stable? That has grub 2.04-20, kernel 5.10
and Xen version 4.14.x and that may just fix the issue.

I haven't used the cloud kernel myself, but I do have a machine running Xen
and the compression used for cloud kernels is (AFAIK) the same as for 'normal'
kernels and I never have problems booting domU's (apart from when I mess things
up myself ;-P).
Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Hello,

On 07/06/2022 16:36, Diederik de Haas wrote:
>> Here's the error message:
>>
>> Loading Linux 5.6.0-1-cloud-amd64 ...
>> error: not xen image.
>> Loading initial ramdisk ...
>> error: you need to load the kernel first.
>
> Can you still reproduce this issue on a more recent kernel (5.10 or
> later)?

5.10 and many later kernels still have this issue. I haven't tried 5.17
or 5.18, but I suspect they still do.

The underlying problem is that the cloud kernel is compressed with an
algorithm that grub can't uncompress. What I've been doing as a
workaround is that I decompress the kernel myself in a kernel install
hook.

I'm a bit busy this week and next with a couple of huge events, so I
don't have time to provide more details until after those are done.

Best Regards,

--
        Aleksi Suhonen

        () ascii ribbon campaign
        /\ support plain text e-mail
Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Control: reassign -1 src:linux 5.5.17-1
Control: found -1 linux/5.6.7-1
Control: tag -1 moreinfo

On Mon, 20 Apr 2020 15:31:12 +0300 Aleksi Suhonen wrote:
> Package: linux-image-cloud-amd64
> Version: 5.5.17-1
> Severity: important
>
> Linux cloud image 5.5.0-2 refuses to boot in a Xen VM for me. Older
> versions up to and including 5.5.0-1 work. I'm using grub-xen to boot
> the VMs.
> ...

On Mon, 4 May 2020 05:56:50 +0300 Aleksi Suhonen wrote:
> As an update to my earlier message, I upgraded my test DomU to the
> following versions and I still have the problem:
>
> ii grub-xen                         2.04-7
> ii linux-image-5.6.0-1-cloud-amd64  5.6.7-1
>
> Here's the error message:
>
> Loading Linux 5.6.0-1-cloud-amd64 ...
> error: not xen image.
> Loading initial ramdisk ...
> error: you need to load the kernel first.

Can you still reproduce this issue on a more recent kernel (5.10 or later)?

https://salsa.debian.org/kernel-team/linux/-/commit/8c21ec896dfe43902e9a1749a93786340fe63b49
is a commit which is part of 5.5.0-2, but not 5.5.0-1 and seems like the
potential cause. But before looking further into that, I'd like to know if the
problem still exists or has been remedied since.
Bug#958311: cloud kernel 5.5.0-2 does not boot under xen
Package: linux-image-cloud-amd64
Version: 5.5.17-1
Severity: important

Linux cloud image 5.5.0-2 refuses to boot in a Xen VM for me. Older
versions up to and including 5.5.0-1 work. I'm using grub-xen to boot
the VMs.

The grub-xen (I think) complains that it's not a Xen kernel and goes
back to the menu after a few seconds.

DomU package versions on the test machine:

ii grub-xen                         2.04-6
ii linux-image-5.5.0-2-cloud-amd64  5.5.17-1

Dom0 package versions on the test machine:

ii grub-xen-host               2.02+dfsg1-9
ii linux-image-4.18.0-3-amd64  4.18.20-2
ii xen-hypervisor-4.11-amd64   4.11.1~pre.20180911.5acdd26fdc+dfsg-5

I skimmed through the patch notes, and couldn't find anything that
should cause this behaviour.

--
        Aleksi Suhonen

        () ascii ribbon campaign
        /\ support plain text e-mail