Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2023-02-20 Thread Aleksi Suhonen

Hi,

On 16/02/2023 18:18, Samuel Thibault wrote:
> Andy Smith, on Thu, 16 Feb 2023 15:44:21 +, wrote:
> >   - The PV part of grub is quite old and from what I understand
> > implemented in a strange way
>
> Ah, uh :/
>
> > that no one wants to maintain any
> > more, so this part of grub is stuck without ability to
> > understand the newer kernel compressions.


I've been using the attached script to get around the problem. It tries 
to decompress the cloud kernel and then recompress it with something 
that grub-xen can handle. If recompression fails, it leaves an 
uncompressed kernel, which also works, in place.


The script needs to be installed in /etc/kernel/postinst.d/.
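
For readers who want the installation step spelled out, here is a minimal sketch. It assumes the attachment was saved as `kernel-recompress-hook` (the name used in the script header); the helper name `install_hook` and its optional second argument are made up for illustration. Note that run-parts normally ignores file names containing dots, so the hook is installed without a `.sh` suffix.

```shell
#!/bin/sh
# Sketch only: copy the hook into the kernel postinst hook directory.
# install_hook is a hypothetical helper; the second argument exists so
# the copy can be tried against a scratch directory first.
install_hook() {
    src="$1"
    destdir="${2:-/etc/kernel/postinst.d}"
    # Create the directory if needed, then install the hook as executable.
    install -d "$destdir" &&
        install -m 0755 "$src" "$destdir/kernel-recompress-hook"
}
```

Run as root, `install_hook ./kernel-recompress-hook` puts the hook in place for future kernel installs. For a kernel that is already installed, the hook can presumably be run by hand with the same two arguments the script reads, the kernel version and the image path, for example `"$(uname -r)"` and `"/boot/vmlinuz-$(uname -r)"`.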

--
Aleksi Suhonen

() ascii ribbon campaign
/\ support plain text e-mail

#!/bin/sh
# SPDX-License-Identifier: GPL-2.0-only
# --
# kernel-recompress-hook - recompress cloud kernels for grub-xen
#   Place this script in /etc/kernel/postinst.d/
#   Requires: binutils, lz4, xz-utils
# (c) 2021-2023 Aleksi Suhonen
#
# Distilled from extract-vmlinux by
# (c) 2009,2010 Dick Streefland
# (c) 2011  Corentin Chary
#
# --

# Kernel postinst hooks are called with the version and the image path.
KERNEL_VERSION="$1"
KERNEL_PATH="$2"

# The KERNEL_PATH must be valid
if [ ! -f "${KERNEL_PATH}" ]; then
    echo >&2 "Kernel file '${KERNEL_PATH}' not found. Aborting"
    exit 1
fi

# Prepare temp files (GNU mktemp needs at least three X's in the template):
tmp=$(mktemp /boot/vmlinux-XXXXXX)
trap 'rm -f "$tmp" "${tmp}.xz"' 0

# Try to find the LZ4 header and decompress from there
LZ4_HEADER="$(printf '\002!L\030')"
for pos in $(grep -Fabo "$LZ4_HEADER" "$KERNEL_PATH")
do
    # grep counts bytes from 0, while tail counts from 1...
    pos=$((1+${pos%%:*}))
    rm -f "$tmp"
    tail -c+"$pos" "$KERNEL_PATH" | lz4 -d >"$tmp"
    readelf -h "$tmp" >/dev/null 2>&1 && break
    echo "False LZ4 header at ${pos}, looking for more..."
done

# Exit 0 even on failure so that the kernel package installation continues
if ! readelf -h "$tmp" > /dev/null; then
    echo >&2 "Decompression of kernel file '${KERNEL_PATH}' failed: not a valid ELF image"
    exit 0
fi

echo "Decompression of kernel file '${KERNEL_PATH}' successful"

# On success xz replaces $tmp with ${tmp}.xz; on failure $tmp is left as is
if xz --check=crc32 --x86 --lzma2=dict=32MiB "$tmp"; then
    echo "Recompression also successful"
    mv -fv "${tmp}.xz" "${KERNEL_PATH}"
    echo "Replacement of kernel file successful"
else
    echo "Recompression unsuccessful"
    mv -fv "${tmp}" "${KERNEL_PATH}"
    echo "Replacement of kernel file successful"
fi

exit 0
#EOF
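
The offset handling in the script's loop is easy to get wrong, so here is a small standalone sketch of just that step. `extract_from_marker` is a hypothetical helper, not part of the hook: it finds the first occurrence of a byte string in a file with `grep -abo` (0-based offsets) and prints everything from that point on with `tail -c +N` (1-based), which is why 1 is added.

```shell
#!/bin/sh
# Sketch only: print a file starting at the first occurrence of a marker.
# extract_from_marker is a hypothetical helper for illustration.
extract_from_marker() {
    marker="$1"
    file="$2"
    # grep -abo prints "OFFSET:match" with byte offsets counted from 0.
    off=$(LC_ALL=C grep -aboF -- "$marker" "$file" | head -n 1)
    [ -n "$off" ] || return 1
    # tail -c +N counts from 1, hence the +1 on grep's offset.
    tail -c "+$((1 + ${off%%:*}))" "$file"
}
```

With a file containing `junkjunkMAGICpayload`, calling `extract_from_marker MAGIC file` should print `MAGICpayload`. The hook does the same thing with the LZ4 magic and pipes the result into `lz4 -d`, retrying at the next offset when the result is not a valid ELF image.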


Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2023-02-16 Thread Samuel Thibault
Andy Smith, on Thu, 16 Feb 2023 15:44:21 +, wrote:
>   - The PV part of grub is quite old and from what I understand
> implemented in a strange way

Ah, uh :/

> that no one wants to maintain any
> more, so this part of grub is stuck without ability to
> understand the newer kernel compressions.

Ok :/

Samuel



Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2023-02-16 Thread Andy Smith
Hi Samuel,

On Thu, Feb 16, 2023 at 03:59:09PM +0100, Samuel Thibault wrote:
> Andy Smith, on Thu, 09 Jun 2022 15:32:38 +, wrote:
> > If you're using pvgrub2 to boot PV mode then the bad news is that it
> > seems to be largely abandoned as nobody wants to alter it to support
> > different kernel compression methods.
> 
> Uh... I wonder how it is that it's not just orthogonal to whether
> booting in native/PV/PVH...

It's because:

- "native" booting is the xen hypervisor itself booting your Debian
  kernel from out of its own filesystem, so the hypervisor needs to
  understand all kernel compression methods and historically it has
  lagged behind upstream kernel compression types. Grub is not
  involved here.

- PV grub and PVH grub are both grub binaries that are booted by the
  hypervisor, so the hypervisor just needs to know how to boot that
  grub image, which is (a) a slower-moving target and (b) something
  that admin of the bare metal host can recompile easily without
  interfering with guests.

  HOWEVER

  - The PV part of grub is quite old and from what I understand
    implemented in a strange way that no one wants to maintain any
    more, so this part of grub is stuck without ability to
    understand the newer kernel compressions.

  - The PVH part of grub is more modern and uses grub's own
    facilities for loading the kernel file, so as long as grub
    understands a compression for normal Linux, it works with the
    PVH part of grub too, which is obviously a lot more
    maintainable.

So in summary:

For Xen there are two different places that implement loading a
Linux kernel: the hypervisor or the grub bootloader. The hypervisor
is slower upstream to support new kernel compressions, so there have
been times when, for example, Ubuntu or Fedora has by default been
unbootable directly without getting a pre-release version of the Xen
hypervisor (or un/repacking the guest kernel compression).

The grub method is preferred so that guests can manage their own
kernels, and that has two different code paths in grub depending
upon PV mode or PVH mode. The PV mode part doesn't seem to be
maintained. The PVH part uses more of core grub functionality so is
easier to maintain.
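
As a rough illustration of the three paths described above, the following sketch guesses which one a guest uses by looking at the `kernel =` line of its domU config. It is an assumption-laden heuristic: `boot_path` and its output labels are made up here, and only the two grub binary names that appear in this thread are recognised.

```shell
#!/bin/sh
# Sketch only: classify a domU config by its kernel= line.
# boot_path and the labels it prints are hypothetical.
boot_path() {
    cfg="$1"
    # Pull the quoted value out of the first "kernel = '...'" line.
    kernel=$(sed -n "s/^[[:space:]]*kernel[[:space:]]*=[[:space:]]*['\"]\(.*\)['\"].*/\1/p" "$cfg" | head -n 1)
    case "$kernel" in
        */grub-i386-xen_pvh.bin) echo "pvh-grub" ;;
        */grub-x86_64-xen.bin)   echo "pv-grub" ;;
        "")                      echo "no-kernel-line" ;;
        *)                       echo "direct-kernel" ;;
    esac
}
```

A result of `pv-grub` would point at the unmaintained code path, `pvh-grub` at the one that uses grub's core loaders, and `direct-kernel` at a kernel image loaded from the host side without grub.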

I would recommend defaulting to PVH mode for guests these days,
unless you are doing HVM. There are still people who want to use
old kernels that don't support PVH mode, though. I don't think any of
those old kernels are supported by Debian at this stage, but still…

Cheers,
Andy



Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2023-02-16 Thread Samuel Thibault
Hello,

Getting the same issue :)

Andy Smith, on Thu, 09 Jun 2022 15:32:38 +, wrote:
> On Thu, Jun 09, 2022 at 02:00:30PM +0300, Aleksi Suhonen wrote:
> > The underlying problem is that the cloud kernel is compressed with an
> > algorithm that grub can't uncompress. What I've been doing as a workaround
> > is that I decompress the kernel myself in a kernel install hook.
> 
> Can you show us your xen domu config file? I'm interested in what
> method you are using to boot these.

Using /usr/lib/grub-xen/grub-x86_64-xen.bin here.

> If you're using pvgrub2 to boot PV mode then the bad news is that it
> seems to be largely abandoned as nobody wants to alter it to support
> different kernel compression methods.

Uh... I wonder how it is that it's not just orthogonal to whether
booting in native/PV/PVH...

> The good news is that you should be able to easily switch to PVH
> mode with pvhgrub which uses grub's core routines to decompress the
> kernel and therefore supports whatever compression methods that grub
> normally does.

Ok, so why not switch to PVH indeed. I just replaced

kernel  = '/usr/lib/grub-xen/grub-x86_64-xen.bin'

with

type="pvh"
kernel  = '/usr/lib/grub-xen/grub-i386-xen_pvh.bin'

and it went fine with the cloud image indeed!
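
The same two-line change could be scripted for several guests. The sketch below is one possible way, assuming configs that reference the PV grub binary by the path shown above; `to_pvh` is a hypothetical helper name and the config path in the usage note is only an example.

```shell
#!/bin/sh
# Sketch only: switch a domU config from PV grub to PVH grub.
# to_pvh is a hypothetical helper; back up the config before using it.
to_pvh() {
    cfg="$1"
    tmp=$(mktemp) || return 1
    {
        # Add a type line only if the config does not set one already.
        grep -q '^[[:space:]]*type[[:space:]]*=' "$cfg" || echo 'type="pvh"'
        # Force any existing type line to pvh and swap the grub binary.
        sed -e 's|^[[:space:]]*type[[:space:]]*=.*|type="pvh"|' \
            -e 's|/grub-x86_64-xen\.bin|/grub-i386-xen_pvh.bin|' "$cfg"
    } > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, `to_pvh /etc/xen/guest.cfg` (path assumed) rewrites the file in place; the guest has to be shut down and created again for the new type to take effect.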

Though now I have to fix the default console, to get kernel messages on
hvc0:

console=hvc0
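
The message does not say where the option was set. One common place on a Debian guest is `GRUB_CMDLINE_LINUX` in `/etc/default/grub`, followed by `update-grub`; the sketch below assumes exactly that, and `add_console` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: append console=hvc0 to GRUB_CMDLINE_LINUX in a grub
# defaults file. Assumes the Debian-style /etc/default/grub layout.
add_console() {
    file="$1"
    # Nothing to do if the option is already on the command line.
    grep -q '^GRUB_CMDLINE_LINUX=.*console=hvc0' "$file" && return 0
    if grep -q '^GRUB_CMDLINE_LINUX=' "$file"; then
        sed 's|^GRUB_CMDLINE_LINUX="\(.*\)"|GRUB_CMDLINE_LINUX="\1 console=hvc0"|' \
            "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        echo 'GRUB_CMDLINE_LINUX="console=hvc0"' >> "$file"
    fi
}
```

Inside the guest this would be `add_console /etc/default/grub` followed by `update-grub`, both as root.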

Samuel



Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2022-06-09 Thread Andy Smith
Hi,

On Thu, Jun 09, 2022 at 02:00:30PM +0300, Aleksi Suhonen wrote:
> The underlying problem is that the cloud kernel is compressed with an
> algorithm that grub can't uncompress. What I've been doing as a workaround
> is that I decompress the kernel myself in a kernel install hook.

Can you show us your xen domu config file? I'm interested in what
method you are using to boot these.

If you're using pvgrub2 to boot PV mode then the bad news is that it
seems to be largely abandoned as nobody wants to alter it to support
different kernel compression methods.

The good news is that you should be able to easily switch to PVH
mode with pvhgrub which uses grub's core routines to decompress the
kernel and therefore supports whatever compression methods that grub
normally does.

Cheers,
Andy



Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2022-06-09 Thread Diederik de Haas
Hi,

On Thursday, 9 June 2022 13:00:30 CEST Aleksi Suhonen wrote:
> On 07/06/2022 16:36, Diederik de Haas wrote:
> > Can you still reproduce this issue on a more recent kernel (5.10 or
> > later)?
> 
> 5.10 and many later kernels still have this issue. I haven't tried 5.17
> or 5.18, but I suspect they still do.

I think that's a reasonable assumption.

> The underlying problem is that the cloud kernel is compressed with an
> algorithm that grub can't uncompress. What I've been doing as a
> workaround is that I decompress the kernel myself in a kernel install hook.

That's quite a bit of important new info!
This could mean that the problem is actually in grub?

> Dom0 package versions on the test machine:
> 
> ii  grub-xen-host 2.02+dfsg1-9
> ii  linux-image-4.18.0-3-amd64 4.18.20-2
> ii  xen-hypervisor-4.11-amd64 4.11.1~pre.20180911.5acdd26fdc+dfsg-5

Have you upgraded your Dom0 to Stable? That has grub 2.04-20, kernel 5.10 and 
Xen version 4.14.x and that may just fix the issue.

I haven't used the cloud kernel myself, but I do have a machine running Xen 
and the compression used for cloud kernels is (AFAIK) the same as for 'normal' 
kernels and I never have problems booting domU's (apart from when I mess 
things up myself ;-P).




Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2022-06-09 Thread Aleksi Suhonen

Hello,

On 07/06/2022 16:36, Diederik de Haas wrote:
> > Here's the error message:
> >
> > Loading Linux 5.6.0-1-cloud-amd64 ...
> > error: not xen image.
> > Loading initial ramdisk ...
> > error: you need to load the kernel first.
>
> Can you still reproduce this issue on a more recent kernel (5.10 or later)?


5.10 and many later kernels still have this issue. I haven't tried 5.17 
or 5.18, but I suspect they still do.


The underlying problem is that the cloud kernel is compressed with an 
algorithm that grub can't uncompress. What I've been doing as a 
workaround is that I decompress the kernel myself in a kernel install hook.
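
To get a first idea of which compression a given kernel image uses, one can scan it for well-known magic byte sequences. The sketch below uses four of the magics that the kernel's scripts/extract-vmlinux looks for; `list_magics` is a hypothetical helper, and a hit is only a hint, since the same bytes can also occur by chance elsewhere in the image.

```shell
#!/bin/sh
# Sketch only: report which compression magics occur in a kernel image.
# list_magics is a hypothetical helper; hits can be false positives.
list_magics() {
    img="$1"
    for entry in 'gzip:\037\213\010' 'xz:\3757zXZ' 'lz4:\002!L\030' 'zstd:(\265/\375'
    do
        name=${entry%%:*}
        # The part after the colon is a printf format with octal escapes.
        magic=$(printf "${entry#*:}")
        if LC_ALL=C grep -qaF -- "$magic" "$img"; then
            echo "$name"
        fi
    done
}
```

Running `list_magics /boot/vmlinuz-5.6.0-1-cloud-amd64` on an affected guest would be expected to list `lz4`, given that the hook script in this thread searches for the LZ4 header; other names may show up too and need checking by actually trying to decompress from that offset.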


I'm a bit busy this week and next with a couple of huge events, so I 
don't have time to provide more details until after those are done.


Best Regards,

--
Aleksi Suhonen

() ascii ribbon campaign
/\ support plain text e-mail



Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2022-06-07 Thread Diederik de Haas
Control: reassign -1 src:linux 5.5.17-1
Control: found -1 linux/5.6.7-1
Control: tag -1 moreinfo

On Mon, 20 Apr 2020 15:31:12 +0300 Aleksi Suhonen wrote:
> Package: linux-image-cloud-amd64
> Version: 5.5.17-1
> Severity: important
> 
> Linux cloud image 5.5.0-2 refuses to boot in a Xen VM for me. Older 
> versions up to and including 5.5.0-1 work. I'm using grub-xen to boot 
> the VMs.
> ...

On Mon, 4 May 2020 05:56:50 +0300 Aleksi Suhonen wrote:
> As an update to my earlier message, I upgraded my test DomU to the 
> following versions and I still have the problem:
> 
> ii  grub-xen 2.04-7
> ii  linux-image-5.6.0-1-cloud-amd64  5.6.7-1
> 
> Here's the error message:
> 
> Loading Linux 5.6.0-1-cloud-amd64 ...
> error: not xen image.
> Loading initial ramdisk ...
> error: you need to load the kernel first.

Can you still reproduce this issue on a more recent kernel (5.10 or later)?

https://salsa.debian.org/kernel-team/linux/-/commit/8c21ec896dfe43902e9a1749a93786340fe63b49
is a commit which is part of 5.5.0-2, but not 5.5.0-1, and seems like the
potential cause.
But before looking further into that, I'd like to know if the problem still 
exists or has been remedied since.



Bug#958311: cloud kernel 5.5.0-2 does not boot under xen

2020-04-20 Thread Aleksi Suhonen

Package: linux-image-cloud-amd64
Version: 5.5.17-1
Severity: important

Linux cloud image 5.5.0-2 refuses to boot in a Xen VM for me. Older 
versions up to and including 5.5.0-1 work. I'm using grub-xen to boot 
the VMs.


grub-xen (I think) complains that it's not a Xen kernel and goes 
back to the menu after a few seconds.


DomU package versions on the test machine:

ii  grub-xen2.04-6
ii  linux-image-5.5.0-2-cloud-amd64 5.5.17-1

Dom0 package versions on the test machine:

ii  grub-xen-host   2.02+dfsg1-9
ii  linux-image-4.18.0-3-amd64  4.18.20-2
ii  xen-hypervisor-4.11-amd64   4.11.1~pre.20180911.5acdd26fdc+dfsg-5

I skimmed through the patch notes, and couldn't find anything that 
should cause this behaviour.


--
Aleksi Suhonen

() ascii ribbon campaign
/\ support plain text e-mail