Public bug reported:

As part of my response to the recent Meltdown and Spectre security
issues, I've started deploying the intel-microcode package (initially,
version 3.20170707.1~ubuntu16.04.0, though I realise this does not
include recent security-related fixes) to desktops and servers equipped
with Intel CPUs.

This has caused machine boots to sometimes fail, though the behaviour
does not appear deterministic.

The error reported by the kernel is:

 initramfs unpacking failed: junk in compressed archive

This then immediately leads to the kernel panicking, as the initramfs is
needed for mounting the local root filesystem.

(Fortunately, I have set the panic=300 kernel command-line option, so
physical machines that panic in this way will auto-reboot after 5
minutes, and can thus be rescued from afar via network boot.)

I've seen these failures on two different varieties of desktop (one
HP/Compaq, one Dell), and also on VMs hosted by VMware.  I believe that
this problem is a race condition during the machine's early boot
sequence, probably in the kernel, as the same machine with the same disk
contents can either boot or fail on successive boot attempts.

Unfortunately, this particular error message appears in three different
places in init/initramfs.c, so it's not precisely clear what specific
problem is occurring.

This problem has been difficult to reproduce reliably.  Affected
machines typically fail on most boot attempts, but not on every one.

Attempting to gather more information from the kernel via the 'debug'
command-line option produces more data, but this is difficult to
capture.  Attempting to also add "console=ttyS0" on a VM that was
reliably presenting this problem caused the error to stop triggering,
presumably due to changed timing.

The intel-microcode package works by prepending a CPIO archive to the
prepared initramfs image.  This archive contains microcode files, with
predictable names, for early application by the kernel.

See also:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/x86/microcode.txt
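To check whether an initramfs file on disk really has this compound
layout, something like the following rough Python sketch may help.  It
assumes the prepended archive uses the uncompressed "newc" CPIO format
and that the main initramfs follows it after zero padding; the function
names (newc_entry, skip_leading_cpio, identify_payload) are my own, the
list of compression magics is deliberately incomplete, and the microcode
path in the demo is taken from the kernel documentation linked above.
It is a diagnostic aid only, not a reimplementation of what the kernel
does in init/initramfs.c.

```python
import gzip

NEWC_MAGICS = (b"070701", b"070702")  # newc, and newc with checksum
HEADER_LEN = 110                      # 6-byte magic + 13 fields of 8 hex digits

# Leading bytes of a few common compressed-stream formats (not exhaustive).
COMPRESSION_MAGICS = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ\x00": "xz",
    b"\x28\xb5\x2f\xfd": "zstd",
}


def align4(offset: int) -> int:
    """Round offset up to the next multiple of 4 (newc padding rule)."""
    return (offset + 3) & ~3


def newc_entry(name: bytes, data: bytes = b"") -> bytes:
    """Build one newc entry; hypothetical helper used only for the demo."""
    # ino, mode, uid, gid, nlink, mtime, filesize, devmajor, devminor,
    # rdevmajor, rdevminor, namesize (including the NUL), check
    fields = [0, 0o100644, 0, 0, 1, 0, len(data), 0, 0, 0, 0, len(name) + 1, 0]
    entry = b"070701" + b"".join(b"%08X" % f for f in fields) + name + b"\0"
    entry += b"\0" * (-len(entry) % 4)   # pad header + name to 4 bytes
    entry += data
    entry += b"\0" * (-len(entry) % 4)   # pad file data to 4 bytes
    return entry


def skip_leading_cpio(blob: bytes) -> int:
    """Return the offset just past any leading uncompressed newc archives."""
    pos = 0
    while blob[pos:pos + 6] in NEWC_MAGICS:
        while True:  # walk the entries of one archive up to its trailer
            header = blob[pos:pos + HEADER_LEN]
            if len(header) < HEADER_LEN or header[:6] not in NEWC_MAGICS:
                raise ValueError(f"bad cpio header at offset {pos}")
            filesize = int(header[54:62], 16)
            namesize = int(header[94:102], 16)
            name_start = pos + HEADER_LEN
            name = blob[name_start:name_start + namesize - 1]
            data_start = align4(name_start + namesize)
            pos = align4(data_start + filesize)
            if name == b"TRAILER!!!":
                break
        while pos < len(blob) and blob[pos] == 0:  # padding between archives
            pos += 1
    return pos


def identify_payload(blob: bytes):
    """Return (offset, format name) of whatever follows the leading cpio."""
    offset = skip_leading_cpio(blob)
    for magic, name in COMPRESSION_MAGICS.items():
        if blob.startswith(magic, offset):
            return offset, name
    return offset, "unknown"


if __name__ == "__main__":
    # Synthetic stand-in for a real initramfs: early cpio + gzip payload.
    early = newc_entry(b"kernel/x86/microcode/GenuineIntel.bin", b"\0" * 10)
    early += newc_entry(b"TRAILER!!!")
    early += b"\0" * (-len(early) % 512)  # cpio tools usually pad to 512 bytes
    print(identify_payload(early + gzip.compress(b"main initramfs")))
```

To examine a real image, read the file in binary mode and pass the bytes
to identify_payload().  An "unknown" result, or a ValueError, on a file
that otherwise boots would suggest the file does not match the
assumptions above rather than proving that it is corrupt.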

Removing the intel-microcode package, and thus regenerating initramfs
files without any CPIO archive prepended to them, appears to prevent
this issue from triggering.  My suspicion is that the kernel is failing
to handle this compound archive structure in a reliable way.

However, it's conceivable that this problem is not in the Linux kernel,
but in the GRUB2 bootloader in use on these machines.  As I understand
things, it is the responsibility of the GRUB2 bootloader to read the
kernel and initramfs files from disk, load both into memory, and then
transfer control to the kernel.  If so, GRUB2 may be failing to reliably
parse the btrfs root filesystem data structures, in which case the
kernel would be correctly rejecting an invalid initramfs payload passed
to it.

That said, given I've been successfully using GRUB2 and btrfs in this
way without issue for some years with a variety of kernels and initramfs
configurations, this strikes me as being less likely.

I have no reason to believe that this issue is limited to this (major)
version of the kernel.

** Affects: linux-hwe (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "Photograph of HP/Compaq desktop machine presenting early boot panic"
   https://bugs.launchpad.net/bugs/1743798/+attachment/5038285/+files/WP_20180115_11_34_03_Pro.jpg

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1743798

Title:
  Kernel sometimes panics during early boot if CPU microcode archive
  prepended to initramfs

