[Touch-packages] [Bug 1842320] Re: Can't boot: "error: out of memory." immediately after the grub menu

2022-08-18 Thread Ben Hoyt
A couple of days ago I upgraded my 21.10 system to 22.04 and got this
"out of memory" issue (Dell XPS 15 9550 with 3840x2160 display).

The fix in comment #41 worked (changing the compression to 19), and the
sequence of commands in #44 helped. However, just noting that there are
several typos in #44 that had me stumped for a while (it's the first
time I've used chroot to update grub).

So here are (I think!) the corrected commands to update grub:

$ sudo su -
# mkdir root
# mkdir boot
# mkdir efi
# cryptsetup luksOpen /dev/nvme0n1p3 nvme0n1p3_crypt
# mount /dev/mapper/vgubuntu-root ./root
# mount /dev/nvme0n1p1 ./efi   # was "nvme01p1"
# mount /dev/nvme0n1p2 ./boot  # was "nvme01p2"
# mount -B ./boot ./root/boot
# mount -B ./efi ./root/boot/efi
# mount -B /dev ./root/dev # was: mount -B ./dev ./root/boot/dev
# mount -B /dev/pts ./root/dev/pts # was: mount -B ./dev/pts ./root/boot/dev/pts
# mount -B /proc ./root/proc   # was: mount -B ./proc ./root/proc
# mount -B /sys ./root/sys # was: mount -B ./sys ./root/sys
# chroot ./root
# ### Now edit line 196 of /usr/sbin/mkinitramfs to to use -19 instead of -1
# update-initramfs -u -k all
# update-grub
# exit
# reboot

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to initramfs-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1842320

Title:
  Can't boot: "error: out of memory." immediately after the grub menu

Status in grub:
  Unknown
Status in OEM Priority Project:
  Triaged
Status in grub2-signed package in Ubuntu:
  Confirmed
Status in initramfs-tools package in Ubuntu:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  [Impact]

   * In some cases, if the users’ initramfs grow bigger, then it’ll
  likely not be able to be loaded by grub2.

   * Some real cases from OEM projects:

  In many built-in 4k monitor laptops with nvidia drivers, the u-d-c
  puts the nvidia*.ko to initramfs which grows the initramfs to ~120M.
  Also the gfxpayload=auto will remain to use 4K resolution since it’s
  what EFI POST passed.

  In this case, the grub isn't able to load initramfs because the
  grub_memalign() won't be able to get suitable memory for the larger
  file:

  ```
  #0 grub_memalign (align=1, size=592214020) at ../../../grub-core/kern/mm.c:376
  #1 0x7dd7b074 in grub_malloc (size=592214020) at 
../../../grub-core/kern/mm.c:408
  #2 0x7dd7a2c8 in grub_verifiers_open (io=0x7bc02d80, type=131076)
  at ../../../grub-core/kern/verifiers.c:150
  #3 0x7dd801d4 in grub_file_open (name=0x7bc02f00 
"/boot/initrd.img-5.17.0-1011-oem",
  type=131076) at ../../../grub-core/kern/file.c:121
  #4 0x7bcd5a30 in ?? ()
  #5 0x7fe21247 in ?? ()
  #6 0x7bc030c8 in ?? ()
  #7 0x00017fe21238 in ?? ()
  #8 0x7bcd5320 in ?? ()
  #9 0x7fe21250 in ?? ()
  #10 0x in ?? ()
  ```

  Based on grub_mm_dump, we can see the memory fragment (some parts seem
  likely be used because of 4K resolution?) and doesn’t have available
  contiguous memory for larger file as:

  ```
  grub_real_malloc(...)
  ...
  if (cur->size >= n + extra)
  ```

  Based on UEFI Specification Section 7.2[1] and UEFI driver writers’
  guide 4.2.3[2], we can ask 32bits+ on AllocatePages().

  As most X86_64 platforms should support 64 bits addressing, we should
  extend GRUB_EFI_MAX_USABLE_ADDRESS to 64 bits to get more available
  memory.

   * When users grown the initramfs, then probably will get initramfs
  not found which really annoyed and impact the user experience (system
  not able to boot).

  [Test Plan]

   * detailed instructions how to reproduce the bug:

  1. Any method to grow the initramfs, such as install nvidia-driver.

  2. If developers would like to reproduce, then could dd if=/dev/random
  of=... bs=1M count=500, something like:

  ```
  $ cat /usr/share/initramfs-tools/hooks/zzz-touch-a-file
  #!/bin/sh

  PREREQ=""

  prereqs()
  {
  echo "$PREREQ"
  }

  case $1 in
  # get pre-requisites
  prereqs)
  prereqs
  exit 0
  ;;
  esac

  . /usr/share/initramfs-tools/hook-functions
  dd if=/dev/random of=${DESTDIR}/test-500M bs=1M count=500
  ```

  And then update-initramfs

   * After applying my patches, the issue is gone.

   * I did also test my test grubx64.efi in:

  1. X86_64 qemu with
  1.1. 60M initramfs + 5.15.0-37-generic kernel
  1.2. 565M initramfs + 5.17.0-1011-oem kernel

  2. Amd64 HP mobile workstation with
  2.1. 65M initramfs + 5.15.0-39-generic kernel
  2.2. 771M initramfs + 5.17.0-1011-oem kernel

  All working well.

  [Where problems could occur]

  * The changes almost in i386/efi, thus the impact will be in the i386 / 
x86_64 EFI system.
  The other change is to modify the “grub-core/kern/efi/mm.c” but I use the 
original addressing for “arm/arm64/ia64/riscv32/riscv64”.
  Thus it should not impact them.

  * There is a “#if 

[Touch-packages] [Bug 1842320] Re: Can't boot: "error: out of memory." immediately after the grub menu

2022-08-15 Thread Ben Hoyt
Joseph, out of interest, what model of XPS 13 do you have? I'm getting
new XPS 13 soon and wondering if it'll have the same issue (I guess it's
likely given the 4k screen).

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to initramfs-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1842320

Title:
  Can't boot: "error: out of memory." immediately after the grub menu

Status in grub:
  Unknown
Status in OEM Priority Project:
  Triaged
Status in grub2-signed package in Ubuntu:
  Confirmed
Status in initramfs-tools package in Ubuntu:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  [Impact]

   * In some cases, if the users’ initramfs grow bigger, then it’ll
  likely not be able to be loaded by grub2.

   * Some real cases from OEM projects:

  In many built-in 4k monitor laptops with nvidia drivers, the u-d-c
  puts the nvidia*.ko to initramfs which grows the initramfs to ~120M.
  Also the gfxpayload=auto will remain to use 4K resolution since it’s
  what EFI POST passed.

  In this case, the grub isn't able to load initramfs because the
  grub_memalign() won't be able to get suitable memory for the larger
  file:

  ```
  #0 grub_memalign (align=1, size=592214020) at ../../../grub-core/kern/mm.c:376
  #1 0x7dd7b074 in grub_malloc (size=592214020) at 
../../../grub-core/kern/mm.c:408
  #2 0x7dd7a2c8 in grub_verifiers_open (io=0x7bc02d80, type=131076)
  at ../../../grub-core/kern/verifiers.c:150
  #3 0x7dd801d4 in grub_file_open (name=0x7bc02f00 
"/boot/initrd.img-5.17.0-1011-oem",
  type=131076) at ../../../grub-core/kern/file.c:121
  #4 0x7bcd5a30 in ?? ()
  #5 0x7fe21247 in ?? ()
  #6 0x7bc030c8 in ?? ()
  #7 0x00017fe21238 in ?? ()
  #8 0x7bcd5320 in ?? ()
  #9 0x7fe21250 in ?? ()
  #10 0x in ?? ()
  ```

  Based on grub_mm_dump, we can see the memory fragment (some parts seem
  likely be used because of 4K resolution?) and doesn’t have available
  contiguous memory for larger file as:

  ```
  grub_real_malloc(...)
  ...
  if (cur->size >= n + extra)
  ```

  Based on UEFI Specification Section 7.2[1] and UEFI driver writers’
  guide 4.2.3[2], we can ask 32bits+ on AllocatePages().

  As most X86_64 platforms should support 64 bits addressing, we should
  extend GRUB_EFI_MAX_USABLE_ADDRESS to 64 bits to get more available
  memory.

   * When users grown the initramfs, then probably will get initramfs
  not found which really annoyed and impact the user experience (system
  not able to boot).

  [Test Plan]

   * detailed instructions how to reproduce the bug:

  1. Any method to grow the initramfs, such as install nvidia-driver.

  2. If developers would like to reproduce, then could dd if=/dev/random
  of=... bs=1M count=500, something like:

  ```
  $ cat /usr/share/initramfs-tools/hooks/zzz-touch-a-file
  #!/bin/sh

  PREREQ=""

  prereqs()
  {
  echo "$PREREQ"
  }

  case $1 in
  # get pre-requisites
  prereqs)
  prereqs
  exit 0
  ;;
  esac

  . /usr/share/initramfs-tools/hook-functions
  dd if=/dev/random of=${DESTDIR}/test-500M bs=1M count=500
  ```

  And then update-initramfs

   * After applying my patches, the issue is gone.

   * I did also test my test grubx64.efi in:

  1. X86_64 qemu with
  1.1. 60M initramfs + 5.15.0-37-generic kernel
  1.2. 565M initramfs + 5.17.0-1011-oem kernel

  2. Amd64 HP mobile workstation with
  2.1. 65M initramfs + 5.15.0-39-generic kernel
  2.2. 771M initramfs + 5.17.0-1011-oem kernel

  All working well.

  [Where problems could occur]

  * The changes almost in i386/efi, thus the impact will be in the i386 / 
x86_64 EFI system.
  The other change is to modify the “grub-core/kern/efi/mm.c” but I use the 
original addressing for “arm/arm64/ia64/riscv32/riscv64”.
  Thus it should not impact them.

  * There is a “#if defined(__x86_64__)” which intent to limit the >
  32bits code in i386 system and also

  ```
   #if defined (__code_model_large__)
  -#define GRUB_EFI_MAX_USABLE_ADDRESS 0x
  +#define GRUB_EFI_MAX_USABLE_ADDRESS __UINTPTR_MAX__
  +#define GRUB_EFI_MAX_ALLOCATION_ADDRESS 0x7fff
   #else
   #define GRUB_EFI_MAX_USABLE_ADDRESS 0x7fff
  +#define GRUB_EFI_MAX_ALLOCATION_ADDRESS 0x3fff
   #endif
  ```

  If everything works as expected, then i386 should working good.

  If not lucky, based on “UEFI writers’ guide”[2], the i386 will get >
  4GB memory region and never be able to access.

  [Other Info]

   * Upstream grub2 bug #61058
  https://savannah.gnu.org/bugs/index.php?61058

   * Test PPA: https://launchpad.net/~os369510/+archive/ubuntu/lp1842320

   * Test grubx64.efi:
  https://people.canonical.com/~jeremysu/lp1842320/grubx64.efi.lp1842320

   * Test source code: https://github.com/os369510/grub2/tree/lp1842320

   * If you built the package, then test grubx64.efi is under