Bug#1036019: debian-installer: Broken X display with QEMU under UEFI with cirrus and std graphics

2023-05-17 Thread Cyril Brulebois
Control: tag -1 - patch

Hi,

Thanks for the proposed patch but, as discussed elsewhere, it seemed too
risky to force 32 bpp on everyone, so I went for what looked like the
least risky option (adding bochs.ko and cirrus.ko, manually and for the
time being).

Ben Hutchings  (2023-05-14):
> I think the problem is this: GRUB has native drivers for Bochs and
> Cirrus that reprogram the framebuffer bit depth, and the kernel is then
> confused about what the bit depth is supposed to be.  With QXL, GRUB
> doesn't have a native driver so it doesn't reconfigure the framebuffer.

I've spent some time trying to reproduce these issues under UEFI but
without Secure Boot, and I failed. So I've moved to learning how to sign
a Linux kernel (certutil, pesign, mokutil, etc.), and I've added some
debugging information in various places.

Under Secure Boot, with the default QEMU driver (std aka. bochs),
initialization happens via:

drivers/firmware/efi/libstub/x86-stub.c

and its setup_graphics() that grabs the screen info part of boot params
and starts by zero-ing it:

si = &boot_params->screen_info;
memset(si, 0, sizeof(*si));

before trying efi_setup_gop() and setup_uga() in turn; the former being
current, the latter being the old standard.

Moving on to:

drivers/firmware/efi/libstub/gop.c

we see that its efi_setup_gop() calls setup_gop(), which in turn calls
find_gop(). That last one gets hold of a suitable GOP pointer:
  https://uefi.org/specs/UEFI/2.10/12_Protocols_Console_Support.html#graphics-output-protocol

The rest of setup_gop() then uses information contained within that
structure to derive all relevant information, filling the screen_info
structure. That structure is then trusted by efifb, which can do nothing
else but fail miserably…

The si (screen_info) is set starting here:
  https://elixir.bootlin.com/linux/v6.1.27/source/drivers/firmware/efi/libstub/gop.c#L534

Adding some debug, here's what I get with GRUB set to 800x600x24:

info->version: 0
info->horizontal_resolution: 1024
info->vertical_resolution: 768
info->pixel_format: 1
  info->pixel_information.red_mask: 0
  info->pixel_information.green_mask: 0
  info->pixel_information.blue_mask: 0
  info->pixel_information.reserved_mask: 0
info->pixels_per_scan_line: 1024

Let's see:

 - Of course width, height, and pixels_per_scan_line are incorrect
   (1024, 768, and 1024 reported, where GRUB was set to 800x600).

 - pixel_format 1 means PIXEL_BGR_RESERVED_8BIT_PER_COLOR aka
   PixelBlueGreenRedReserved8BitPerColor in the spec, which means:

   A pixel is 32-bits and byte zero represents blue, byte one
   represents green, byte two represents red, and byte three is
   reserved. This is the definition for the physical frame buffer.
   The byte values for the red, green, and blue components represent
   the color intensity. This color intensity value range from a
   minimum intensity of 0 to maximum intensity of 255.

 - And masks are all 0.
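
As a small illustration of the byte order quoted from the spec above, here
is a sketch in Python. The helper name pack_bgrx is made up for this
example and does not come from the kernel or from GRUB; it only shows how
one pixel is laid out in memory for that format.

```python
def pack_bgrx(red, green, blue):
    """Pack one pixel as blue, green, red, reserved (one byte each).

    Hypothetical helper: it mirrors the byte order described for
    PixelBlueGreenRedReserved8BitPerColor, nothing more.
    """
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("intensity must be in the range 0..255")
    # Byte zero is blue, byte one green, byte two red, byte three reserved.
    return bytes([blue, green, red, 0])


pixel = pack_bgrx(255, 128, 0)
# Read back as a little-endian 32-bit word, red lands in bits 16..23.
as_word = int.from_bytes(pixel, "little")
print(pixel.hex(), hex(as_word))
```

Every pixel takes 4 bytes in this layout, which is why a framebuffer that
really holds 3 bytes per pixel cannot be displayed correctly through it.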

So for this particular GRUB configuration to work, I've verified that
fixing all those fields was leading to a correct display via efifb
(having dropped bochs.ko to stick to efifb):

info->horizontal_resolution = 800;
info->vertical_resolution   = 600;
info->pixels_per_scan_line  = 800;

info->pixel_format = PIXEL_BIT_MASK;
info->pixel_information.red_mask      = 0x00ff0000;
info->pixel_information.green_mask    = 0x0000ff00;
info->pixel_information.blue_mask     = 0x000000ff;
info->pixel_information.reserved_mask = 0x00000000;

Setting PIXEL_BIT_MASK means masks become relevant, and bits set in
those are added to determine the actual color depth, instead of a
hardcoded 32, giving me (and efifb) 24. And even:

efifb: mode is 800x600x24, linelength=2400, pages=1
efifb: scrolling: redraw
efifb: Truecolor: size=0:8:8:8, shift=0:16:8:0

instead of the dreaded:

efifb: mode is 1024x768x32, linelength=4096, pages=1
efifb: scrolling: redraw
efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
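
To make the arithmetic behind those two efifb reports explicit, here is a
rough Python sketch. It is not the kernel code: find_bits and
describe_mode are names chosen for this example, and the mask values are
an assumption (8-bit channels at bit offsets 16, 8 and 0) picked to match
the sizes and shifts that efifb prints. The depth is the sum of the
channel sizes and the line length is pixels_per_scan_line times the depth
in bytes.

```python
def find_bits(mask):
    """Return (shift, size) of the lowest contiguous run of set bits."""
    if mask == 0:
        return 0, 0
    shift = (mask & -mask).bit_length() - 1  # index of the lowest set bit
    run = mask >> shift
    size = 0
    while run & 1:  # count consecutive set bits from there
        size += 1
        run >>= 1
    return shift, size


def describe_mode(width, height, pixels_per_scan_line,
                  red, green, blue, reserved):
    """Format a mode summary in the same order efifb uses in its log."""
    fields = [find_bits(mask) for mask in (reserved, red, green, blue)]
    depth = sum(size for _, size in fields)
    linelength = pixels_per_scan_line * depth // 8
    sizes = ":".join(str(size) for _, size in fields)
    shifts = ":".join(str(shift) for shift, _ in fields)
    return (f"mode is {width}x{height}x{depth}, linelength={linelength}",
            f"Truecolor: size={sizes}, shift={shifts}")


# Assumed 24-bit masks (no reserved bits), 800x600.
fixed = describe_mode(800, 600, 800, 0x00FF0000, 0x0000FF00, 0x000000FF, 0)
# Assumed 32-bit BGRX layout, 1024x768.
broken = describe_mode(1024, 768, 1024,
                       0x00FF0000, 0x0000FF00, 0x000000FF, 0xFF000000)
print(fixed)
print(broken)
```

With a 2400-byte line interpreted as a 4096-byte one, every row starts at
the wrong offset, which would fit the garbled display.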

Now, where to go from here? It seems pretty clear to me at this point
that the Linux kernel only relies on information that was obtained via
that GOP pointer, and does its best afterward.

The way the function call works seems pretty similar to what happens in
GRUB, so I'd think that the problem is likely *not* in the kernel, but
rather:

 - GRUB fails to set mode information properly.

 - OVMF drops the ball and returns some default information.


> Unfortunately, with Secure Boot we have to use a monolithic GRUB build
> so I can't easily exclude video_bochs and video_cirrus to see if that
> improves matters.

Applying my new pesign skills on GRUB is the next step, but I have to
spend some time on another topic before Bookworm… It is possible that
trying to build a debug-enabled OVMF package might yield interesting
results, since AFAIUI that's the one implementing the back and forth…
If that's indeed the case, it should be easy to see what's written by
GRUB vs. what's read by Linux?
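
For what such a "written vs. read" comparison could look like, here is a
minimal Python sketch. The function diff_mode is hypothetical; the two
dictionaries only restate numbers from this message (the 800x600 mode
configured in GRUB and the values seen in the debug output), they are not
taken from an OVMF trace.

```python
def diff_mode(expected, reported):
    """Return {field: (expected, reported)} for every field that differs."""
    return {
        key: (expected[key], reported.get(key))
        for key in expected
        if expected[key] != reported.get(key)
    }


# What GRUB was configured to set.
requested = {
    "horizontal_resolution": 800,
    "vertical_resolution": 600,
    "pixels_per_scan_line": 800,
}
# What the kernel's EFI stub saw through the GOP pointer.
reported = {
    "horizontal_resolution": 1024,
    "vertical_resolution": 768,
    "pixels_per_scan_line": 1024,
}

for field, (want, got) in diff_mode(requested, reported).items():
    print(f"{field}: expected {want}, reported {got}")
```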



Bug#1036019: debian-installer: Broken X display with QEMU under UEFI with cirrus and std graphics

2023-05-14 Thread Ben Hutchings
On Sun, 2023-05-14 at 19:40 +0200, Ben Hutchings wrote:
[...]
> This works for me with all the QEMU graphics devices.  But I haven't
> tested on real hardware.

Now tested successfully on 2 custom desktops:

- Asus P8Z68-V LX motherboard, Intel Core i5 2500 CPU, integrated GPU
- ASRock B450 PRO4, AMD Ryzen 5 3600 CPU, Radeon RX580 GPU

and 2 laptops:

- Lenovo ThinkPad T420, Intel Core i5 2nd gen CPU, integrated GPU
- Lenovo ThinkPad T460, Intel Core i5 6th gen CPU, integrated GPU

Ben.

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer




Bug#1036019: debian-installer: Broken X display with QEMU under UEFI with cirrus and std graphics

2023-05-14 Thread Ben Hutchings
Control: tag -1 patch

On Sun, 2023-05-14 at 00:21 +0200, Ben Hutchings wrote:
[...]
> So I suppose there's a regression in either efifb or fbdev_drv.

I'm not spotting any functional changes in fbdev or the submodules it
depends on between bullseye and bookworm.  So this implicates either
efifb or, as you mentioned, GRUB.

> > Via QEMU, under BIOS and UEFI, results are:
> > 
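> >   +-------------+-----------------+-----------------+-----------------+
> >   |  Graphics   |  Bullseye 11.7  |  Bookworm RC 2  |  Daily builds   |
> >   +-------------+--------+--------+--------+--------+--------+--------+
> >   |             |  BIOS  |  UEFI  |  BIOS  |  UEFI  |  BIOS  |  UEFI  |
> >   +-------------+--------+--------+--------+--------+--------+--------+
> >   |             |   OK   |   OK   |   OK   |  KO-G  |   OK   |  KO-G  |
> >   | -vga std    |   OK   |   OK   |   OK   |  KO-G  |   OK   |  KO-G  |
> >   | -vga cirrus |   OK   |   OK   |   OK   |  KO-S  |   OK   |  KO-S  |
> >   | -vga qxl    |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
> >   | -vga virtio |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
> >   | -vga vmware |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
> >   +-------------+--------+--------+--------+--------+--------+--------+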
> 
> I started testing with QEMU and OVMF from unstable, and I'm instead
> seeing Xorg failing to start in the same cases you see glitches.  The
> relevant error message seems to be this one:
> http://codesearch.debian.net/show?file=xorg-server_2%3A21.1.7-3%2Fhw%2Fxfree86%2Ffbdevhw%2Ffbdevhw.c&line=504
[...]

I tested with QEMU from bullseye and OVMF from unstable, and again I
saw Xorg failing to start, rather than glitches.  Weird.

I also patched the kernel to report the internal screen_info structure
and the fb_var_screeninfo structure passed in and out of
FBIOPUT_VSCREENINFO.  The key difference is:

- With -vga qxl, screen_info says 32 bpp, X wants 32 bpp, the kernel
  agrees with that.
- With -vga std or -vga cirrus screen_info says 24 bpp, X wants 32
  bpp, and the kernel says 24 bpp.
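
The mismatch in the second case can be pictured with a toy model in
Python. This is only an illustration of the observation above, not the
actual FBIOPUT_VSCREENINFO code path: put_vscreeninfo and negotiate are
hypothetical names, and the model assumes a framebuffer whose depth is
fixed by screen_info and cannot be changed by the caller.

```python
def put_vscreeninfo(fixed_bpp, requested_bpp):
    """Toy stand-in for a fixed-mode framebuffer.

    Assumption for this sketch: the request is ignored and the depth
    recorded in screen_info is reported back.
    """
    return fixed_bpp


def negotiate(fixed_bpp, requested_bpp=32):
    """Return "ok" when the granted depth matches what X asked for."""
    granted = put_vscreeninfo(fixed_bpp, requested_bpp)
    if granted == requested_bpp:
        return "ok"
    return f"mismatch: wanted {requested_bpp} bpp, got {granted} bpp"


print("qxl:", negotiate(32))
print("std/cirrus:", negotiate(24))
```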

I think the problem is this: GRUB has native drivers for Bochs and
Cirrus that reprogram the framebuffer bit depth, and the kernel is then
confused about what the bit depth is supposed to be.  With QXL, GRUB
doesn't have a native driver so it doesn't reconfigure the framebuffer.

Unfortunately, with Secure Boot we have to use a monolithic GRUB build
so I can't easily exclude video_bochs and video_cirrus to see if that
improves matters.

But what does work for me is:

--- a/build/boot/x86/grub/grub-efi.cfg
+++ b/build/boot/x86/grub/grub-efi.cfg
@@ -5,7 +5,7 @@ else
 fi
 
 if loadfont $font ; then
-  set gfxmode=800x600
+  set gfxmode=800x600x32
   set gfxpayload=keep
   insmod efi_gop
   insmod efi_uga
--- END ---

A full patch is attached.

This works for me with all the QEMU graphics devices.  But I haven't
tested on real hardware.
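
To spell out what the one-line change does, here is a short Python
sketch that splits a gfxmode value of the form WIDTHxHEIGHT[xDEPTH]. It
is not GRUB's parser (which also accepts "auto" and lists of modes);
parse_gfxmode is a name invented for this example. The point is simply
that "800x600" leaves the depth unspecified while "800x600x32" pins it.

```python
import re

# WIDTHxHEIGHT with an optional xDEPTH suffix.
_MODE = re.compile(r"^(\d+)x(\d+)(?:x(\d+))?$")


def parse_gfxmode(value):
    """Return (width, height, depth); depth is None when not given."""
    match = _MODE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported gfxmode value: {value!r}")
    width, height, depth = match.groups()
    return int(width), int(height), (int(depth) if depth else None)


print(parse_gfxmode("800x600"))      # depth left to the video driver
print(parse_gfxmode("800x600x32"))   # depth requested explicitly
```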


Ben.

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer
From 49a5e562850e3ae4f64ed2d61bd582d8adedc393 Mon Sep 17 00:00:00 2001
From: Ben Hutchings 
Date: Sun, 14 May 2023 19:17:45 +0200
Subject: [PATCH] Always use 32 bpp for GRUB EFI graphical menu (Closes:
 #1036019)

---
 build/boot/x86/grub/grub-efi.cfg | 2 +-
 debian/changelog                 | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/build/boot/x86/grub/grub-efi.cfg b/build/boot/x86/grub/grub-efi.cfg
index 0a9a67d48..14708c7bc 100644
--- a/build/boot/x86/grub/grub-efi.cfg
+++ b/build/boot/x86/grub/grub-efi.cfg
@@ -5,7 +5,7 @@ else
 fi
 
 if loadfont $font ; then
-  set gfxmode=800x600
+  set gfxmode=800x600x32
   set gfxpayload=keep
   insmod efi_gop
   insmod efi_uga
diff --git a/debian/changelog b/debian/changelog
index 4624187fe..6be6864b5 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,8 +1,12 @@
 debian-installer (20230428) UNRELEASED; urgency=medium
 
+  [ Cyril Brulebois ]
   * Bump Linux kernel ABI to 6.1.0-9.
   * Switch source format from 1.0 to 3.0 (native).
 
+  [ Ben Hutchings ]
+  * Always use 32 bpp for GRUB EFI graphical menu (Closes: #1036019)
+
  -- Cyril Brulebois   Thu, 27 Apr 2023 22:52:15 +0200
 
 debian-installer (20230427) unstable; urgency=medium




Bug#1036019: debian-installer: Broken X display with QEMU under UEFI with cirrus and std graphics

2023-05-14 Thread Cyril Brulebois
Cyril Brulebois  (2023-05-14):
> Also, I should note that while my focus was on netboot-gtk mini.iso
> (because it's much quicker to rebuild/tweak than a netinst image), I'm
> replicating those results with the netinst images:
[…]
>  - Bookworm RC 1 has a “text-like” GRUB, all good.
>  - Bookworm RC 2 has a “graphical” GRUB, issues!

While adjusting my “nasty” approach to make sure it would build on all
three modified archs (amd64, arm64, i386), it occurred to me that:

 - Of course the trivial patch wouldn't work, because some builds aren't
   “pure GTK” builds, like cdrom-xen, and that one would also need the
   686-pae flavour on i386.

 - Of course it wouldn't work on arm64 either, since that one doesn't
   ship vboxvideo.ko.

 - And more importantly, we have the fb-modules udeb in various places,
   including for builds that aren't about the graphical installer…


And at this point, it seems fair to say that at least the Linux kernel
isn't perfect, as problems show up even without X in the picture!

 - With Bookworm RC 1 netinst amd64 (again under UEFI), switch from
   default “Graphical install” to “Install”: the text installer shows
   up with both std and cirrus.

 - With Bookworm RC 2 netinst amd64 (again under UEFI), switch from
   default “Graphical install” to “Install”: the screen is garbled
   with std, split with cirrus.


This is easily confirmed:

 - Triggering a debian-installer “netboot” build (not “netboot-gtk”):
   the resulting mini.iso exhibits the same problems as Bookworm RC 2
   using “Install”, with both std and cirrus.

 - Patching that “netboot” build to benefit from the extra DRM modules
   makes those issues go away, with both std and cirrus.

 - Alternatively, not patching the “netboot” build but reverting the same
   patch as mentioned before makes those issues go away, with both std and
   cirrus:
   https://salsa.debian.org/installer-team/debian-installer/-/commit/a4dc8c0fe7ad1a0c1506125ad9985f78819a1bb2


For the very short term (RC 3), I think I'll implement the following:

 1. Consider archs with the graphical installer (that's been my main
    focus until a few hours ago, when I started realizing the console
    without X was also impacted), even if other archs include fb-modules
    as well.
    This means: amd64, arm64, i386. Those happen to also do EFI/SB.

 2. Hardcode the list of modules to be added:
      drm_shmem_helper.ko
      drm_ttm_helper.ko
      drm_vram_helper.ko
      tiny/bochs.ko
      tiny/cirrus.ko
      ttm/ttm.ko
      vboxvideo/vboxvideo.ko [!arm64, i.e. amd64 and i386 only]

 3. For each of these 3 archs, deploy each of these modules. Do that for
    each build that includes drm.ko (which should be synonymous with
    fb-modules being deployed, given drm.ko is mandatory in the common
    fb-modules file, included from the arch-specific ones in src:linux),
    and do that without a condition on GTK detection or /usr/bin/Xorg's
    presence.

This should be targeted enough (touching 3 archs, two of which are getting
a lot of attention; leaving all others entirely untouched), yet generic
enough to work around issues that show up in both text and graphical
versions of the installer, by patching all relevant builds (netboot,
netboot-gtk, those used by debian-cd, etc.).
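
The selection logic of the three steps above can be sketched as follows,
in Python. This is not the actual debian-installer build code (which is
not shown in this thread); extra_modules and the constant names are
hypothetical, and only the module list, the three archs, the arm64
exception and the drm.ko condition come from the plan.

```python
# Modules listed in step 2, in the same order.
EXTRA_MODULES = [
    "drm_shmem_helper.ko",
    "drm_ttm_helper.ko",
    "drm_vram_helper.ko",
    "tiny/bochs.ko",
    "tiny/cirrus.ko",
    "ttm/ttm.ko",
    "vboxvideo/vboxvideo.ko",
]
# Step 1: only the archs that have the graphical installer.
TARGET_ARCHS = {"amd64", "arm64", "i386"}
# Step 2: vboxvideo is not available on arm64.
EXCLUDED_ON = {"vboxvideo/vboxvideo.ko": {"arm64"}}


def extra_modules(arch, build_modules):
    """Return the extra modules to deploy for one build.

    Step 3: the only condition is that the build already ships drm.ko;
    there is no check for GTK or for /usr/bin/Xorg.
    """
    if arch not in TARGET_ARCHS or "drm.ko" not in build_modules:
        return []
    return [
        module for module in EXTRA_MODULES
        if arch not in EXCLUDED_ON.get(module, set())
    ]


print(extra_modules("arm64", {"drm.ko", "drm_kms_helper.ko"}))
```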

I'll push a v2 of my nasty branch once I've performed some clean-up and
some more testing.


Cheers,
-- 
Cyril Brulebois (k...@debian.org)
D-I release manager -- Release team member -- Freelance Consultant




Bug#1036019: debian-installer: Broken X display with QEMU under UEFI with cirrus and std graphics

2023-05-13 Thread Cyril Brulebois
Hi Ben,

Thanks for all those details!

Ben Hutchings  (2023-05-14):
> > 
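> >   +-------------+-----------------+-----------------+-----------------+
> >   |  Graphics   |  Bullseye 11.7  |  Bookworm RC 2  |  Daily builds   |
> >   +-------------+--------+--------+--------+--------+--------+--------+
> >   |             |  BIOS  |  UEFI  |  BIOS  |  UEFI  |  BIOS  |  UEFI  |
> >   +-------------+--------+--------+--------+--------+--------+--------+
> >   |             |   OK   |   OK   |   OK   |  KO-G  |   OK   |  KO-G  |
> >   | -vga std    |   OK   |   OK   |   OK   |  KO-G  |   OK   |  KO-G  |
> >   | -vga cirrus |   OK   |   OK   |   OK   |  KO-S  |   OK   |  KO-S  |
> >   | -vga qxl    |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
> >   | -vga virtio |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
> >   | -vga vmware |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
> >   +-------------+--------+--------+--------+--------+--------+--------+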
> 
> I started testing with QEMU and OVMF from unstable, and I'm instead
> seeing Xorg failing to start in the same cases you see glitches.  The
> relevant error message seems to be this one:
> http://codesearch.debian.net/show?file=xorg-server_2%3A21.1.7-3%2Fhw%2Fxfree86%2Ffbdevhw%2Ffbdevhw.c&line=504

Checking RC 1, I'm seeing OK results for both `-vga std` (or no options)
and `-vga cirrus`. I should note GRUB itself is “text-like” with RC 1,
while it's “graphical” with RC 2.

Reverting the following commit in debian-installer.git and building a
netboot-gtk image against unstable gives me a working graphical
installer with `-vga std` (or no options) and `-vga cirrus`. I didn't
check the rest of the matrix though.
  https://salsa.debian.org/installer-team/debian-installer/-/commit/a4dc8c0fe7ad1a0c1506125ad9985f78819a1bb2

So it looks to me the GRUB config fix uncovered a pre-existing bug, and
the linux version bump (6.1.20-1 → 6.1.20-2) between RC 1 and RC 2 isn't
a factor (xserver-xorg-* udebs didn't change).

Interestingly, switching to the bullseye branch and cherry-picking the
same GRUB config fix there, and rebuilding d-i against current bullseye,
I'm getting exactly the same problem: KO-G for std, KO-S for cirrus!

So it looks like this might be a rather old issue, rather than a
regression during the Bookworm release cycle.


Also, I should note that while my focus was on netboot-gtk mini.iso
(because it's much quicker to rebuild/tweak than a netinst image), I'm
replicating those results with the netinst images:
 - Bullseye has a “text-like” GRUB, all good.
 - Bookworm RC 1 has a “text-like” GRUB, all good.
 - Bookworm RC 2 has a “graphical” GRUB, issues!

> > Questions
> > =========
> > 
> >  - Is it really to be expected that X and standard drivers would regress
> >    this way when moving from Bullseye to Bookworm?
> 
> No.
> 
> >  - Or is it expected to require specific kernel modules while that wasn't
> >    the case before? I've discovered this in VM environments, but maybe
> >    similar things could be happening on bare metal as well, and maybe
> >    some more modules should be considered for inclusion?
> 
> No.
> 
> >  - Is it acceptable to just bundle bochs, cirrus, and vboxvideo for the
> >    time being (i.e. RC 3, RC 4, 12.0.0), be it via the nasty approach
> >    or via a proper linux fb-modules inclusion?
> >  - Or does shipping those few modules risk breaking the kernel and/or X
> >    on other platforms? (I'd definitely hope not!)
> 
> I would not expect so.  They get used on the installed system, so they
> probably work.

Copy all!

Note for a further session: instead of debugging d-i itself, it should
be possible to reproduce those issues in the installed system, by
keeping only a specific list of kernel modules and X drivers. Of course,
that means having GRUB in “graphical” mode as well (a quick check
suggests installing desktop-base, without plymouth*, is sufficient for
that part).

As a very quick experiment, I tried:
 - installing xfce4 and desktop-base;
 - rebooting;
 - X doesn't start directly, one needs to run startxfce4 from the
   console.

Then:
 - manually removing all X drivers except fbdev_drv.so;
 - manually removing both tiny/ drivers (bochs and cirrus);
 - rebuilding the initramfs;
 - rebooting.

This gives me the following:
 - std: black screen, not even seeing a console prompt;
 - cirrus: “garbled/split” screen symptoms in the console, and in X;
 - qxl: all good in the console and in X.

Interestingly, purging desktop-base gets me back to a “text-only” GRUB
prompt, but both std and cirrus are exhibiting “garbled/split” screen
symptoms in the console and in X.

I'll stop here, I just wanted to confirm one could reproduce those
issues within the installed system, which should almost always be a
debug-friendlier environment than d-i…

> > Proposal plan for d-i (Bookworm RC 3, RC 4, and 12.0.0)
> > =======================================================
> > 
> > Unless I received strong negative feedback before Monday (May 15th),
> > I 

Bug#1036019: debian-installer: Broken X display with QEMU under UEFI with cirrus and std graphics

2023-05-13 Thread Ben Hutchings
On Sat, 2023-05-13 at 10:22 +0200, Cyril Brulebois wrote:
[...]
> Kernel-side
> ===========
> 
> The fb-modules udeb hasn't changed much in 4+ years, with some DRM
> modules getting added alongside existing ones, leading to the following
> contents in Bullseye (5.10.178-3):
[...]
> Those contents are defined via those files in linux.git:
> 
> kibi@tokyo:~/debian-kernel/linux.git (sid=)$ cat debian/installer/modules/amd64/fb-modules
> #include 
> 
> vesafb ?
> vga16fb
> 
> kibi@tokyo:~/debian-kernel/linux.git (sid=)$ cat debian/installer/modules/fb-modules
> # We don't include all DRM drivers here as on many platforms we can
> # call system firmware to get hold of a simple framebuffer

To expand on this comment, in the case of UEFI boot the efifb driver
should provide a simple framebuffer, and on BIOS vesafb should do it. 
Those are both built-in on x86, and efifb is also built-in on arm64 and
armhf.


[...]
> X-side
> ======

Both of the kernel drivers are old-style framebuffer drivers so in
Xorg, the appropriate generic driver is "fbdev", not "modesetting".

> Now, we know that the contents of xserver-xorg-core-udeb have changed a
> little between Bullseye and Bookworm (#1035014), but that doesn't seem
> to be a factor here.
> 
> I've tested 3 netboot/gtk/mini.iso to assess the situation:
> 
>  - mini-20210731+deb11u8.iso from Bullseye 11.7
>  - mini-20230427.iso         from D-I Bookworm RC 2
>  - mini-daily.iso            from D-I daily builds (downloaded today)
> 
> If people want to replicate those tests, they're available at:
>   https://people.debian.org/~kibi/bug-drm-vs-uefi/
> 
> Or:
> 
> wget https://deb.debian.org/debian/dists/bullseye/main/installer-amd64/20210731+deb11u8/images/netboot/gtk/mini.iso -O mini-20210731+deb11u8.iso
> wget https://deb.debian.org/debian/dists/bookworm/main/installer-amd64/20230427/images/netboot/gtk/mini.iso -O mini-20230427.iso
> wget https://d-i.debian.org/daily-images/amd64/daily/netboot/gtk/mini.iso -O mini-daily.iso

These all include fbdev_drv.so, and Xorg.log shows that the fbdev
driver is being used.

So I suppose there's a regression in either efifb or fbdev_drv.

> Via QEMU, under BIOS and UEFI, results are:
> 
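>   +-------------+-----------------+-----------------+-----------------+
>   |  Graphics   |  Bullseye 11.7  |  Bookworm RC 2  |  Daily builds   |
>   +-------------+--------+--------+--------+--------+--------+--------+
>   |             |  BIOS  |  UEFI  |  BIOS  |  UEFI  |  BIOS  |  UEFI  |
>   +-------------+--------+--------+--------+--------+--------+--------+
>   |             |   OK   |   OK   |   OK   |  KO-G  |   OK   |  KO-G  |
>   | -vga std    |   OK   |   OK   |   OK   |  KO-G  |   OK   |  KO-G  |
>   | -vga cirrus |   OK   |   OK   |   OK   |  KO-S  |   OK   |  KO-S  |
>   | -vga qxl    |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
>   | -vga virtio |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
>   | -vga vmware |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
>   +-------------+--------+--------+--------+--------+--------+--------+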

I started testing with QEMU and OVMF from unstable, and I'm instead
seeing Xorg failing to start in the same cases you see glitches.  The
relevant error message seems to be this one:
http://codesearch.debian.net/show?file=xorg-server_2%3A21.1.7-3%2Fhw%2Fxfree86%2Ffbdevhw%2Ffbdevhw.c&line=504

[...]
> Questions
> =========
> 
>  - Is it really to be expected that X and standard drivers would regress
>    this way when moving from Bullseye to Bookworm?

No.

>  - Or is it expected to require specific kernel modules while that wasn't
>    the case before? I've discovered this in VM environments, but maybe
>    similar things could be happening on bare metal as well, and maybe
>    some more modules should be considered for inclusion?

No.

>  - Is it acceptable to just bundle bochs, cirrus, and vboxvideo for the
>    time being (i.e. RC 3, RC 4, 12.0.0), be it via the nasty approach
>    or via a proper linux fb-modules inclusion?
>  - Or does shipping those few modules risk breaking the kernel and/or X
>    on other platforms? (I'd definitely hope not!)

I would not expect so.  They get used on the installed system, so they
probably work.



[...]
> Proposal plan for d-i (Bookworm RC 3, RC 4, and 12.0.0)
> =======================================================
> 
> Unless I received strong negative feedback before Monday (May 15th),
> I plan on including the nasty approach in RC 3, and to revert it
> altogether in RC 4 if big bad regressions are reported:
>   https://salsa.debian.org/installer-team/debian-installer/-/commit/9fceca63273d0b501ea64d7b719acafc93a5b7fa
> 
> As a side note, keeping the bundling in src:debian-installer for the
> next few weeks makes us autonomous: we can enable and disable those
> extra modules without requiring a new linux upload… so it's nasty but I
> actually thought about the few advantages we were getting out of 

Bug#1036019: debian-installer: Broken X display with QEMU under UEFI with cirrus and std graphics

2023-05-13 Thread Cyril Brulebois
Package: debian-installer
Version: 20230427
Severity: important
X-Debbugs-Cc: debian-...@lists.debian.org, debian-ker...@lists.debian.org, 
debia...@lists.debian.org

Hi everyone,

I'm reaching out to all the aforementioned teams because I know nothing
about UEFI, kernel-side DRM modules, or X drivers, and I'd like to get
some feedback here.

If you need a TL;DR, you can skip to “Proposal plan for d-i”, which is
about my plans for the very next few hours, unless someone tells me the
proposal is crazy, unsafe, etc.


Backstory
=========

Since we've been hitting and/or (re)discovering UEFI-specific issues
lately (#1033913), I decided to spend some time extending my usual
tests, traditionally run under QEMU with default settings, therefore
booted under BIOS, to also run them under UEFI (meaning also testing
Secure Boot without having to switch to baremetal).

I've been kindly pointed by regular image testers to the following page:
  https://wiki.debian.org/SecureBoot/VirtualMachine

But I was a little shocked to discover a broken X display when booting
under UEFI! It seems I'm not the only one since that page has the
following, even if there are no references to any bug reports:

-vga virtio - The Debian installer seems to have difficulties
  working with the standard VGA driver (and virtio
  should anyway have better performance) 

The test setup is described at the very end of this report, with my
current test target being specifically netboot/gtk/mini.iso for amd64.


Kernel-side
===========

The fb-modules udeb hasn't changed much in 4+ years, with some DRM
modules getting added alongside existing ones, leading to the following
contents in Bullseye (5.10.178-3):

./lib/modules/5.10.0-22-amd64/kernel/drivers/gpu/drm/drm_kms_helper.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/gpu/drm/drm.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/gpu/drm/virtio/virtio-gpu.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/media/cec/core/cec.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/video/fbdev/vga16fb.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/video/vgastate.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/virtio/virtio_dma_buf.ko

and the following contents in Bookworm (6.1.27-1):

./lib/modules/6.1.0-9-amd64/kernel/drivers/gpu/drm/drm_kms_helper.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/gpu/drm/drm.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/gpu/drm/drm_shmem_helper.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/gpu/drm/virtio/virtio-gpu.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/video/fbdev/vga16fb.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/video/vgastate.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/virtio/virtio_dma_buf.ko
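
The difference between the two listings is easier to see once the
versioned directory prefixes are ignored. Here is a minimal Python sketch
doing that by comparing file names only; the helper name basenames is
made up for this example, and the paths are copied from the two listings
above.

```python
import posixpath

BULLSEYE = """\
./lib/modules/5.10.0-22-amd64/kernel/drivers/gpu/drm/drm_kms_helper.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/gpu/drm/drm.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/gpu/drm/virtio/virtio-gpu.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/media/cec/core/cec.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/video/fbdev/vga16fb.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/video/vgastate.ko
./lib/modules/5.10.0-22-amd64/kernel/drivers/virtio/virtio_dma_buf.ko
"""

BOOKWORM = """\
./lib/modules/6.1.0-9-amd64/kernel/drivers/gpu/drm/drm_kms_helper.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/gpu/drm/drm.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/gpu/drm/drm_shmem_helper.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/gpu/drm/virtio/virtio-gpu.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/video/fbdev/vga16fb.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/video/vgastate.ko
./lib/modules/6.1.0-9-amd64/kernel/drivers/virtio/virtio_dma_buf.ko
"""


def basenames(listing):
    """Return the set of module file names, ignoring their directories."""
    return {posixpath.basename(line) for line in listing.split()}


removed = basenames(BULLSEYE) - basenames(BOOKWORM)
added = basenames(BOOKWORM) - basenames(BULLSEYE)
print("removed:", sorted(removed))
print("added:", sorted(added))
```

So between the two releases cec.ko went away and drm_shmem_helper.ko came
in; neither listing contains bochs.ko or cirrus.ko.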

Those contents are defined via those files in linux.git:

kibi@tokyo:~/debian-kernel/linux.git (sid=)$ cat debian/installer/modules/amd64/fb-modules
#include 

vesafb ?
vga16fb

kibi@tokyo:~/debian-kernel/linux.git (sid=)$ cat debian/installer/modules/fb-modules
# We don't include all DRM drivers here as on many platforms we can
# call system firmware to get hold of a simple framebuffer

drm
drm_kms_helper
virtio-gpu ?
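
For readers not familiar with these module lists, here is a rough Python
sketch of how such a file can be read. It is not kernel-wedge itself;
parse_module_list is a hypothetical helper, and it relies on the
understanding that a trailing "?" marks a module as optional (the udeb
build does not fail if it is missing). Lines starting with "#" are simply
skipped, so include directives are not resolved here.

```python
COMMON_FB_MODULES = """\
# We don't include all DRM drivers here as on many platforms we can
# call system firmware to get hold of a simple framebuffer

drm
drm_kms_helper
virtio-gpu ?
"""


def parse_module_list(text):
    """Return a list of (module_name, optional) pairs."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and (unresolved) #include lines.
        if not line or line.startswith("#"):
            continue
        name, _, flag = line.partition(" ")
        entries.append((name, flag.strip() == "?"))
    return entries


for name, optional in parse_module_list(COMMON_FB_MODULES):
    print(name, "(optional)" if optional else "(mandatory)")
```

Read this way, drm is mandatory in the common file while virtio-gpu is
optional, and no driver for the QEMU std or cirrus devices is listed.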


X-side
======

Now, we know that the contents of xserver-xorg-core-udeb have changed a
little between Bullseye and Bookworm (#1035014), but that doesn't seem
to be a factor here.

I've tested 3 netboot/gtk/mini.iso to assess the situation:

 - mini-20210731+deb11u8.iso from Bullseye 11.7
 - mini-20230427.iso         from D-I Bookworm RC 2
 - mini-daily.iso            from D-I daily builds (downloaded today)

If people want to replicate those tests, they're available at:
  https://people.debian.org/~kibi/bug-drm-vs-uefi/

Or:

wget https://deb.debian.org/debian/dists/bullseye/main/installer-amd64/20210731+deb11u8/images/netboot/gtk/mini.iso -O mini-20210731+deb11u8.iso
wget https://deb.debian.org/debian/dists/bookworm/main/installer-amd64/20230427/images/netboot/gtk/mini.iso -O mini-20230427.iso
wget https://d-i.debian.org/daily-images/amd64/daily/netboot/gtk/mini.iso -O mini-daily.iso


Via QEMU, under BIOS and UEFI, results are:

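  +-------------+-----------------+-----------------+-----------------+
  |  Graphics   |  Bullseye 11.7  |  Bookworm RC 2  |  Daily builds   |
  +-------------+--------+--------+--------+--------+--------+--------+
  |             |  BIOS  |  UEFI  |  BIOS  |  UEFI  |  BIOS  |  UEFI  |
  +-------------+--------+--------+--------+--------+--------+--------+
  |             |   OK   |   OK   |   OK   |  KO-G  |   OK   |  KO-G  |
  | -vga std    |   OK   |   OK   |   OK   |  KO-G  |   OK   |  KO-G  |
  | -vga cirrus |   OK   |   OK   |   OK   |  KO-S  |   OK   |  KO-S  |
  | -vga qxl    |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |
  | -vga virtio |   OK   |   OK   |   OK   |   OK   |   OK   |   OK   |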
  | -vga vmware |   OK   |   OK   |   OK   |   OK   |   OK