On Fri, Jan 9, 2026 at 5:34 AM Thomas Zimmermann <[email protected]> wrote:
>
> Hi
>
> On 29.12.25 at 22:58, Zack Rusin wrote:
> > Almost a rite of passage for every DRM developer and most Linux users
> > is upgrading your DRM driver/updating boot flags/changing some config
> > and having DRM driver fail at probe resulting in a blank screen.
> >
> > Currently there's no way to recover from DRM driver probe failure. PCI
> > DRM drivers explicitly throw out the existing sysfb to get exclusive
> > access to PCI resources so if the probe fails the system is left without
> > a functioning display driver.
> >
> > Add code to sysfb to recover system framebuffer when DRM driver's probe
> > fails. This means that a DRM driver that fails to load reloads the system
> > framebuffer driver.
> >
> > This works best with simpledrm. Without it Xorg won't recover because
> > it still tries to load the vendor specific driver which ends up usually
> > not working at all. With simpledrm the system recovers really nicely
> > ending up with a working console and not a blank screen.
> >
> > There's a caveat in that some hardware might require some special magic
> > register write to recover EFI display. I'd appreciate it a lot if
> > maintainers could introduce a temporary failure in their driver's
> > probe to validate that the sysfb recovers and they get a working console.
> > The easiest way to double check it is by adding:
> >         /* XXX: Temporary failure to test sysfb restore - REMOVE BEFORE COMMIT */
> >         dev_info(&pdev->dev, "Testing sysfb restore: forcing probe failure\n");
> >         ret = -EINVAL;
> >         goto out_error;
> > or such right after the devm_aperture_remove_conflicting_pci_devices() call.
>
> Recovering the display like that is guesswork and will at best work
> with simple discrete devices where the framebuffer is always located in
> a confined graphics aperture.
>
> But the problem you're trying to solve is a real one.
>
> What we'd want to do instead is to take the initial hardware state into
> account when we do the initial mode-setting operation.
>
> The first step is to move each driver's remove_conflicting_devices call
> to the latest possible location in the probe function. We usually do it
> first, because that's easy. But on most hardware, it could happen much
> later.

Well, some drivers (vbox, vmwgfx, bochs and cirrus-qemu) do it first
because they request PCI regions, and that request is going to fail
otherwise. Grabbing the PCI resources is in general the very first thing
those drivers need to do to set anything up, so we call
remove_conflicting_devices first, or at least very early.

I also don't think it's possible, or even desirable, for some drivers to
reuse the initial state. A good example here is vmwgfx: by default some
people will set up their VMs with e.g. 8 MB of VRAM. When vmwgfx loads,
we allow scanning out from system memory, so you can set your VM up with
8 MB of VRAM but still use 4K resolutions once the driver loads. This
way the suspend size of the VM is very predictable (tiny VRAM plus
whatever RAM was set up) while still allowing a lot of flexibility.

In general, however this ends up being planned, I think it's two or
three separate series:
1) infrastructure to reload the sysfb driver (what this series is),
2) making sure that drivers that do want to recover cleanly actually
   clean up all of their state properly on exit,
3) abstracting at least some of that cleanup in some driver-independent
   way.

z
