TL;DR: I found a bug in a strange interaction between kexec_file_load (but not 
kexec_load) and i915
TL;DR#2: The second (sometimes third or fourth) kexec (using kexec_file_load) 
fails on my particular hardware
TL;DR#3: I did 55 experiments, each of which required a lot of boots; in total 
I did 1908 boots

Okay, so I found a bug. Steps to reproduce:
- I have a Dell Precision 7780
- I have recent Debian x86_64 sid installed (the bug is reproducible with both 
Debian kernels and mainline ones)
- The bug is reproducible on many kernels, including very recent ones, for 
example 6.15.4
- Boot the system, then kexec into the same system using kexec_file_load, i.e. 
pass --kexec-file-syscall to the "kexec" command
- Then kexec from this kexec'ed system again (i.e. you should do two kexecs in 
a row)
- Then do a 3rd kexec, etc.
- Repeat kexecs until you have done 100 kexecs or your system starts to 
misbehave (see the command sketch right after this list)
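
Roughly, each iteration looks like this (paths are examples for my Debian 
setup; adjust them as needed):

# Load the currently running kernel and initramfs via kexec_file_load
# (the --kexec-file-syscall flag), reusing the current kernel command line:
kexec --kexec-file-syscall -l "/boot/vmlinuz-$(uname -r)" \
      --initrd="/boot/initrd.img-$(uname -r)" --reuse-cmdline
# Then jump into the loaded kernel:
kexec -e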

On my computer the system starts to misbehave after some number of kexecs. 
This never happens on the very first kexec: the first kexec is always 
successful, but the second sometimes is not.
I was never able to perform 100 kexecs in a row.
At some kexec attempt the system starts to misbehave: oopses, panics, a 
locked-up system, etc.

Notes:

- I tried to bisect the "kexec-tools" package, but the bisect merely gave me 
the commit that switched to kexec_file_load as the default.
The bug is reproducible if we use kexec_file_load, but does not reproduce if 
we use kexec_load.
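
To make the distinction explicit, the only difference is which syscall 
kexec-tools uses to load the kernel (flag names as in recent kexec-tools, see 
kexec(8); paths are examples):

# kexec_file_load(2) path: bug reproduces
kexec --kexec-file-syscall -l "/boot/vmlinuz-$(uname -r)" \
      --initrd="/boot/initrd.img-$(uname -r)" --reuse-cmdline
# kexec_load(2) path: bug does not reproduce
kexec --kexec-syscall -l "/boot/vmlinuz-$(uname -r)" \
      --initrd="/boot/initrd.img-$(uname -r)" --reuse-cmdline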

- The bug is reproducible even if we boot with init=/bin/bash (note: the 
initramfs is still part of the boot process in this case). If we boot to the 
normal GUI, the bug is reproducible, too.

- When reproducing, I use this kernel command line: "root=UUID=... 
rootflags=subvol=... ro init=..."

- Debian package "plymouth" is required for reproducing. (It reproduces with 
plymouth, but doesn't reproduce without plymouth.) But note that I never see 
actual plymouth screen! I. e. presence of
"plymouth" on the system somehow affects bug reproduciblity despite plymouth 
animation never actually shown. I don't know why this happens, but I suspect 
that I don't pass "splash" to kernel command line, and thus don't see plymouth 
screen. But I suspect that plymouth is still included to initramfs and from 
there somehow affects boot process
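
A quick way to confirm that plymouth really ends up inside the initramfs 
(assuming Debian's initramfs-tools and its lsinitramfs tool):

lsinitramfs "/boot/initrd.img-$(uname -r)" | grep -i plymouth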

- The bug reproduces on Debian, but does not reproduce on Ubuntu. After a lot 
of experimenting I finally understood why: the Ubuntu kernel has 
CONFIG_INTEL_IOMMU_DEFAULT_ON=y and the Debian kernel has not. Additional 
experiments confirmed that this is the culprit, i.e. the bug is reproducible 
with CONFIG_INTEL_IOMMU_DEFAULT_ON=n and not reproducible with 
CONFIG_INTEL_IOMMU_DEFAULT_ON=y. (So my advice for distributions: do what 
Ubuntu does, i.e. set CONFIG_INTEL_IOMMU_DEFAULT_ON=y, to hide this bug.)
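
As far as I know, CONFIG_INTEL_IOMMU_DEFAULT_ON=y is equivalent to booting 
with intel_iommu=on, so on a stock Debian kernel the bug should presumably 
also be hideable from the command line, e.g. (I have not re-verified this 
exact variant):

root=UUID=... rootflags=subvol=... ro init=... intel_iommu=on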

- The bug is not reproducible in old enough kernels, so I bisected Linux. The 
bisect pointed me at these commits: d4a2393049..4a75f32fc7. I.e. the bug is 
reproducible in 4a75f32fc7, but does not reproduce in d4a2393049. Between them 
there is a middle commit 52407c220c44c8dcc6a, which is not testable. Here are 
these commits:

commit 4a75f32fc783128d0c42ef73fa62a20379a66828
Author: Anusha Srivatsa <anusha.sriva...@intel.com>

   drm/i915/rpl-s: Add PCH Support for Raptor Lake S

commit 52407c220c44c8dcc6aa8aa35ffc8a2db3c849a9
Author: Anusha Srivatsa <anusha.sriva...@intel.com>

   drm/i915/rpl-s: Add PCI IDS for Raptor Lake S

It seems these commits merely added support for my Intel GPU model, so this is 
a "fake" regression. I'm not sure it should be treated as a proper regression 
and whether regzbot should be notified. (What do you think?)

Still, formally this is a regression: my experiments show that the bug is 
present in 4a75f32fc783128d0c42 and not present before. (Side note: in the 
latest kernels both Wayland and X11 work; in d4a2393049, X11 works and Wayland 
doesn't.)

I tried to reproduce the bug in QEMU, but was unable to do so. It seems an 
Intel GPU is required, maybe even my particular model.

Here is "lspci -vnn -d :*:0300" for my GPU:

00:02.0 VGA compatible controller [0300]: Intel Corporation Raptor Lake-S UHD 
Graphics [8086:a788] (rev 04) (prog-if 00 [VGA controller])
        Subsystem: Dell Raptor Lake-S UHD Graphics [1028:0c42]
        Flags: bus master, fast devsel, latency 0, IRQ 202, IOMMU group 0
        Memory at 604b000000 (64-bit, non-prefetchable) [size=16M]
        Memory at 4000000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 3000 [size=64]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [100] Process Address Space ID (PASID)
        Capabilities: [200] Address Translation Service (ATS)
        Capabilities: [300] Page Request Interface (PRI)
        Capabilities: [320] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: i915
        Kernel modules: i915

dmidecode:
https://zerobin.net/?aebea072b93d8122#z4W9URnV+k9ZZErhP4etQkxlfpyRKf++uKMNoO5PGjs=

- I use "root=UUID=... rootflags=subvol=... ro init=..." as a command line for 
reproducing. If I add "recovery nomodeset dis_ucode_ldr" (this is options used 
by Ubuntu in recovery mode), the bug stops to reproduce

Again, in short, the full list of things required for successful reproducing:
- Intel GPU, possibly my particular model
- A kernel with support for my model (4a75f32fc783128d0c42 and later, up to 
6.15.4)
- Kexec at least two times (a single kexec never fails; 100 kexecs in a row 
never succeed)
- kexec_file_load as opposed to kexec_load
- An initramfs
- Absence of the parameters "recovery nomodeset dis_ucode_ldr" (i.e. at least 
one of them stops the reproduction)
- plymouth
- CONFIG_INTEL_IOMMU_DEFAULT_ON=n

Removing ANY of these stops the bug, and I verified this with lots of 
experiments.

In total I did 55+ experiments, each of which required up to 100 boots. In 
total I did 1908 (!) kexec boots on my physical laptop. No, I'm not making 
this number up; here are my actual directories with results:
user@subvolume:~$ ls /rbt/kx-results/
@rec-2025-06-29T201723Z-bad-4    @rec-2025-06-29T214650Z-good-60  
@rec-2025-07-03T050626Z-bad-41    @rec-2025-07-03T104125Z-bad-28    
@rec-2025-07-03T133705Z-bad-3
@rec-2025-06-29T203429Z-good-60  @rec-2025-06-29T215558Z-bad-8    
@rec-2025-07-03T060107Z-good-100  @rec-2025-07-03T111727Z-bad-13    
@rec-2025-07-03T141647Z-good-100
@rec-2025-06-29T205626Z-good-60  @rec-2025-07-01T042949Z-bad-12   
@rec-2025-07-03T074810Z-good-100  @rec-2025-07-03T122242Z-good-100  
@rec-2025-07-03T145705Z-good-100
@rec-2025-06-29T211612Z-bad-6    @rec-2025-07-02T120101Z-good-60  
@rec-2025-07-03T082914Z-good-100  @rec-2025-07-03T123958Z-bad-12    
@rec-2025-07-03T152406Z-bad-50
@rec-2025-06-29T212932Z-good-60  @rec-2025-07-03T031038Z-good-60  
@rec-2025-07-03T100615Z-good-100  @rec-2025-07-03T132116Z-good-100  
@rec-2025-07-03T154204Z-bad-15
user@subvolume:~$ ls /rbt/kx-manual-testing/
2025-07-01-03-19-good-6  2025-07-01-03-56-good-4  2025-07-01-05-28-bad-3  
2025-07-01-06-35-bad-2  2025-07-01-09-46-good-8
2025-07-01-03-44-good-3  2025-07-01-04-47-good-3  2025-07-01-06-19-bad-2  
2025-07-01-09-21-bad-2  2025-07-02-13-09-good
user@subvolume:~$ ls /rbt/kx-vanilla-results/
2025-06-30T005219Z_5.16.0-kx-df0cc57e057f18e4-3e17eec5ff024b63_1626_good_60     
 2025-06-30T023542Z_5.16.0-rc2-kx-87bb2a410dcfb617-9f30253daecd39e5_1663_bad_4
2025-06-30T012313Z_5.17.0-kx-f443e374ae131c16-91b07dce12a83fab_1674_bad_1       
 2025-06-30T032312Z_5.16.0-rc2-kx-c9ee950a2ca55ea0-854a1f40ce042801_1662_bad_6
2025-06-30T013555Z_5.16.0-kx-22ef12195e13c5ec-9aaf880b25942f2a_1668_bad_7       
 2025-06-30T033528Z_5.16.0-rc2-kx-ba884a411700dc56-854a1f40ce042801_1662_good_60
2025-06-30T014106Z_5.16.0-kx-9bcbf894b6872216-b828905f3cf12050_1664_bad_2       
 2025-06-30T034645Z_5.16.0-rc2-kx-d4a23930490df39f-854a1f40ce042801_1662_good_60
2025-06-30T014634Z_5.16.0-rc5-kx-cb6846fbb83b574c-83e7c6cf2ede57b4_1663_bad_6   
 2025-06-30T035232Z_5.16.0-rc2-kx-4a75f32fc783128d-854a1f40ce042801_1662_bad_5
2025-06-30T015713Z_5.16.0-rc2-kx-15bb79910fe734ad-9f30253daecd39e5_1663_good_60 
 2025-06-30T042058Z_5.16.0-rc2-kx-4a75f32fc783128d-854a1f40ce042801_1662_bad_1
2025-06-30T020235Z_5.16.0-rc5-kx-b06103b5325364e0-26176b9b704a5c24_1664_bad_6   
 2025-06-30T050000Z_6.15.4-kx-e60eb441596d1c70-2378f4efc5e956e5_2366_bad_2
2025-06-30T020717Z_5.16.0-rc5-kx-eacef9fd61dcf5ea-26176b9b704a5c24_1664_bad_1   
 2025-06-30T053011Z_6.15.4-kx-e60eb441596d1c70-2378f4efc5e956e5_2366_good_60
2025-06-30T021738Z_5.16.0-rc2-kx-67b858dd89932086-8d2f1d17f1e1933c_1662_good_60 
 2025-06-30T060619Z_5.16.0-rc2-kx-d4a23930490df39f-854a1f40ce042801_1662_good_60
2025-06-30T022759Z_5.16.0-rc2-kx-17815f624a90579a-854a1f40ce042801_1662_good_60 
 2025-06-30T061448Z_5.16.0-rc2-kx-4a75f32fc783128d-854a1f40ce042801_1662_bad_1

The number at the end of each file/directory name is the number of boots. In 
total this gives 1908 boots. Testing was mostly automated, using my script 
(see a rough sketch below).
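
The general shape of the automation is roughly this (a simplified sketch, not 
my actual script; the real one is in the results archive linked below):

#!/bin/sh
# Runs once per boot (e.g. from a systemd unit), counts boots, saves dmesg,
# and kexecs again until 100 boots are reached or the machine misbehaves.
set -eu
count_file=/var/local/kexec-count
count=$(( $(cat "$count_file" 2>/dev/null || echo 0) + 1 ))
echo "$count" > "$count_file"
dmesg > "/var/local/dmesg-$count.txt"
[ "$count" -ge 100 ] && exit 0
kexec --kexec-file-syscall -l "/boot/vmlinuz-$(uname -r)" \
      --initrd="/boot/initrd.img-$(uname -r)" --reuse-cmdline
kexec -e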

Here is one example dmesg from mainline commit e60eb441596d1c70 (somewhere 
around 6.15.4):

https://zerobin.net/?119ff118fd47b363#BpziYs6dNz5PaT7H8w2hlveoEYa4DDtITGkyd9o57LE=

This was the dmesg from the 2nd (and at the same time last) boot. The next 
boot (i.e. kexec) was unsuccessful. The corresponding config:

https://zerobin.net/?009c807e1df41af8#gnmrswlbaFbdPTuzNq6NFkQd/Jhb3Ds0ZlLiwNanXnc=

If you want the results from all the experiments, here is a link: 
https://filebin.net/45g2757b2iwaeen7 (1 MB, expires after 7 days). Usually the 
experiments come with a full reproducer script.

But what I described above is already enough; I think this link is not needed.

I will be available for testing in the coming days; then I will switch to 
other things and so will no longer be available for testing.
If you want more time, then please ask for it, i.e. tell me something like 
"Please be available for testing for 10 more days".

--
Askar Safin
https://types.pl/@safinaskar
