On Tue, Nov 30, 2021 at 08:18:56PM +0100, Ben Hutchings wrote:
> On Tue, 2021-11-30 at 11:01 +0100, Fabian Grünbichler wrote:
> [...]
> > possibly interesting in that context (I asked/posted the link in 
> > #debian-kernel a few days ago as well) - these BTF sections now actually 
> > reference the BTF info in the kernel image itself (as part of the 
> > deduplication of shared information), which makes the latter part of the 
> > ABI, and AFAICT this is not (yet?) tracked in Debian..
> > 
> > https://lore.kernel.org/all/1637926692.uyvrkty41j.astr...@nora.none/
> > 
> > an otherwise ABI compatible kernel upgrade thus has the potential to 
> > break module loading altogether, and I'd recommend disabling the split 
> > BTF feature for the time being unless you plan on bumping ABI for every 
> > kernel update anyway.
> 
> Yes, that is interesting/concerning.
> 
> If we continue to not bump the ABI number on every update, then I
> think:
> 
> 1. In-tree modules should not be loadable between an upgrade and a
> reboot.  (This can happen already for specific modules, due to symbol
> version changes that we think don't affect out-of-tree modules.) 
> Alternatively, they could still be loadable but then their BTF info
> should be completely discarded.
> 
> 2. Out-of-tree modules should be built without BTF deduplication, or
> without BTF info.
> 
> The main reason for not bumping the ABI number every time is to avoid
> forcing an unnecessary rebuild of out-of-tree modules.  We could try
> switching to something like RHEL's "weak-update" mechanism where ABI-
> compatible out-of-tree modules are automatically linked into a new
> version's modules directory without rebuilding them.  In that case we
> would still need to implement item (2) above.

FWIW, I ran into this issue for real on a Sid system:

booted kernel: Linux host 5.15.0-2-amd64 #1 SMP Debian 5.15.5-1 (2021-11-26) x86_64 GNU/Linux
installed kernel: ii  linux-image-5.15.0-2-amd64 5.15.5-2     amd64        Linux 5.15 for 64-bit PCs (signed)

attempting to (auto-)load any module not already loaded before the
upgrade:

Dec 26 17:18:48 host mtp-probe[319902]: checking bus 4, device 3: "/sys/devices/pci0000:00/0000:00:01.2/0000:02:00.0/0000:03:08.0/0000:05:00.3/usb4/4-4"
Dec 26 17:18:48 host mtp-probe[319902]: bus: 4, device: 3 was not an MTP device
Dec 26 17:18:49 host kernel: scsi 3:0:0:0: Direct-Access              Multi-Reader  -0 1.00 PQ: 0 ANSI: 6
Dec 26 17:18:49 host kernel: scsi 3:0:0:1: Direct-Access              Multi-Reader  -1 1.00 PQ: 0 ANSI: 6
Dec 26 17:18:49 host kernel: scsi 3:0:0:2: Direct-Access              Multi-Reader  -2 1.00 PQ: 0 ANSI: 6
Dec 26 17:18:49 host kernel: scsi 3:0:0:3: Direct-Access              Multi-Reader  -3 1.00 PQ: 0 ANSI: 6
Dec 26 17:18:49 host kernel: scsi 3:0:0:0: Attached scsi generic sg0 type 0
Dec 26 17:18:49 host kernel: scsi 3:0:0:1: Attached scsi generic sg1 type 0
Dec 26 17:18:49 host kernel: scsi 3:0:0:2: Attached scsi generic sg2 type 0
Dec 26 17:18:49 host kernel: scsi 3:0:0:3: Attached scsi generic sg3 type 0
Dec 26 17:18:49 host kernel: BPF:[86226] ENUM T_CONDITION_MET
Dec 26 17:18:49 host kernel: BPF:size=4 vlen=11
Dec 26 17:18:49 host kernel: BPF:
Dec 26 17:18:49 host kernel: BPF:Invalid name
Dec 26 17:18:49 host kernel: BPF:
Dec 26 17:18:49 host kernel: failed to validate module [sd_mod] BTF: -22

module loading fails until the booted and on-disk kernel images match
again - either by downgrading the latter (to 5.15.5-1 in this case), or
by rebooting into the new kernel.
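
for anyone who wants to spot this state before it bites, here is a rough
sketch in Python. it compares the Debian package version embedded in the
running kernel's `uname -v` string with the version column of the
matching `dpkg -l` line. the function names and the parsing are my own
and assume the output formats shown above; a mismatch only hints at a
problem when split BTF is enabled.

```python
import re


def booted_package_version(uname_v):
    """Extract the Debian package version from a `uname -v` style string."""
    # Assumes the "... Debian <version> (<date>)" layout shown above.
    match = re.search(r"\bDebian\s+(\S+)", uname_v)
    return match.group(1) if match else None


def installed_package_version(dpkg_line):
    """Extract the version column from a single `dpkg -l` style line."""
    # Columns: status, package name, version, architecture, description...
    fields = dpkg_line.split()
    return fields[2] if len(fields) >= 3 else None


def module_loading_at_risk(uname_v, dpkg_line):
    """True if booted and installed package versions are known and differ."""
    booted = booted_package_version(uname_v)
    installed = installed_package_version(dpkg_line)
    # Unparseable input is reported as "not known to be at risk".
    return booted is not None and installed is not None and booted != installed


# Values copied from the system described above.
uname_v = "#1 SMP Debian 5.15.5-1 (2021-11-26)"
dpkg_line = (
    "ii  linux-image-5.15.0-2-amd64 5.15.5-2     amd64        "
    "Linux 5.15 for 64-bit PCs (signed)"
)
print(module_loading_at_risk(uname_v, dpkg_line))
```

on a live system the two strings would come from `uname -v` and
`dpkg -l linux-image-$(uname -r)`.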

note that just disabling the relevant Kconfig option doesn't work in my
experience, since it is automatically enabled again whenever a split-BTF
capable pahole version is present in the build environment. patching
the option's default value to 'n' does work though[0].

0: https://git.proxmox.com/?p=pve-kernel.git;a=commitdiff;h=bc1d1913898940cabcea142f75a2a4759790a503
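
to double-check which way a given build actually went, something like
the following sketch can be pointed at the generated config (e.g. the
/boot/config-* file of the built kernel). I'm assuming here that the
option in question is CONFIG_DEBUG_INFO_BTF_MODULES; the helper name and
the sample config text are made up for illustration.

```python
def config_value(config_text, option):
    """Return the value of a kernel config option, 'n' if explicitly
    unset, or None if the option does not appear at all."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


# Assumption: this is the option controlling split BTF for modules.
SPLIT_BTF_OPTION = "CONFIG_DEBUG_INFO_BTF_MODULES"

# Hypothetical config excerpt, not taken from a real build.
sample = """\
CONFIG_DEBUG_INFO_BTF=y
CONFIG_PAHOLE_HAS_SPLIT_BTF=y
CONFIG_DEBUG_INFO_BTF_MODULES=y
"""
print(config_value(sample, SPLIT_BTF_OPTION))
```

anything other than "y" for that option would suggest the modules were
built without split BTF.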
