Hannu Vuolasaho <[email protected]> writes:
> On Thu, 4 Dec 2025 at 0:16, Dave Voutila ([email protected]) wrote:
>>
>> Hannu Vuolasaho <[email protected]> writes:
>>
>> > On Wed, 3 Dec 2025 at 1:42, Dave Voutila ([email protected]) wrote:
>> >
>> > Hannu Vuolasaho <[email protected]> writes:
>> >
>> > >
>> > > So this is probably something processor specific.
>> > >
>> > > I haven't started goofing around the kernel yet but if someone has some
>> > > good ideas, let me know.
>> >
>> > I have zero familiarity with what Linux kernel Arch builds/ships in its
>> > distro. Do other distros like Alpine work?
>> >
>> > Yesterday I would have said yes, but today Alpine made a new release
>> > and it fails for the same reason: vmd: invalid fault_type 2
>> > Unfortunately it doesn't print anything, it just dies. Something more on
>> > the kernel parameter line is needed for a more verbose boot.
>> >
>>
>> Any chance you have an Intel-cpu system to try it on as well?
>
> Boots on an APU2 running 7.8-stable: hw.model=AMD GX-412TC SOC
> and on a Thinkpad L450 running a -current less than a month old:
> hw.model=Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
>

Ok, I've narrowed this down and it is limited to Zen CPUs. Now that I'm
back in front of my own machine, I was able to debug it and see that the
guest is reading some bytes via a MOV from a kernel virtual address
mapped to 0xfed803c0.

A commit[1] to Linux introduced this read from 0xfed803c0, which
triggers the unfinished MMIO emulation paths in vmd(8).

The commit message for posterity:

~~~~
x86/CPU/AMD: Print the reason for the last reset

The following register contains bits that indicate the cause for the
previous reset.

    PMx000000C0 (FCH::PM::S5_RESET_STATUS)

This is useful for debug. The reasons for reset are broken into 6 high level
categories. Decode it by category and print during boot.

Specifics within a category are split off into debugging documentation.

The register is accessed indirectly through a "PM" port in the FCH. Use
MMIO access in order to avoid restrictions with legacy port access.

Use a late_initcall() to ensure that MMIO has been set up before trying to
access the register.

This register was introduced with AMD Family 17h, so avoid access on older
families. There is no CPUID feature bit for this register.
~~~~

So that's cool! There's no CPUID bit to say whether this thing exists;
the check just keys off the Zen feature bit.

This is rather odd... even under Linux/KVM this means that if they
emulate ACPI states they'd now need to emulate this register. /shrug
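
As a sanity check, the faulting address 0xfed803c0 is consistent with
the register named in the commit message. A minimal sketch, assuming the
FCH "PM" register block is memory-mapped at 0xfed80300 (believed to
match what Linux calls FCH_PM_BASE, but unverified here) and that
PMx000000C0 means offset 0xc0 into that block. The names below are made
up for illustration and are not taken from Linux or vmd(8):

```c
#include <stdint.h>

/* Assumption: base of the AMD FCH "PM" register block in MMIO space. */
#define PM_MMIO_BASE		UINT32_C(0xfed80300)

/* From the commit message: PMx000000C0 (FCH::PM::S5_RESET_STATUS). */
#define PM_S5_RESET_STATUS	UINT32_C(0xc0)

/* Physical address of the PM register at the given offset. */
static uint32_t
pm_reg_addr(uint32_t offset)
{
	return PM_MMIO_BASE + offset;
}
```

Under that assumption, pm_reg_addr(PM_S5_RESET_STATUS) comes out to
0xfed803c0, the same address as the read that vmd(8) trips over.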

There are two "correct" ways to approach this that don't compromise
vmm(4)/vmd(8):

1. finish MMIO emulation -- this is taking some time, but it's the
   right way to handle this
2. find a way to configure some guest physical memory in that range
   that might trigger Linux's ioremap() to fail and short circuit
   before the read occurs.

I will look into (2), but for now I'd say use older Linux kernels or
(according to ChatGPT) you *might* be able to use a boot arg that
disables this special "init" function:

  initcall_blacklist=print_s5_reset_status_mmio

I haven't tested that, but I'm curious whether it works! It does
presuppose you can pass a boot arg... and not all bootloaders let you
do so.

-dv

[1] https://github.com/torvalds/linux/commit/ab8131028710d009ab93d6bffd2a2749ade909b0.patch