On Wed, Oct 12, 2022 at 11:33 AM Mike Larkin <[email protected]> wrote:
>
> On Wed, Sep 28, 2022 at 10:38:43AM +0200, Sebastian Oswald wrote:
> > On Tue, 27 Sep 2022 08:03:59 -0700
> > Mike Larkin <[email protected]> wrote:
> >
> > >On Tue, Sep 27, 2022 at 11:02:50AM +0200, Sebastian Oswald wrote:
> > >> On Mon, 26 Sep 2022 17:57:23 -0700
> > >> Mike Larkin <[email protected]> wrote:
> > >>
> > >> >On Mon, Sep 26, 2022 at 05:40:04PM +0200, Sebastian Oswald wrote:
> > >> >> >Synopsis: High interrupt load from acpi0 on Intel N5105 platform
> > >> >> >Category: system
> > >> >> >Environment:
> > >> >> System : OpenBSD 7.1
> > >> >> Details : OpenBSD 7.1 (GENERIC.MP) #465: Mon Apr 11
> > >> >> 18:03:57 MDT 2022
> > >> >> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > >> >>
> > >> >> Architecture: OpenBSD.amd64
> > >> >> Machine : amd64
> > >> >>
> > >> >> >Description:
> > >> >> On multiple (3), freshly installed systems based on Jasper Lake
> > >> >> Celeron N5105 platform, CPU0 has high interrupt rate at idle.
> > >> >>
> > >> >> >How-To-Repeat:
> > >> >> Installed 7.1 from current usb image, reboot.
> > >> >>
> > >> >> # top | head -n6
> > >> >> load averages: 0.99, 0.97, 0.92    a-vpn1.gassner.lan 17:38:58
> > >> >> 26 processes: 25 idle, 1 on processor    up 8:01
> > >> >> CPU0 states: 0.0% user, 0.0% nice, 14.3% sys, 0.5% spin, 77.0%
> > >> >> intr, 8.3% idle
> > >> >> CPU1 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0%
> > >> >> intr, 99.9% idle
> > >> >> CPU2 states: 0.0% user, 0.0% nice, 0.0% sys, 0.0% spin, 0.0%
> > >> >> intr, 99.9% idle
> > >> >> CPU3 states: 0.1% user, 0.0% nice, 0.0% sys, 0.1% spin, 0.0%
> > >> >> intr, 99.9% idle
> > >> >>
> > >> >>
> > >> >> This output is from a freshly rebooted system; rates for irq96/acpi
> > >> >> are always way above 8000:
> > >> >>
> > >> >> # vmstat -i
> > >> >> interrupt               total     rate
> > >> >> irq0/clock              20105      394
> > >> >> irq0/ipi                 8656      169
> > >> >> irq144/com0                86        1
> > >> >> irq96/acpi0            445306     8731
> > >> >> irq145/inteldrm0         1137       22
> > >> >> irq100/nvme0            33913      664
> > >> >> irq114/igc0:0              74        1
> > >> >> irq115/igc0:1             222        4
> > >> >> irq116/igc0:2              41        0
> > >> >> irq117/igc0:3              34        0
> > >> >> irq118/igc0                 2        0
> > >> >> Total                  509576     9991
> > >> >>
> > >> >
> > >> >Could be stuck GPE.
> > >> >
> > >> >In acpi.c, around line 2273:
> > >> >
> > >> >    dnprintf(10, "handling GPE %.2x\n", gpe);
> > >> >
> > >> >change that to
> > >> >
> > >> >    printf("handling GPE %.2x\n", gpe);
> > >> >
> > >> >And see which GPE keeps firing. It's likely gonna make the system
> > >> >somewhat slower since you'll be spamming dmesg like crazy.
> > >> >
> > >> >then report back what GPE you found firing.
> > >>
> > >> Thank you for the quick reply.
> > >>
> > >> With that patch applied, immediately during boot stdout gets spammed
> > >> with "handling GPE 6f".
> > >>
> > >> From doing a quick search, this seems to be usually caused by a broken
> > >> ACPI implementation on the BIOS side?
> > >> I already contacted the vendor to check for a newer BIOS version.
> > >>
> > >> In the meantime or if there isn't any patched BIOS available, is there
> > >> a way to find out what event '6f' correlates to and disable/ignore
> > >> handling of that interrupt?
> > >>
> > >
> > ><snip>
> > >
> > >Seems to be a common problem with this machine, not only on OpenBSD. Google
> > >_L6F GPE AL6F and you'll see that everyone else with the issue needed to
> > >hack their AML or get a BIOS update. Looks like shoddy AML from AMIbios.
> > >
> > >If you want to disable it, you'll need to do that in the GPE handler in
> > >acpi.c.
> > >
> > >-ml
> > >
> >
> > Yes, I also found a bunch on this topic, usually for other cheap
> > Mainboards (mostly asrock). I don't have high hopes to get a
> > patched BIOS from the vendor of those appliances, so I started looking
> > into ways of 'fixing' (ignoring) that GPE on the OS side.
> > Apparently most OSes have some way to override the DSDT; e.g.
> > FreeBSD can override the AML at boot pretty easily:
> > https://docs.freebsd.org/en/books/handbook/config/#_overriding_the_default_aml
> > Is there any such mechanism in OpenBSD?
>
> no
>
> > /var/db/acpi/DSDT.2 on these systems actually contains the same code as
> > mentioned here:
> > https://forums.freebsd.org/threads/disabling-gpe6-gpe-flooding-prevention.56963/#post-324358
> > (interestingly, FreeBSD doesn't show the same behavior; total interrupt
> > rate according to 'vmstat -i' is <100 at idle)
> >
> > Otherwise, how could disabling that GPE in acpi.c look like?
> > Sorry to bother you with that, I'm merely a sysadmin with some very
> > rudimentary coding skills (i.e. I can roughly follow what some code
> > might be doing as long as it isn't too complex).
> >
>
> I'm not sure this is a beginner task. But you could write a function like
> acpi_enable_onegpe and instead make it clear the gpe and then call that
> from acpi_attach.
>
> Generally, it's not worth the effort trying to fix broken hardware like this.
> Because how do you know there isn't other brokenness elsewhere?
>
> > Thanks,
> > Sebastian
> >

Hi Mike,

When you say acpi_attach, do you mean this one:
https://github.com/openbsd/src/blob/9f172165b574c19186ae3a65383c7fa8c8839f78/sys/arch/amd64/amd64/acpi_machdep.c#L90
?

I am thinking of adding this mitigation to my kernel temporarily.
Assuming I write `acpi_disable_onegpe`, I am wondering where to call it
for GPE 0x6F.

Thanks,
Igor.
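
For reference, a minimal userland sketch of the bit arithmetic that a helper
like `acpi_disable_onegpe` would involve. It assumes the layout described in
the ACPI specification, where each GPE enable/status register is one byte
covering eight events, so GPE 0x6F lands in byte 13, bit 7. Everything named
here (`GPE_COUNT`, `struct fake_gpe_regs`, `gpe_reg_index`, `gpe_bit_mask`,
`fake_disable_onegpe`) is a hypothetical stand-in and not OpenBSD kernel API;
a real kernel version would have to go through the driver's own register
accessors rather than a plain array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: number of GPEs modelled; real hardware reports its own size. */
#define GPE_COUNT 128

/* Hypothetical stand-in for the hardware GPE enable and status registers. */
struct fake_gpe_regs {
	uint8_t en[GPE_COUNT / 8];	/* 1 bit per GPE: event enabled */
	uint8_t sts[GPE_COUNT / 8];	/* 1 bit per GPE: event pending */
};

/* Index of the byte-wide register that holds this GPE's bit. */
static inline int
gpe_reg_index(int gpe)
{
	return gpe >> 3;
}

/* Mask selecting this GPE's bit inside its register. */
static inline uint8_t
gpe_bit_mask(int gpe)
{
	return (uint8_t)(1u << (gpe & 7));
}

/*
 * Disable one GPE and drop any pending status for it.
 * Assumption: on real hardware the status bit is cleared by writing a 1
 * to it; this model simply clears the bit in the array.
 */
static void
fake_disable_onegpe(struct fake_gpe_regs *r, int gpe)
{
	int idx;
	uint8_t mask;

	assert(gpe >= 0 && gpe < GPE_COUNT);
	idx = gpe_reg_index(gpe);
	mask = gpe_bit_mask(gpe);

	r->en[idx] &= (uint8_t)~mask;	/* stop the event from raising interrupts */
	r->sts[idx] &= (uint8_t)~mask;	/* forget the pending event */
}
```

Calling `fake_disable_onegpe(&regs, 0x6f)` on a block with every bit set
leaves `en[13]` and `sts[13]` at 0x7f and does not touch any other byte.
Only the index and mask computation is meant to carry over to a real patch.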
