> From mkb Sun Feb 25 14:19:33 2018
> From: [email protected]
> To: [email protected]
> Subject: intermittent sluggish behavior; seems to be acpi related
>
> >Synopsis:    intermittent sluggish behavior; seems to be acpi related
> >Category:    amd64
> >Environment:
>       System      : OpenBSD 6.2
>       Details     : OpenBSD 6.2-current (GENERIC.MP) #10: Wed Feb 21 21:26:27 MST 2018
>                        [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
>       Architecture: OpenBSD.amd64
>       Machine     : amd64
> >Description:
>       This is a Lenovo ThinkPad T480s.
>
>       Sometimes, when I boot my system, it runs great.  Other times, it runs
>       very sluggishly.  I'd say it's good about half the time.  I've
>       narrowed this down to ACPI on the following evidence.
>
>       A good boot:
>       $ uptime && ps auxwwk | grep acpi0
>        5:53PM  up 18 mins, 1 user, load averages: 0.00, 0.00, 0.00
>       root     45527  0.0  0.0     0     0 ??  DK     5:35PM    0:00.22 (acpi0)
>
>       A bad boot:
>       $ uptime && ps auxwwk | grep acpi0
>        4:45PM  up 18 mins, 1 user, load averages: 1.03, 1.00, 0.75
>       root     97711 87.0  0.0     0     0 ??  DK     4:27PM   15:43.95 (acpi0)
>
>       The system runs very sluggishly on a bad boot.  Starting an xterm
>       should be and is, on a good boot, instant.  On a bad boot, it takes
>       10 seconds or so.  Clearly something is wrong, but I haven't been able
>       to pinpoint what exactly is wrong.
>
>       Here's the acpidump output:
>
>       http://www.martinbrandenburg.com/2018/acpi.tar.gz
>
>       In an effort to find the problem, I enabled ACPI_DEBUG.  I couldn't
>       make any sense of the output, and I'm afraid too much has scrolled
>       off the top, but in case any of it is useful, here it is:
>
>       http://www.martinbrandenburg.com/2018/bad.dmesg
>       http://www.martinbrandenburg.com/2018/good.dmesg
>
>       This seems to be related to another problem.  Sometimes when I boot
>       the BIOS outputs "Configuration changed -- restart the system" and
>       does so.  I admit to not recording every instance, but it seems that
>       when that occurs, the system is fine.  When the system boots without
>       restarting, the system is sluggish and I have the problems described.
>
>       I've booted a Linux live USB quite a few times, and never had this
>       kind of trouble there.  As long as I don't boot OpenBSD, I never see
>       the "Configuration changed -- restart the system" message.  But I'd
>       much prefer to actually use OpenBSD.
>
>       I can supply more information or run tests to gather more data if
>       needed.
> >How-To-Repeat:
>       Boot OpenBSD on a ThinkPad T480s and possibly other newer ThinkPads
>       until the problem occurs.
> >Fix:
>       Unknown.
>
>

I have some more information.

I had noticed that the problem always shows up after suspending my
system.

Prior to suspend, vmstat -i shows

irq144/acpi0                      318        0

After suspend, the system gets sluggish and acpi0's CPU time explodes as
described.  Then running vmstat -i periodically over the course of about
10 minutes reveals that ACPI interrupts just go up and up.

irq144/acpi0                   282494      152
irq144/acpi0                   385191      197
irq144/acpi0                   436550      218
irq144/acpi0                   517509      250
irq144/acpi0                   600715      280
irq144/acpi0                   737721      325

Putting a printf in acpi_gpe revealed that, except for one at boot, no
GPE events occur until after I suspend the system, at which point a
deluge of _L6F events shows up.

Decompiling the AML revealed that _L6F has something to do with Thunderbolt.

I don't think OpenBSD supports Thunderbolt, and I don't care to use it
anyway.  I went to the BIOS to disable it, but found an option "Enable
Thunderbolt BIOS Assist Mode" which purported to be necessary for older
versions of Windows and Linux.  I enabled it.

This seems to stop the problem after suspend.

However, I still occasionally see the _L6F events on a fresh boot,
before attempting to suspend.  The printf output starts before /etc/rc
even starts running.

I have a USB-C to VGA adapter

uhidev2 at uhub1 port 1 configuration 1 interface 1 "Lenovo Lenovo USB-C to VGA Adapter" rev 2.01/0.00 addr 2
uhidev2: iclass 3/0, 237 report ids
uhid0 at uhidev2 reportid 237: input=0, output=0, feature=80
ugen2 at uhub1 port 1 configuration 1 "Lenovo Lenovo USB-C to VGA Adapter" rev 2.01/0.00 addr 2

which also sometimes triggers the interrupt flood regardless of whether
the BIOS option is on or off.  However, I have a similar USB-C to
DisplayPort adapter which does not.

I am now running the following patch, which at least makes the machine
usable and lets me see when the first bad interrupt has happened.
Obviously it isn't a real fix.

I'll update with more information if I find it.

Index: acpi.c
===================================================================
RCS file: /cvs/src/sys/dev/acpi/acpi.c,v
retrieving revision 1.340
diff -u -p -r1.340 acpi.c
--- acpi.c      19 Feb 2018 08:59:52 -0000      1.340
+++ acpi.c      5 Mar 2018 03:57:59 -0000
@@ -2179,6 +2179,15 @@ acpi_gpe(struct acpi_softc *sc, int gpe,
        struct aml_node *node = arg;
        uint8_t mask, en;
 
+       if (!sc->gpe_table[gpe].edge && gpe == 111) {
+               static unsigned short i;
+               if (i == 0) {
+                       i++;
+                       printf("acpi_gpe %d %s IGNORING\n", gpe, node->name);
+               }
+       } else {
+       printf("acpi_gpe %d %s\n", gpe, node->name);
+
        dnprintf(10, "handling GPE %.2x\n", gpe);
        aml_evalnode(sc, node, 0, NULL, NULL);
 
@@ -2187,6 +2196,7 @@ acpi_gpe(struct acpi_softc *sc, int gpe,
                acpi_write_pmreg(sc, ACPIREG_GPE_STS, gpe>>3, mask);
        en = acpi_read_pmreg(sc, ACPIREG_GPE_EN,  gpe>>3);
        acpi_write_pmreg(sc, ACPIREG_GPE_EN,  gpe>>3, en | mask);
+       }
        return (0);
 }
 
