Hi,
Today I spent some time trying to debug the suspend issues on the
ThinkPad X1 5th generation. I started with the intention of tracking
down an old bug [1], but I think I've found a new one.
To trigger the bug, install a recent snapshot. I've managed to get this
panic using two kernels:
---8<---
OpenBSD 6.4-current (GENERIC.MP) #374: Sun Oct 21 00:04:11 MDT 2018
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
--->8---
and:
---8<---
OpenBSD 6.4-current (GENERIC.MP) #436: Sun Nov 11 23:59:55 MST 2018
[email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
--->8---
Install to a USB thumb drive with a single 'a' label (relevant?):
---8<---
sd1 at scsibus2 targ 1 lun 0: <, USB DISK 3.0, PMAP> SCSI4 0/direct removable serial.13fe55006606C86B7615
sd1: 15120MB, 512 bytes/sector, 30965760 sectors
--->8---
I used a USB drive so that I wouldn't trash my everyday filesystems with
constant dirty shutdowns. However, booting from USB seems essential for
reproduction.
If I boot a recent snap off my SSD, this bug does not manifest (but
perhaps [1] is related).
If I run `zzz`, the LED indicator will blink, the screen will go blank,
and that's it. Usually when this system suspends properly, the LED fades
up and down -- this does not happen this time.
When I spoke to krw@ earlier in the year, he shared a debugging tactic
that involves using the LED status indicator to see where the kernel
hangs. Via this tactic I narrowed the bug to this call in acpi.c:
---8<---
#endif /* HIBERNATE */
	sensor_quiesce();
	if (config_suspend_all(DVACT_QUIESCE)) // <-- THIS LINE HANGS
		goto fail_quiesce;
	vfs_stall(curproc, 1);
#if NSOFTRAID > 0
--->8---
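For anyone who hasn't seen the LED tactic, here is a rough userland
sketch of the idea in C. It is only a simulation: led_checkpoint(),
run_suspend_path() and the step functions are hypothetical names I made
up for illustration, not OpenBSD kernel APIs, and a "hang" is modelled
as a step returning nonzero. In the real kernel you would change the LED
state at each checkpoint and see which state it is stuck in.

```c
#include <stdio.h>

/*
 * Userland simulation of LED-checkpoint debugging.
 * All names here are hypothetical; none are OpenBSD kernel APIs.
 */

static int last_checkpoint = 0;

/* In a real kernel this would change the LED state; here we record it. */
static void
led_checkpoint(int n)
{
	last_checkpoint = n;
	printf("checkpoint %d reached\n", n);
}

/* Stand-ins for suspend steps; a nonzero return simulates a hang. */
static int step_ok(void)   { return 0; }
static int step_hang(void) { return 1; }

/*
 * Run each step in order, marking a checkpoint before each one.
 * Returns the number of the last checkpoint reached.
 */
static int
run_suspend_path(int (*steps[])(void), int nsteps)
{
	int i;

	last_checkpoint = 0;
	for (i = 0; i < nsteps; i++) {
		led_checkpoint(i + 1);
		if (steps[i]() != 0)
			break;	/* simulated hang: no later checkpoint fires */
	}
	return last_checkpoint;
}
```

With the steps { step_ok, step_hang, step_ok }, run_suspend_path()
returns 2: the second checkpoint is the last one reached, so the step
right after it is the one that hangs. Moving the checkpoints closer
together narrows it down to a single call.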
After this I tried disabling devices via `boot -c` to see if I could
find which one causes the hang. I tried (off the top of my head) nvme,
uhid, iwm, em, uvideo, pms and inteldrm.
Interestingly, when I disable inteldrm, I see a panic on suspend. Sorry,
no serial line, but here are photos of the panic, ddb traces and ps:
http://theunixzoo.co.uk/random/zzz-panic.tar.gz
I think it's likely that this panic has always been happening, but
disabling inteldrm is allowing us to see ddb.
I need to spend some more time on this, but I wanted to share initial
findings. If anyone has any ideas, please shout!
Cheers
[1]: https://marc.info/?l=openbsd-bugs&m=151575724607508&w=2
--
Best Regards
Edd Barrett
http://www.theunixzoo.co.uk