Hi Martin,
Thanks a lot for the diff. Unfortunately, it didn't help. After booting the
patched kernel, I issued a reboot from an xterm, and the system dropped into
ddb. Here is the output of the ps and trace commands:
uvm_fault(0xffffff0134d78100, 0x0, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at modeset_lock+0x2b0: cmpq 0(%rcx),%rax
ddb{1}>
   PID     TID   PPID   UID  S       FLAGS  WAIT       COMMAND
 53242  140915  50462     0  3        0x83  nanosleep  reboot
 50462     447      1  1000  3        0x83  wait       bash
 44503  511379      1     0  3        0x88  pause      xenodm
 57228  205546      1     0  3        0x80  mfsidl     mount_mfs
 48177   99188      0     0  3     0x14200  pgzero     zerothread
 78499  318571      0     0  3     0x14200  aiodoned   aiodoned
 66975   49943      0     0  3     0x14200  syncer     update
 45905  149138      0     0  3     0x14200  cleaner    cleaner
 29511   84033      0     0  3     0x14200  reaper     reaper
 59314  290681      0     0  3     0x14200  pgdaemon   pagedaemon
 51549  206600      0     0  3     0x14200  bored      crynlk
  4831  484843      0     0  3     0x14200  bored      crypto
 51880   32031      0     0  3     0x14200  usbtsk     usbtask
 48976  199055      0     0  3     0x14200  usbatsk    usbatsk
 81325  354209      0     0  3     0x14200  bored      i915-hangcheck
 43743  336938      0     0  3     0x14200  drmwet     i915-dp
 11772  253465      0     0  3     0x14200  bored      i915
 91755  452879      0     0  3  0x40014200  acpi0      acpi0
 76852   74589      0     0  7  0x40014200             idle3
 14835  470143      0     0  7  0x40014200             idle2
 97789  488588      0     0  3  0x40014200             idle1
 51594    2070      0     0  3     0x14200  bored      sensors
 29948  325963      0     0  3     0x14200  bored      softnet
 94725  237048      0     0  3     0x14200  bored      systqmp
 44329  381687      0     0  3     0x14200  bored      systq
 64334  158737      0     0  3  0x40014200  bored      softclock
 14022  470280      0     0  7  0x40014200             idle0
  7062  379942      0     0  3     0x14200  bored      sbar
     1  276099      0     0  3        0x82  wait       init
     0       0     -1     0  3     0x10200  scheduler  swapper
ddb{1}>
modeset_lock(ffff80000023da88,ffff800000cc8e00,ffff80000010a200,ffff80000023d9c8)
at modeset_lock+0x2b0
drm_modeset_lock_all(ffff80000013aa00) at drm_modeset_lock_all+0x15f
drm_fb_helper_restore_fbdev_mode_unlocked(ffff80000013aa00) at
drm_fb_helper_restore_fbdev_mode_unlocked+0x21
intel_fbdev_restore_mode(246) at intel_fbdev_restore_mode+0x21
drmclose(ffff800032c23538,ffffff01132e6720,ffffff01132e6720,1de8) at
drmclose+0x2dc
spec_close(7) at spec_close+0x1c4
VOP_CLOSE(23e7ff2dc7523add,ffff800032c806d0,ffffff01377d38a0,ffffff0100000007)
at VOP_CLOSE+0x36
vn_closefile(ffff800032c806d0,0) at vn_closefile+0xcb
closef(ffff800032c806d0,ffffff0111d33530) at closef+0xb3
fdfree(ffff800032bdc8f0) at fdfree+0x5b
exit1(ffff800032c23750,10,ffff800032c806d0) at exit1+0x189
sys_exit(ffffffff811bced3,ffff800032c23680,ffff800032c23750) at sys_exit+0x13
syscall() at syscall+0x270
--- syscall (number 1) ---
end of kernel
end trace frame: 0x7f7ffffd2430, count: -13
0x135ad330232a:
ddb{1}> syncing disks... panic: assertwaitok: non-zero mutex count: 1
Stopped at db_enter+0x5: popq %rbp
TID PID UID PRFLAGS PFLAGS CPU COMMAND
db_enter() at db_enter+0x5
panic() at panic+0x128
assertwaitok() at assertwaitok+0x4d
bufq_wait(4) at bufq_wait+0x18
bwrite(ffffff01115281b8) at bwrite+0xfa
VOP_BWRITE(23e7ff2dc7523add) at VOP_BWRITE+0x32
ffs_fsync(ffffffff811f95d0) at ffs_fsync+0x198
VOP_FSYNC(23e7ff2dc7523add,ffffffff81b6e938,ffffff0100000002,ffffff01377d3000)
at VOP_FSYNC+0x36
ffs_sync_vnode(ffff800032c22f28,ffffff0111528038) at ffs_sync_vnode+0x50
vfs_mount_foreach_vnode(2,ffffffff81b6e938,ffff80000087c400) at
vfs_mount_foreach_vnode+0x35
ffs_sync(1018,ffffffff81b6e938,ffff800032c806d0,ffff800032c23200) at
ffs_sync+0x8d
sys_sync(ffffffff81abd7e0,ffffffff81b1eed8,4900) at sys_sync+0x75
vfs_shutdown() at vfs_shutdown+0x39
boot(4900) at boot+0x49
end trace frame: 0xffff800032c23000, count: 0
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb{1}>
dumping to dev 4,1 offset 500767
----- Original Message -----
From: "Martin Pieuchot" <[email protected]>
To: "regis etourmy" <[email protected]>
Cc: [email protected]
Sent: Sunday, 29 October 2017 00:46:54
Subject: Re: kernel panic when shutting down from X window, since 6.2 upgrade on
my HP EliteBook 2540p
On 28/10/17(Sat) 22:18, [email protected] wrote:
> Hello,
>
> Since my bug report, I have read the beginning of the crash man page. I now
> know that the ddb log can be retrieved with the dmesg command after the
> next boot. The log I attach today should be more accurate than the one I
> copied by hand last week. By the way, do I have to run the trace command on
> every CPU?
>
> I found another way to crash my machine with the same error. While
> watching YouTube videos in Firefox, if I wait for the screen to blank from
> inactivity, the sound keeps playing, but when I try to wake the screen up
> again, the system crashes.
>
> I will read further into the crash man page and try to extract more
> information from the crash dump.
>
> Thanks a lot for your excellent work.
Does the diff below help?
Index: linux_ww_mutex.h
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/linux_ww_mutex.h,v
retrieving revision 1.1
diff -u -p -r1.1 linux_ww_mutex.h
--- linux_ww_mutex.h 1 Jul 2017 16:14:10 -0000 1.1
+++ linux_ww_mutex.h 28 Oct 2017 22:41:54 -0000
@@ -96,7 +96,7 @@ ww_mutex_is_locked(struct ww_mutex *lock
* Return 1 if lock could be acquired, else 0 (contended).
*/
static inline int
-ww_mutex_trylock(struct ww_mutex *lock) {
+ww_mutex_trylock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) {
int res = 0;
mtx_enter(&lock->lock);
@@ -106,9 +106,16 @@ ww_mutex_trylock(struct ww_mutex *lock)
if (lock->acquired == 0) {
KASSERT(lock->ctx == NULL);
lock->acquired = 1;
+ lock->ctx = ctx;
lock->owner = curproc;
res = 1;
}
+ /*
+ * In case we already hold the ww_mutex, increase a count.
+ */
+ else if (lock->owner == curproc) {
+ res = 1;
+ }
mtx_leave(&lock->lock);
return res;
}
@@ -155,7 +162,7 @@ __ww_mutex_lock(struct ww_mutex *lock, s
* - We are in the slow-path (first lock to obtain).
*
* - No context was specified. We assume a single
- * resouce, so there is no danger of a deadlock.
+ * resource, so there is no danger of a deadlock.
*
* - An `older` process (`ctx`) tries to acquire a
* lock already held by a `younger` process.
Index: drm_modeset_lock.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/drm_modeset_lock.c,v
retrieving revision 1.1
diff -u -p -r1.1 drm_modeset_lock.c
--- drm_modeset_lock.c 1 Jul 2017 16:14:10 -0000 1.1
+++ drm_modeset_lock.c 28 Oct 2017 22:42:19 -0000
@@ -309,7 +309,7 @@ static inline int modeset_lock(struct dr
if (ctx->trylock_only) {
lockdep_assert_held(&ctx->ww_ctx);
- if (!ww_mutex_trylock(&lock->mutex))
+ if (!ww_mutex_trylock(&lock->mutex, &ctx->ww_ctx))
return -EBUSY;
else
return 0;
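
For readers following the diff, here is a minimal userland sketch of the
trylock behaviour it appears to introduce: the acquire context is recorded on
the first successful trylock, and a repeated trylock by the current owner
reports success instead of contention. This is only a model under stated
assumptions, not the kernel code: every model_* name is hypothetical, a
pthread mutex stands in for the kernel mutex, and pthread_self() stands in
for curproc. Although the comment in the diff mentions increasing a count,
the diff as posted only returns success, and the sketch mirrors that.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ww_acquire_ctx. */
struct model_ctx {
	int id;
};

/* Hypothetical stand-in for struct ww_mutex. */
struct model_ww_mutex {
	pthread_mutex_t lock;	/* models the inner kernel mutex */
	int acquired;		/* non-zero once the lock is held */
	pthread_t owner;	/* models curproc; valid only if acquired */
	struct model_ctx *ctx;	/* acquire context recorded on first lock */
};

/* Return 1 if the lock was acquired or is already ours, else 0. */
static int
model_trylock(struct model_ww_mutex *m, struct model_ctx *ctx)
{
	int res = 0;

	pthread_mutex_lock(&m->lock);
	if (m->acquired == 0) {
		assert(m->ctx == NULL);
		m->acquired = 1;
		m->ctx = ctx;		/* the line added by the diff */
		m->owner = pthread_self();
		res = 1;
	} else if (pthread_equal(m->owner, pthread_self())) {
		/* Already held by the caller: succeed without counting. */
		res = 1;
	}
	pthread_mutex_unlock(&m->lock);
	return res;
}

struct model_try_args {
	struct model_ww_mutex *m;
	struct model_ctx *ctx;
	int res;
};

static void *
model_try_thread(void *arg)
{
	struct model_try_args *a = arg;

	a->res = model_trylock(a->m, a->ctx);
	return NULL;
}

/* Run one trylock from a second thread; return its result, or -1. */
static int
model_trylock_from_other_thread(struct model_ww_mutex *m,
    struct model_ctx *ctx)
{
	struct model_try_args a = { m, ctx, 0 };
	pthread_t t;

	if (pthread_create(&t, NULL, model_try_thread, &a) != 0)
		return -1;
	pthread_join(t, NULL);
	return a.res;
}
```

In this model, two consecutive model_trylock() calls from the same thread
both return 1, while model_trylock_from_other_thread() returns 0 as long as
the first thread holds the lock. Since nothing is counted, a single unlock
would release the lock no matter how many times the owner "re-acquired" it.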