On Thu, May 20, 2021 at 10:10:14AM +0200, Matthieu Herrb wrote:
> > procedure a few times now and it unfortunately does not produce any
> > coredumps or traces.
>
> When the X server is locked up I can still ssh into the machine and
> attach a debugger to the running process. I've got a few backtraces
> from that, but without full symbols it's even harder to understand
> what's going on.
>
> I suspect issues with our futex implementation; in every case I find
> one thread stuck in a drm ioctl while others are blocked on futex
> waits.
>
> Running an X server + Mesa fully built with debug symbols seems to make
> the issue less frequent, and when it happened I didn't have time to
> launch a debugger on it so far...
>
Running with only this xorg.conf:
#Section "Device"
# Identifier "amdgpu-modesetting"
# Driver "modesetting"
#EndSection
Section "ServerFlags"
Option "NoTrapSignals" "true"
EndSection
# per Alexandre Ratchov
Section "Device"
Identifier "Card0"
Driver "modesetting"
Option "SWCursor" "on"
Option "PageFlip" "off"
Option "AccelMethod" "none"
EndSection
Section "Module"
Disable "glx"
EndSection
I got an X crash on startup but was able to log in via ssh.
gdb reads all symbols as far as I can see:
[Thu May 20 14:00:09] peter@zelda:~$ doas gdb /usr/X11R6/bin/Xorg 40595
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-unknown-openbsd6.9"...
Attaching to program: /usr/X11R6/bin/Xorg, process 40595
Reading symbols from /usr/lib/libpthread.so.26.1...done.
Loaded symbols for /usr/lib/libpthread.so.26.1
Loaded symbols for /usr/X11R6/bin/Xorg
Symbols already loaded for /usr/lib/libpthread.so.26.1
Reading symbols from /usr/X11R6/lib/libpciaccess.so.2.0...done.
Loaded symbols for /usr/X11R6/lib/libpciaccess.so.2.0
Reading symbols from /usr/X11R6/lib/libdrm.so.7.9...done.
Loaded symbols for /usr/X11R6/lib/libdrm.so.7.9
Reading symbols from /usr/X11R6/lib/libpixman-1.so.38.4...done.
Loaded symbols for /usr/X11R6/lib/libpixman-1.so.38.4
Reading symbols from /usr/X11R6/lib/libXfont2.so.2.0...done.
Loaded symbols for /usr/X11R6/lib/libXfont2.so.2.0
Reading symbols from /usr/X11R6/lib/libfontenc.so.4.0...done.
Loaded symbols for /usr/X11R6/lib/libfontenc.so.4.0
Reading symbols from /usr/X11R6/lib/libfreetype.so.30.0...done.
Loaded symbols for /usr/X11R6/lib/libfreetype.so.30.0
Reading symbols from /usr/lib/libz.so.5.0...done.
Loaded symbols for /usr/lib/libz.so.5.0
Reading symbols from /usr/X11R6/lib/libXau.so.10.0...done.
Loaded symbols for /usr/X11R6/lib/libXau.so.10.0
Reading symbols from /usr/X11R6/lib/libxshmfence.so.0.0...done.
Loaded symbols for /usr/X11R6/lib/libxshmfence.so.0.0
Reading symbols from /usr/X11R6/lib/libXdmcp.so.11.0...done.
Loaded symbols for /usr/X11R6/lib/libXdmcp.so.11.0
Reading symbols from /usr/lib/libkvm.so.17.0...done.
Loaded symbols for /usr/lib/libkvm.so.17.0
Reading symbols from /usr/lib/libm.so.10.1...done.
Loaded symbols for /usr/lib/libm.so.10.1
Reading symbols from /usr/lib/libc.so.96.0...done.
Loaded symbols for /usr/lib/libc.so.96.0
Reading symbols from /usr/libexec/ld.so...Error while reading shared library symbols:
Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module /usr/libexec/ld.so]
Reading symbols from /usr/X11R6/lib/modules/drivers/modesetting_drv.so...done.
Loaded symbols for /usr/X11R6/lib/modules/drivers/modesetting_drv.so
Reading symbols from /usr/X11R6/lib/modules/libfb.so...done.
Loaded symbols for /usr/X11R6/lib/modules/libfb.so
Reading symbols from /usr/X11R6/lib/modules/libshadow.so...done.
Loaded symbols for /usr/X11R6/lib/modules/libshadow.so
Reading symbols from /usr/X11R6/lib/modules/input/kbd_drv.so...done.
Loaded symbols for /usr/X11R6/lib/modules/input/kbd_drv.so
Reading symbols from /usr/X11R6/lib/modules/input/ws_drv.so...done.
Loaded symbols for /usr/X11R6/lib/modules/input/ws_drv.so
[Switching to thread 113495]
_thread_sys_poll () at /tmp/-:3
3 /tmp/-: No such file or directory.
in /tmp/-
(gdb) bt
#0 _thread_sys_poll () at /tmp/-:3
#1 0x000002b3cdac8c6e in _libc_poll_cancel (fds=Unhandled dwarf expression opcode 0xa3
) at /usr/src/lib/libc/sys/w_poll.c:27
#2 0x000002b14afd99cb in ospoll_wait () from /usr/X11R6/bin/Xorg
#3 0x000002b14afd739c in InputThreadDoWork () from /usr/X11R6/bin/Xorg
#4 0x000002b3f0b25471 in _rthread_start (v=Unhandled dwarf expression opcode 0xa3
) at /usr/src/lib/librthread/rthread.c:96
#5 0x000002b3cdb0e18a in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84
#6 0x000002b3cdb0e18a in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84
Previous frame identical to this frame (corrupt stack?)
Then
(gdb) list
84 call 0b
85
86 /*
87 * Thread exit system call
88 */
89 movl $SYS___threxit, %eax
90 xorl %edi, %edi
91 syscall
92 int3
93
Current language: auto; currently asm
(gdb) list
94 /*
95 * Branch here if the thread creation fails:
96 */
97 2:
98 SET_ERRNO
99 ret
100 .cfi_endproc
101 END(__tfork_thread)
(gdb) list
Line number 102 out of range; /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S has 101 lines.
Program received signal SIGSTOP, Stopped (signal).
[Switching to thread 223487]
_thread_sys_poll () at /tmp/-:3
3 /tmp/-: No such file or directory.
in /tmp/-
(gdb) continue
Continuing.
Program received signal SIGSTOP, Stopped (signal).
[Switching to thread 223487]
_thread_sys_poll () at /tmp/-:3
3 /tmp/-: No such file or directory.
in /tmp/-
(gdb)
Continuing.
After that I got no new output from further continues, so I tried
^C
Program received signal SIGINT, Interrupt.
_thread_sys_poll () at /tmp/-:3
3 in /tmp/-
(gdb) print
The history is empty.
(gdb) list
1 in /tmp/-
(gdb) continue
Continuing.
^C
Program received signal SIGINT, Interrupt.
_thread_sys_poll () at /tmp/-:3
3 in /tmp/-
(gdb) bt
#0 _thread_sys_poll () at /tmp/-:3
#1 0x000002b3cdac8c6e in _libc_poll_cancel (fds=Unhandled dwarf expression opcode 0xa3
) at /usr/src/lib/libc/sys/w_poll.c:27
#2 0x000002b14afd99cb in ospoll_wait () from /usr/X11R6/bin/Xorg
#3 0x000002b14afd1726 in WaitForSomething () from /usr/X11R6/bin/Xorg
#4 0x000002b14ae3338c in Dispatch () from /usr/X11R6/bin/Xorg
#5 0x000002b14ae3e5ec in dix_main () from /usr/X11R6/bin/Xorg
#6 0x000002b14ae25180 in _start () from /usr/X11R6/bin/Xorg
#7 0x0000000000000000 in ?? ()
(gdb)
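Each backtrace above covers only the thread gdb happened to have selected,
so a dump of every thread in one go might make it easier to spot one stuck
in a drm ioctl and others in futex waits. A minimal sketch, assuming the
base gdb accepts -batch and -x; the write_gdb_cmds helper and the command
file name are made up for illustration, and 40595 is the pid from this
session:

```shell
#!/bin/sh
# Hypothetical helper: write a gdb command file that lists all threads,
# prints a backtrace for each of them, then detaches and quits.
write_gdb_cmds() {
    cat > "$1" <<'EOF'
info threads
thread apply all bt
detach
quit
EOF
}

# Illustrative file name, not something the base system provides.
cmds="${TMPDIR:-/tmp}/xorg-bt.gdb"
write_gdb_cmds "$cmds"

# Only print the command to run against the hung server; 40595 is the Xorg
# pid from the session above and has to be replaced with the current one.
echo "doas gdb -batch -x $cmds /usr/X11R6/bin/Xorg 40595"
```

Redirecting the output of that gdb run to a file would give one trace per
thread that can be attached to a reply as-is.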
I hope this is at least a little useful.
--
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
http://bsdly.blogspot.com/ http://www.bsdly.net/ http://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.