I've been finding that egdb and gdb rather easily get stuck in an
uninterruptible wait, e.g. when running the 'next' command after
hitting a breakpoint.
When this happens, neither the debuggee nor gdb can be killed, and the
only way to get rid of the debuggee process and free up its listening
sockets seems to be to reboot the entire system.
Perhaps unsurprisingly one cannot attach a second invocation of gdb to
the uninterruptible gdb, so I don't know for sure what syscall is being
run that is getting stuck.
The debuggee is a local build of the flightgear flight simulator.
Here's the output of ps for the debugger and debuggee:
12419 p0 D 0:34.37 egdb -ex handle SIGPIPE noprint nostop -ex set print thread-events off -ex set print pretty on -ex run --args build-walk/fgfs,clang,debug,opt,co
63921 p0 TX+ 0:42.45 /home/jules/flightgear/build-walk/fgfs,clang,debug,opt,compositor,osg.exe --airport=egtk (fgfs,clang,debug)
I've tried using ktrace on egdb, and the kdump output ends like this:
53950 egdb CALL wait4(WAIT_ANY,0x7f7ffffe8efc,0<>,0)
53950 egdb RET wait4 97562/0x17d1a
53950 egdb CALL ptrace(PT_GET_PROCESS_STATE,97562,0x7f7ffffe8ef0,12)
53950 egdb RET ptrace 0
53950 egdb CALL ptrace(PT_GETREGS,161560,0x7f7ffffe8b40,0)
53950 egdb RET ptrace 0
53950 egdb CALL futex(0x6444e37c490,0x82<FUTEX_WAKE|FUTEX_PRIVATE_FLAG>,1,0,0)
53950 egdb RET futex 0
53950 egdb CALL futex(0x644bef12740,0x82<FUTEX_WAKE|FUTEX_PRIVATE_FLAG>,1,0,0)
53950 egdb RET futex 0
53950 egdb CALL ptrace(PT_IO,97562,0x7f7ffffe8a30,0)
53950 egdb RET ptrace 0
53950 egdb CALL ptrace(PT_IO,97562,0x7f7ffffe8a30,0)
53950 egdb RET ptrace 0
53950 egdb CALL ptrace(PT_STEP,97562,0x1,0)
53950 egdb RET ptrace 0
53950 egdb CALL read(6,0x7f7ffffe9187,0x1)
53950 egdb RET read -1 errno 35 Resource temporarily unavailable
53950 egdb CALL poll(0x6441581e720,3,0)
53950 egdb STRU struct pollfd [3] { fd=4, events=0x1<POLLIN>, revents=0<> } { fd=6, events=0x1<POLLIN>, revents=0<> } { fd=10, events=0x1<POLLIN>, revents=0<> }
53950 egdb RET poll 0
53950 egdb CALL wait4(WAIT_ANY,0x7f7ffffe8efc,0<>,0)
Assuming that this is the actual end of the ktrace output and there
isn't some missing output still sitting in a buffer somewhere, it looks
like egdb is simply blocked in wait4(), which should be harmless and
certainly shouldn't be uninterruptible.
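In case it's useful, here is a rough sketch of how the "last CALL has no
RET" reading of the kdump output could be checked mechanically rather than
by eye. It is only an illustration: the function name pending_calls is made
up, it assumes each kdump record is on a single unwrapped line in the
"pid command CALL/RET syscall" layout shown above, and it assumes one
thread per pid. It only shows which syscall was entered and never
returned, not why the process can't be interrupted.

```python
import re

# Matches kdump records such as
#   "53950 egdb CALL wait4(WAIT_ANY,0x7f7ffffe8efc,0<>,0)"
#   "53950 egdb RET wait4 97562/0x17d1a"
# Assumption: the command name contains no spaces.
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(CALL|RET)\s+(\w+)")


def pending_calls(lines):
    """Return {pid: syscall} for every CALL that has no later RET."""
    pending = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # STRU and other record types are ignored
        pid, _cmd, kind, name = m.groups()
        if kind == "CALL":
            pending[pid] = name
        elif pending.get(pid) == name:
            del pending[pid]  # matching RET seen, call completed
    return pending


if __name__ == "__main__":
    # The tail of the trace quoted above, with records on single lines.
    sample = [
        "53950 egdb CALL ptrace(PT_STEP,97562,0x1,0)",
        "53950 egdb RET ptrace 0",
        "53950 egdb CALL read(6,0x7f7ffffe9187,0x1)",
        "53950 egdb RET read -1 errno 35 Resource temporarily unavailable",
        "53950 egdb CALL poll(0x6441581e720,3,0)",
        "53950 egdb STRU struct pollfd [3] { fd=4, events=0x1<POLLIN>, revents=0<> }",
        "53950 egdb RET poll 0",
        "53950 egdb CALL wait4(WAIT_ANY,0x7f7ffffe8efc,0<>,0)",
    ]
    print(pending_calls(sample))
```

On the tail of the trace above this reports wait4 as the only outstanding
call for pid 53950, which matches what I see by reading the output.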
Does anyone have any suggestions about how to investigate this further?
I'm running OpenBSD 6.7 GENERIC.MP#182 amd64.
Thanks,
- Jules
--
http://op59.net