I've been finding that egdb and gdb get stuck in an uninterruptible
wait rather easily, e.g. when running the 'next' command after
hitting a breakpoint.

When this happens it's not possible to kill the debuggee or gdb, and
the only way to kill the debuggee process and free up its listening
sockets seems to be to reboot the entire system.

Perhaps unsurprisingly, one cannot attach a second invocation of gdb to
the uninterruptible gdb, so I don't know for sure which syscall is
getting stuck.

The debuggee is a local build of the flightgear flight simulator.

Here's the output of ps for the debugger and debuggee:

12419 p0  D        0:34.37 egdb -ex handle SIGPIPE noprint nostop -ex set print thread-events off -ex set print pretty on -ex run --args build-walk/fgfs,clang,debug,opt,co
63921 p0  TX+      0:42.45 /home/jules/flightgear/build-walk/fgfs,clang,debug,opt,compositor,osg.exe --airport=egtk (fgfs,clang,debug)
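
For anyone not familiar with the state column in the ps output above,
here is a small sketch that spells out the flags. The meanings are
paraphrased from my reading of ps(1) on OpenBSD; STATE_FLAGS and
decode_stat are hypothetical names made up for this illustration, and
only the flags that appear above are covered.

```python
# Meanings paraphrased from OpenBSD ps(1); only the flags seen in the
# ps output above are listed. Names here are illustrative only.
STATE_FLAGS = {
    "D": "uninterruptible (short-term) wait",
    "T": "stopped",
    "X": "being traced or debugged",
    "+": "in the foreground process group of its terminal",
}

def decode_stat(stat):
    """Return one description per character of a ps state string."""
    return [STATE_FLAGS.get(ch, "not listed here") for ch in stat]

for stat in ("D", "TX+"):
    print(stat, "->", "; ".join(decode_stat(stat)))
```

So egdb is the process in uninterruptible wait, while the debuggee is
stopped and traced.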

I've tried using ktrace on egdb, and the kdump output ends like this:

 53950 egdb     CALL  wait4(WAIT_ANY,0x7f7ffffe8efc,0<>,0)
 53950 egdb     RET   wait4 97562/0x17d1a
 53950 egdb     CALL  ptrace(PT_GET_PROCESS_STATE,97562,0x7f7ffffe8ef0,12)
 53950 egdb     RET   ptrace 0
 53950 egdb     CALL  ptrace(PT_GETREGS,161560,0x7f7ffffe8b40,0)
 53950 egdb     RET   ptrace 0
 53950 egdb     CALL  futex(0x6444e37c490,0x82<FUTEX_WAKE|FUTEX_PRIVATE_FLAG>,1,0,0)
 53950 egdb     RET   futex 0
 53950 egdb     CALL  futex(0x644bef12740,0x82<FUTEX_WAKE|FUTEX_PRIVATE_FLAG>,1,0,0)
 53950 egdb     RET   futex 0
 53950 egdb     CALL  ptrace(PT_IO,97562,0x7f7ffffe8a30,0)
 53950 egdb     RET   ptrace 0
 53950 egdb     CALL  ptrace(PT_IO,97562,0x7f7ffffe8a30,0)
 53950 egdb     RET   ptrace 0
 53950 egdb     CALL  ptrace(PT_STEP,97562,0x1,0)
 53950 egdb     RET   ptrace 0
 53950 egdb     CALL  read(6,0x7f7ffffe9187,0x1)
 53950 egdb     RET   read -1 errno 35 Resource temporarily unavailable
 53950 egdb     CALL  poll(0x6441581e720,3,0)
 53950 egdb     STRU  struct pollfd [3] { fd=4, events=0x1<POLLIN>, revents=0<> } { fd=6, events=0x1<POLLIN>, revents=0<> } { fd=10, events=0x1<POLLIN>, revents=0<> }
 53950 egdb     RET   poll 0
 53950 egdb     CALL  wait4(WAIT_ANY,0x7f7ffffe8efc,0<>,0)

Assuming that this is the actual end of the ktrace output and there
isn't some missing output in a buffer somewhere, it looks like egdb
is simply blocked in wait4(), which should be harmless and certainly
not uninterruptible?
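
To check mechanically which syscalls never returned in a kdump listing
like the one above, something along these lines could be used. It is
only a sketch: pending_calls and the regular expression are
hypothetical helpers written for this illustration, it assumes the
kdump line layout shown above, and it keys on the pid column only, so
for a threaded process like egdb it cannot tell threads apart.

```python
import re

# Matches kdump lines such as
#   " 53950 egdb     CALL  wait4(WAIT_ANY,...)"
#   " 53950 egdb     RET   wait4 97562/0x17d1a"
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(CALL|RET)\s+(\w+)")

def pending_calls(kdump_text):
    """Return {pid: syscall} for each CALL that has no RET after it."""
    pending = {}
    for line in kdump_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # STRU lines, wrapped continuation lines, etc.
        pid, _command, kind, syscall = m.groups()
        if kind == "CALL":
            pending[int(pid)] = syscall
        else:
            pending.pop(int(pid), None)
    return pending

# The last three lines of the trace above.
sample = """\
 53950 egdb     CALL  poll(0x6441581e720,3,0)
 53950 egdb     RET   poll 0
 53950 egdb     CALL  wait4(WAIT_ANY,0x7f7ffffe8efc,0<>,0)
"""
print(pending_calls(sample))
```

For the tail of the trace this reports wait4 as the only call still
outstanding, which matches what I see by eye. It says nothing about
whether ktrace output is still sitting in a buffer.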

Does anyone have any suggestions about how to investigate this further?

I'm running OpenBSD 6.7 GENERIC.MP#182 amd64.

Thanks,

- Jules

-- 
http://op59.net
