On Fri, Oct 15, 2021 at 06:18:12PM +0200, Mischa wrote:
> Hi All,
>
> Thank you all very much for OpenBSD 7.0!
> All upgrades went as smoothly as always.
>
> However when I upgraded my DNS VMs, NSD keeps exiting with status 11.
> Unfortunately even in debug mode with -V4-9 it only gives me the below
> output.
>
> [2021-10-15 18:05:10.995] nsd[32203]: warning: server 70341 died unexpectedly with status 11, restarting
> [2021-10-15 18:05:11.246] nsd[32203]: warning: server 96047 died unexpectedly with status 11, restarting
> etc.. etc..
>
> With ktrace I was able to, hopefully, capture something useful.
>
> 36104 nsd RET write 1
> 36104 nsd CALL socketpair(AF_UNIX,0x1<SOCK_STREAM>,0,0x7f7ffffd6a48)
> 36104 nsd STRU int [2] { 16, 17 }
> 36104 nsd RET socketpair 0
> 36104 nsd CALL fork()
> 36104 nsd RET fork 24892/0x613c
> 36104 nsd CALL close(17)
> 36104 nsd RET close 0
> 36104 nsd CALL fcntl(16,F_SETFL,0x4<O_NONBLOCK>)
> 36104 nsd RET fcntl 0
> 36104 nsd CALL wait4(WAIT_ANY,0x7f7ffffd74d8,0x1<WNOHANG>,0)
> 36104 nsd RET wait4 0
> 36104 nsd CALL ppoll(0xb27e4857cb0,2,0x7f7ffffd6a30,0)
> 36104 nsd STRU struct timespec { 60 }
> 36104 nsd STRU struct pollfd [2] { fd=16, events=0x1<POLLIN>, revents=0<> } { fd=15, events=0x1<POLLIN>, revents=0<> }
> 36104 nsd PSIG SIGCHLD caught handler=0xb27e47fa340 mask=0<>
> 36104 nsd RET ppoll -1 errno 4 Interrupted system call
> 36104 nsd CALL sigreturn(0x7f7ffffd6510)
> 36104 nsd RET sigreturn JUSTRETURN
> 36104 nsd CALL wait4(WAIT_ANY,0x7f7ffffd74d8,0x1<WNOHANG>,0)
> 36104 nsd RET wait4 24892/0x613c
> 36104 nsd CALL close(16)
> 36104 nsd RET close 0
> 36104 nsd CALL getpid()
> 36104 nsd RET getpid 36104/0x8d08
> 36104 nsd CALL sendsyslog(0x7f7ffffd4110,73,0<>)
> 36104 nsd GIO fd -1 wrote 73 bytes
> "<28>nsd[36104]: server 24892 died unexpectedly with status 11, restarting"
> 36104 nsd RET sendsyslog 0
> 36104 nsd CALL gettimeofday(0x7f7ffffd6748,0)
> 36104 nsd STRU struct timeval { 1634313545<"Oct 15 17:59:05 2021">.305661 }
> 36104 nsd RET gettimeofday 0
> 36104 nsd CALL getpid()
> 36104 nsd RET getpid 36104/0x8d08
> 36104 nsd CALL write(2,0x7f7ffffd5e20,0x68)
> 36104 nsd GIO fd 2 wrote 104 bytes
> "[2021-10-15 17:59:05.305] nsd[36104]: warning: server 24892 died unexpectedly with status 11, restarting"
> 36104 nsd RET write 104/0x68
> 36104 nsd CALL write(2,0xb2aa1098927,0x1)
> 36104 nsd GIO fd 2 wrote 1 bytes
>
> If someone has any ideas that would be great.
>
> Mischa
>
The trace above only shows the parent reaping a dead child. The actual
problem (SIGSEGV, which is what status 11 means) happens in the child
processes, so ktrace the children as well: ktrace -di ...
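
For reference, here is a rough sketch of how "status 11" maps to SIGSEGV.
It assumes the number nsd logs is the raw wait(2) status word returned by
wait4, and that SIGSEGV is signal 11 on the platform (true on OpenBSD and
Linux). The helper name describe_wait_status is made up for illustration;
it is not part of nsd.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Decode a raw wait(2)-style status word into a readable string.

    Hypothetical helper for illustration; assumes `status` is the
    unmodified value filled in by wait4()/waitpid().
    """
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        try:
            name = signal.Signals(sig).name
        except ValueError:
            name = "unknown"
        # The core-dump flag is a separate bit in the status word.
        core = " (core dumped)" if os.WCOREDUMP(status) else ""
        return f"killed by signal {sig} ({name}){core}"
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    return f"unrecognized status {status}"


if __name__ == "__main__":
    # 11 is the status from the nsd warning lines in the quoted mail.
    print(describe_wait_status(11))
```

A child that had called exit(11) would normally show up with the exit code
in the high byte of the status word, so a bare 11 points at a signal rather
than a deliberate exit.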
-Otto