On 17/02/20(Mon) 14:55, Joerg Jung wrote:
>
> > On 26. Sep 2019, at 15:02, Stuart Henderson <[email protected]> wrote:
> > On 2019/09/26 13:45, Stuart Henderson wrote:
> >> On 2019/09/26 11:16, Joerg Jung wrote:
> >>> Hi,
> >>>
> >>> I run a few busy (~800 req/s) NSD servers which I upgraded
> >>> to 6.5, all stock/default OpenBSD, e.g. I’ve not tweaked any
> >>> sysctl values and nsd.conf matches the default as well, just
> >>> added a few hundred zones.
> >>>
> >>> Now, when I increase servers from default 1 to 2 in nsd.conf:
> >>> server-count: 2
> >>> it starts spamming my log with:
> >>> nsd[62723]: sendto 1.2.3.4 failed: Resource temporarily unavailable
> >>>
> >>> Checking the source, server.c does not seem to handle EAGAIN
> >>> after sendto(): it does not recover or retry, it just increments
> >>> the txerr statistics counter, so the answer really seems lost :(
> >>>
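For reference, one way a UDP server could cope with EAGAIN from
sendto(2) is to wait for the socket to become writable with poll(2)
and try again. This is only a minimal sketch, not NSD's code: the
name sendto_retry() and the timeout_ms/max_tries parameters are
hypothetical and chosen for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>

#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper: call sendto(2) and, on EAGAIN/EWOULDBLOCK/EINTR,
 * wait up to timeout_ms for the socket to become writable, then retry.
 * Gives up after max_tries attempts with errno set to EAGAIN.
 * Returns the number of bytes sent, or -1 on error.
 */
ssize_t
sendto_retry(int s, const void *buf, size_t len, int flags,
    const struct sockaddr *to, socklen_t tolen, int timeout_ms,
    int max_tries)
{
	struct pollfd pfd;
	ssize_t n;
	int i;

	for (i = 0; i < max_tries; i++) {
		n = sendto(s, buf, len, flags, to, tolen);
		if (n != -1)
			return n;
		/* Any other error is not transient: report it as is. */
		if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
			return -1;

		/* Wait for buffer space before the next attempt. */
		pfd.fd = s;
		pfd.events = POLLOUT;
		pfd.revents = 0;
		if (poll(&pfd, 1, timeout_ms) == -1 && errno != EINTR)
			return -1;
	}
	errno = EAGAIN;
	return -1;
}
```

Whether blocking in poll(2) is acceptable for a busy DNS server is a
separate question; dropping the answer and letting the client retry
may well be the better trade-off.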
> >>> I tried higher debug level, as well as increasing socket buffers to:
> >>> net.inet.udp.recvspace=65536
> >>> net.inet.udp.sendspace=65636
> >>> but both didn’t help and netstat -s -p udp does show
> >>> 0 dropped due to full socket buffers
> >>> anyways. So, I don’t believe this is a socket buffer issue.
> >>>
> >>> The same server-count: 2 setting worked fine with 6.3.
> >>>
> >>> Any hints, insights, or pointers?
> >>> Does anyone else experience the same?
> >>>
> >>> Thanks,
> >>> Regards,
> >>> Joerg
> >>
> >> Maybe it's worth trying to track down further whether this is due to an
> >> NSD change or something else in the OS - cvs up -r OPENBSD_6_3 .. (be sure
> >> to use "make -f Makefile.bsd-wrapper [..]" when building).
> >>
> >
> > Or, following a comment from claudio@, try a kernel built with this:
>
> FYI, I tried that diff and a few other things, but none of them helped.

Did you ktrace(1) the problem? How is sendto(2) called? In particular,
is MSG_DONTWAIT passed in the flags, or is FNONBLOCK set on the file
descriptor? If neither, does that mean the kernel returns EWOULDBLOCK
even though userland said it is fine to block?
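
The descriptor side of that question can also be checked from inside
the program with fcntl(2). A minimal sketch, assuming nothing about
NSD itself: the helper names fd_is_nonblocking() and
fd_set_nonblocking() are made up for illustration. FNONBLOCK is the
kernel-side name for O_NONBLOCK, so that is the bit userland tests.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: returns 1 if O_NONBLOCK is set on fd,
 * 0 if it is not, and -1 if fcntl(2) fails.
 */
int
fd_is_nonblocking(int fd)
{
	int flags;

	flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return -1;
	return (flags & O_NONBLOCK) ? 1 : 0;
}

/*
 * Hypothetical helper: sets O_NONBLOCK on fd while keeping the
 * other status flags. Returns 0 on success, -1 on error.
 */
int
fd_set_nonblocking(int fd)
{
	int flags;

	flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return -1;
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return -1;
	return 0;
}
```

MSG_DONTWAIT, by contrast, is a per-call flag, so it would have to be
looked for in the sendto(2) arguments recorded by ktrace(1) rather than
in the descriptor flags.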
>
> > Index: syscalls.master
> > ===================================================================
> > RCS file: /cvs/src/sys/kern/syscalls.master,v
> > retrieving revision 1.189
> > diff -u -p -r1.189 syscalls.master
> > --- syscalls.master 11 Jan 2019 18:46:30 -0000 1.189
> > +++ syscalls.master 26 Sep 2019 13:01:46 -0000
> > @@ -261,7 +261,7 @@
> > 130 OBSOL oftruncate
> > 131 STD { int sys_flock(int fd, int how); }
> > 132 STD { int sys_mkfifo(const char *path, mode_t mode); }
> > -133 STD NOLOCK { ssize_t sys_sendto(int s, const void *buf, \
> > +133 STD { ssize_t sys_sendto(int s, const void *buf, \
> > size_t len, int flags, const struct sockaddr *to, \
> > socklen_t tolen); }
> > 134 STD { int sys_shutdown(int s, int how); }
> >
> >
> > Run "make syscalls" in sys/kern before building.
>