> On 18. Jun 2020, at 09:47, Otto Moerbeek <[email protected]> wrote: > On Wed, Jun 17, 2020 at 04:57:54PM +0200, Joerg Jung wrote: >>> On 17. Feb 2020, at 15:16, Martin Pieuchot <[email protected]> wrote: >>> On 17/02/20(Mon) 14:55, Joerg Jung wrote: >>>> >>>>> On 26. Sep 2019, at 15:02, Stuart Henderson <[email protected]> wrote: >>>>> On 2019/09/26 13:45, Stuart Henderson wrote: >>>>>> On 2019/09/26 11:16, Joerg Jung wrote: >>>>>>> >>>>>>> >>>>>>> I run a few busy (~800 req/s) NSD servers which I upgraded >>>>>>> to 6.5, all stock/default OpenBSD, e.g. I’ve not tweaked any >>>>>>> sysctl values and nsd.conf matches the default as well, just >>>>>>> added a few hundred zones. >>>>>>> >>>>>>> Now, when I increase servers from default 1 to 2 in nsd.conf: >>>>>>> server-count: 2 >>>>>>> it starts spamming my log with: >>>>>>> nsd[62723]: sendto 1.2.3.4 failed: Resource temporarily >>>>>>> unavailable >>>>>>> >> >> FYI, after upgrade to 6.7 the issue is still the same, >> just with different format for the syslog message not >> explicitly mentioning sendto() anymore, the new error >> log repeating gazillion of times looks like this: >> nsd[55919]: sendmmsg [0]=1.2.3.4 count=1 failed: Resource temporarily >> unavailable >> >>> Did you ktrace(1) the problem? >> >> Yes, please see below. >> >>> How is sendto(2) called, in particular >>> is there any MSG_DONTWAIT or FNONBLOCK set on the file descriptor? >> >> From the ktrace dump it seems not sendto() being the culprit, but recvfrom(). >> Neither, sendto() nor recvfrom() seem to have any flags set. > > I think this is an error in your analysis. It's perfectly normal for > recvform to return EAGAIN.
Yes, the log message coming from sendto() / sendmmsg() was misleading here. > On the sending side it indicates an issue. Yes, as verified with ktrace the sendto() works fine. However, it’s not normal to write 200 log messages per second for recvfrom() EAGAIN. So, seems like a (logging) bug in NSD: It looks like the sendmmsg() wrapper logs the earlier stored EAGAIN from resvmmsg() wrapper — if verbosity is 1, which is OpenBSD default. Besides, I believe (reading man pages) these network errors should be only logged at a higher verbosity 2 anyways. Unfortunately, I’m not able to open a bug report in bugzilla upstream anymore, probably because they moved things to Github nowadays. >> However, this one seems related: >> https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385 >> <https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385><https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385 >> <https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385>> >> >> >> >> 52438 nsd STRU struct kevent { ident=5, filter=EVFILT_READ, >> flags=0x1<EV_ADD>, fflags=0<>, data=62 >> , udata=0x30562fe4198 } >> 55919 nsd STRU struct kevent { ident=5, filter=EVFILT_READ, >> flags=0x1<EV_ADD>, fflags=0<>, data=62 >> , udata=0x30562fe4198 } >> 52438 nsd RET kevent 1 >> 55919 nsd RET kevent 1 >> 52438 nsd CALL clock_gettime(CLOCK_MONOTONIC,0x7f7ffffbf1a0) >> 55919 nsd CALL clock_gettime(CLOCK_MONOTONIC,0x7f7ffffbf1a0) >> 52438 nsd STRU struct timespec { 1224.307942259 } >> 55919 nsd STRU struct timespec { 1224.307946954 } >> 52438 nsd RET clock_gettime 0 >> 55919 nsd RET clock_gettime 0 >> 52438 nsd CALL >> recvfrom(5,0x30467ecd000,0x20109,0,0x304e553dc18,0x302669c2aa8) >> 55919 nsd CALL >> recvfrom(5,0x30497a9d000,0x20109,0,0x304e553dc18,0x302669c2aa8) >> 52438 nsd GIO fd 5 read 46 bytes >> >> "\aY\0\^P\0\^A\0\0\0\0\0\^A\^Dwpad\^Efoo\^Fbar\0\0\^\\0\^A\0\0)\^P\0\0\0\M^@\0\0\0" >> 55919 nsd RET recvfrom -1 errno 35 Resource temporarily unavailable >> 52438 nsd STRU struct sockaddr { AF_INET, 1.2.3.4:40276 } >> 55919 nsd CALL clock_gettime(CLOCK_MONOTONIC,0x7f7ffffbf1a0) >> 52438 nsd RET recvfrom 46/0x2e >> 55919 nsd STRU struct timespec { 1224.307967366 } >> 55919 nsd RET clock_gettime 0 >> 52438 nsd CALL gettimeofday(0x7f7ffffbeed0,0) >> 55919 nsd CALL clock_gettime(CLOCK_MONOTONIC,0x7f7ffffbf1a0) >> 52438 nsd STRU struct timeval { 1592402542<"Jun 17 10:02:22 >> 2020">.287233 } >> 55919 nsd STRU struct timespec { 1224.307977172 } >> 52438 nsd RET gettimeofday 0 >> 55919 nsd RET clock_gettime 0 >> 52438 nsd CALL sendto(5,0x30467ecd000,0x2e,0,0x304e553dc18,0x10) >> 55919 nsd CALL kevent(11,0,0,0x30549983000,64,0x7f7ffffbf0f8) >> 52438 nsd STRU struct sockaddr { AF_INET, 1.2.3.4:40276 } >> 55919 nsd STRU struct timespec { 109.487449000 } >> 52438 nsd GIO fd 5 wrote 46 bytes >> >> "\aY\M^@\^E\0\^A\0\0\0\0\0\^A\^Dwpad\^foo\^Fbar\0\0\^\\0\^A\0\0)\^P\0\0\0\M^@\0\0\0" >> 52438 nsd RET sendto 46/0x2e
