On Thu, Jun 18, 2020 at 11:45:01AM +0200, Joerg Jung wrote:

> 
> > On 18. Jun 2020, at 09:47, Otto Moerbeek <[email protected]> wrote:
> > On Wed, Jun 17, 2020 at 04:57:54PM +0200, Joerg Jung wrote:
> >>> On 17. Feb 2020, at 15:16, Martin Pieuchot <[email protected]> wrote:
> >>> On 17/02/20(Mon) 14:55, Joerg Jung wrote:
> >>>> 
> >>>>> On 26. Sep 2019, at 15:02, Stuart Henderson <[email protected]> 
> >>>>> wrote:
> >>>>> On 2019/09/26 13:45, Stuart Henderson wrote:
> >>>>>> On 2019/09/26 11:16, Joerg Jung wrote:
> >>>>>>> 
> >>>>>>> 
> >>>>>>> I run a few busy (~800 req/s) NSD servers which I upgraded 
> >>>>>>> to 6.5, all stock/default OpenBSD, e.g. I’ve not tweaked any 
> >>>>>>> sysctl values and nsd.conf matches the default as well, just 
> >>>>>>> added a few hundred zones.
> >>>>>>> 
> >>>>>>> Now, when I increase servers from default 1 to 2 in nsd.conf: 
> >>>>>>>       server-count: 2
> >>>>>>> it starts spamming my log with:
> >>>>>>>       nsd[62723]: sendto 1.2.3.4 failed: Resource temporarily 
> >>>>>>> unavailable
> >>>>>>> 
> >> 
> >> FYI, after upgrade to 6.7 the issue is still the same, 
> >> just with different format for the syslog message not 
> >> explicitly mentioning sendto() anymore, the new error
> >> log repeating gazillion of times looks like this:
> >>    nsd[55919]: sendmmsg [0]=1.2.3.4 count=1 failed: Resource temporarily 
> >> unavailable
> >> 
> >>> Did you ktrace(1) the problem?
> >> 
> >> Yes, please see below.
> >> 
> >>> How is sendto(2) called, in particular
> >>> is there any MSG_DONTWAIT or FNONBLOCK set on the file descriptor?  
> >> 
> >> From the ktrace dump it seems not sendto() being the culprit, but 
> >> recvfrom().
> >> Neither, sendto() nor recvfrom() seem to have any flags set.
> > 
> > I think this is an error in your analysis. It's perfectly normal for
> > recvform to return EAGAIN.
> 
> Yes, the log message coming from sendto() / sendmmsg() was 
> misleading here.
> 
> > On the sending side it indicates an issue. 
> 
> Yes, as verified with ktrace the sendto() works fine.
> 
> However, it’s not normal to write 200 log messages per second for
> recvfrom() EAGAIN.
> 
> So, seems like a (logging) bug in NSD: It looks like the sendmmsg() 
> wrapper logs the earlier stored EAGAIN from resvmmsg() wrapper
>  — if verbosity is 1, which is OpenBSD default.
> 
> Besides, I believe (reading man pages) these network errors 
> should be only logged at a higher verbosity 2 anyways.
> 
> Unfortunately, I’m not able to open a bug report in bugzilla upstream 
> anymore, probably because they moved things to Github nowadays.

Well, you might consider the obvious...

        -Otto

> 
> 
> >> However, this one seems related:
> >> https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385 
> >> <https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385><https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385
> >>  <https://www.nlnetlabs.nl/bugs-script/show_bug.cgi?id=385>>
> >> 
> >> 
> >> 
> >> 52438 nsd      STRU  struct kevent { ident=5, filter=EVFILT_READ, 
> >> flags=0x1<EV_ADD>, fflags=0<>, data=62
> >> , udata=0x30562fe4198 }
> >> 55919 nsd      STRU  struct kevent { ident=5, filter=EVFILT_READ, 
> >> flags=0x1<EV_ADD>, fflags=0<>, data=62
> >> , udata=0x30562fe4198 }
> >> 52438 nsd      RET   kevent 1
> >> 55919 nsd      RET   kevent 1
> >> 52438 nsd      CALL  clock_gettime(CLOCK_MONOTONIC,0x7f7ffffbf1a0)
> >> 55919 nsd      CALL  clock_gettime(CLOCK_MONOTONIC,0x7f7ffffbf1a0)
> >> 52438 nsd      STRU  struct timespec { 1224.307942259 }
> >> 55919 nsd      STRU  struct timespec { 1224.307946954 }
> >> 52438 nsd      RET   clock_gettime 0
> >> 55919 nsd      RET   clock_gettime 0
> >> 52438 nsd      CALL 
> >> recvfrom(5,0x30467ecd000,0x20109,0,0x304e553dc18,0x302669c2aa8)
> >> 55919 nsd      CALL 
> >> recvfrom(5,0x30497a9d000,0x20109,0,0x304e553dc18,0x302669c2aa8)
> >> 52438 nsd      GIO   fd 5 read 46 bytes
> >>       
> >> "\aY\0\^P\0\^A\0\0\0\0\0\^A\^Dwpad\^Efoo\^Fbar\0\0\^\\0\^A\0\0)\^P\0\0\0\M^@\0\0\0"
> >> 55919 nsd      RET   recvfrom -1 errno 35 Resource temporarily unavailable
> >> 52438 nsd      STRU  struct sockaddr { AF_INET, 1.2.3.4:40276 }
> >> 55919 nsd      CALL  clock_gettime(CLOCK_MONOTONIC,0x7f7ffffbf1a0)
> >> 52438 nsd      RET   recvfrom 46/0x2e
> >> 55919 nsd      STRU  struct timespec { 1224.307967366 }
> >> 55919 nsd      RET   clock_gettime 0
> >> 52438 nsd      CALL  gettimeofday(0x7f7ffffbeed0,0)
> >> 55919 nsd      CALL  clock_gettime(CLOCK_MONOTONIC,0x7f7ffffbf1a0)
> >> 52438 nsd      STRU  struct timeval { 1592402542<"Jun 17 10:02:22 
> >> 2020">.287233 }
> >> 55919 nsd      STRU  struct timespec { 1224.307977172 }
> >> 52438 nsd      RET   gettimeofday 0
> >> 55919 nsd      RET   clock_gettime 0
> >> 52438 nsd      CALL  sendto(5,0x30467ecd000,0x2e,0,0x304e553dc18,0x10)
> >> 55919 nsd      CALL  kevent(11,0,0,0x30549983000,64,0x7f7ffffbf0f8)
> >> 52438 nsd      STRU  struct sockaddr { AF_INET, 1.2.3.4:40276 }
> >> 55919 nsd      STRU  struct timespec { 109.487449000 }
> >> 52438 nsd      GIO   fd 5 wrote 46 bytes
> >>       
> >> "\aY\M^@\^E\0\^A\0\0\0\0\0\^A\^Dwpad\^foo\^Fbar\0\0\^\\0\^A\0\0)\^P\0\0\0\M^@\0\0\0"
> >> 52438 nsd      RET   sendto 46/0x2e
> 

Reply via email to