Hi Wouter! Thanks for the detailed answer. This NSD instance was really idle. So, I suspect the upstream server was busy (the zone was for sure already available). Looks like I have to start tcpdump and wait until it happens again to have a proof.
Thanks Klaus -- Klaus Darilion, Head of Operations nic.at GmbH, Jakob-Haringer-Straße 8/V 5020 Salzburg, Austria > -----Ursprüngliche Nachricht----- > Von: Wouter Wijngaards <wou...@nlnetlabs.nl> > Gesendet: Donnerstag, 6. April 2023 11:25 > An: Klaus Darilion <klaus.daril...@nic.at>; nsd-users@lists.nlnetlabs.nl > Betreff: Re: [nsd-users] NSD slow fetching new zones > > Hi Klaus, > > The log message says the transfer happened at 00:19:28. That is about 15 > seconds after the 00:19:13 from the first command. That means it the > upstream connection only happened at that time. Something must have > delayed it. High verbosity can log more details if this is desired. What > could have happened is that the upstream server refused, or failed, to > answer, and the 15 second delay is the retry timeout. The value of 15 > seconds for that is fairly normal for that kind of thing, from SOA or > nsd configuration on retry timeout minimum and maximum. Without a SOA > record, since there is no zone loaded it logs, NSD also uses a value of > around 10 seconds for retries. So it could very likely be this kind of > thing. Perhaps the initial fetch is too early, the upstream server has > not actually loaded the zone in the way that AXFRs can get an answer > yet, and connections fail. A wait for that could maybe help with that, > so that an AXFR request is answered. Otherwise, I also do not see how it > would be delayed, possibly, NSD is so overloaded and suffers from a lack > of resource, that upstream connections for AXFR are delayed to get > resources. This can be improved by allowing more file descriptors to > xfrd, configurable in nsd.conf with xfrd-tcp-max. > > Best regards, Wouter > > On 4/3/23 09:54, Klaus Darilion via nsd-users wrote: > > Hello! > > > > We dynamically add secondary zones to NSD, and then test if NSD has > > fetched the zone (asking for the SOA). If the zone was not fetched > yet, > > we try to push NSD to transfer zone faster. But sometimes that does > not > > seem to work. For example in this case it took 15s to fetch the zone > > from the primary (running on localhost), although NSD was idle at this > > moment. > > > > 00:19:13 nsd[16628]: control cmd: addzone example.com > > slave_zone_from_local_pdns > > > > 00:19:13 nsd[16632]: zonefile slave_zone_from_local_pdns/example.com > > does not exist > > > > 00:19:14 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:14 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:15 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:15 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:16 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:16 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:18 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:18 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:19 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:19 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:20 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:20 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:21 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:21 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:22 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:22 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:23 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:23 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:24 nsd[16628]: new control connection from 127.0.0.1 > > > > 00:19:24 nsd[16628]: control cmd: force_transfer example.com > > > > 00:19:28 nsd[16628]: xfrd: zone example.com committed "received update > > to serial 2018091101 at 2023-04-03T00:19:28 from 127.0.0.1@14018" > > > > 00:19:28 nsd[16632]: zone example.com. received update to serial > > 2018091101 at 2023-04-03T00:19:28 from 127.0.0.1@14018 of 417 bytes in > > 0.00071 seconds > > > > 00:19:28 nsd[16628]: zone example.com serial 0 is updated to > 2018091101 > > > > I can think of 2 possibilites: > > a) the primary name server is overloaded and does not accept AXFR > > requests immediately > > > > b) NSD has some timers that stop him from acting immediately > > > > Are there any such timers in NSD? Can you please give me some hints > how > > to debug this issue? > > > > Thanks > > > > Klaus > > > > > > _______________________________________________ > > nsd-users mailing list > > nsd-users@lists.nlnetlabs.nl > > https://lists.nlnetlabs.nl/mailman/listinfo/nsd-users _______________________________________________ nsd-users mailing list nsd-users@lists.nlnetlabs.nl https://lists.nlnetlabs.nl/mailman/listinfo/nsd-users