Hi Klaus,

The log message says the transfer happened at 00:19:28. That is about 15 seconds after the 00:19:13 from the first command. That means it the upstream connection only happened at that time. Something must have delayed it. High verbosity can log more details if this is desired. What could have happened is that the upstream server refused, or failed, to answer, and the 15 second delay is the retry timeout. The value of 15 seconds for that is fairly normal for that kind of thing, from SOA or nsd configuration on retry timeout minimum and maximum. Without a SOA record, since there is no zone loaded it logs, NSD also uses a value of around 10 seconds for retries. So it could very likely be this kind of thing. Perhaps the initial fetch is too early, the upstream server has not actually loaded the zone in the way that AXFRs can get an answer yet, and connections fail. A wait for that could maybe help with that, so that an AXFR request is answered. Otherwise, I also do not see how it would be delayed, possibly, NSD is so overloaded and suffers from a lack of resource, that upstream connections for AXFR are delayed to get resources. This can be improved by allowing more file descriptors to xfrd, configurable in nsd.conf with xfrd-tcp-max.

Best regards, Wouter

On 4/3/23 09:54, Klaus Darilion via nsd-users wrote:
Hello!

We dynamically add secondary zones to NSD, and then test if NSD has fetched the zone (asking for the SOA). If the zone was not fetched yet, we try to push NSD to transfer zone faster. But sometimes that does not seem to work. For example in this case it took 15s to fetch the zone from the primary (running on localhost), although NSD was idle at this moment.

00:19:13 nsd[16628]: control cmd:  addzone example.com slave_zone_from_local_pdns

00:19:13 nsd[16632]: zonefile slave_zone_from_local_pdns/example.com does not exist

00:19:14 nsd[16628]: new control connection from 127.0.0.1

00:19:14 nsd[16628]: control cmd:  force_transfer example.com

00:19:15 nsd[16628]: new control connection from 127.0.0.1

00:19:15 nsd[16628]: control cmd:  force_transfer example.com

00:19:16 nsd[16628]: new control connection from 127.0.0.1

00:19:16 nsd[16628]: control cmd:  force_transfer example.com

00:19:18 nsd[16628]: new control connection from 127.0.0.1

00:19:18 nsd[16628]: control cmd:  force_transfer example.com

00:19:19 nsd[16628]: new control connection from 127.0.0.1

00:19:19 nsd[16628]: control cmd:  force_transfer example.com

00:19:20 nsd[16628]: new control connection from 127.0.0.1

00:19:20 nsd[16628]: control cmd:  force_transfer example.com

00:19:21 nsd[16628]: new control connection from 127.0.0.1

00:19:21 nsd[16628]: control cmd:  force_transfer example.com

00:19:22 nsd[16628]: new control connection from 127.0.0.1

00:19:22 nsd[16628]: control cmd:  force_transfer example.com

00:19:23 nsd[16628]: new control connection from 127.0.0.1

00:19:23 nsd[16628]: control cmd:  force_transfer example.com

00:19:24 nsd[16628]: new control connection from 127.0.0.1

00:19:24 nsd[16628]: control cmd:  force_transfer example.com

00:19:28 nsd[16628]: xfrd: zone example.com committed "received update to serial 2018091101 at 2023-04-03T00:19:28 from 127.0.0.1@14018"

00:19:28 nsd[16632]: zone example.com. received update to serial 2018091101 at 2023-04-03T00:19:28 from 127.0.0.1@14018 of 417 bytes in 0.00071 seconds

00:19:28 nsd[16628]: zone example.com serial 0 is updated to 2018091101

I can think of 2 possibilites:
a) the primary name server is overloaded and does not accept AXFR requests immediately

b) NSD has some timers that stop him from acting immediately

Are there any such timers in NSD? Can you please give me some hints how to debug this issue?

Thanks

Klaus


_______________________________________________
nsd-users mailing list
nsd-users@lists.nlnetlabs.nl
https://lists.nlnetlabs.nl/mailman/listinfo/nsd-users
_______________________________________________
nsd-users mailing list
nsd-users@lists.nlnetlabs.nl
https://lists.nlnetlabs.nl/mailman/listinfo/nsd-users

Reply via email to