Hi Willem,

Indeed, NSD 4.8.0 did not log this condition as an error message and simply proceeded when old-main quit.

With 4.9.0, reloading was refactored to reap exited old-serve children in order to reduce the number of "defunct" or "zombie" processes that can emerge (for example, when one old-serve child is still busy serving an AXFR or so).

When old-main is done with its job during reload (killing the old-serve children), it informs the reload process and then immediately exits. On some platforms, the detection of the closed pipe (because old-main has exited) could very well come before the information on that pipe that old-main is done.
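
To illustrate the ordering problem in general terms, here is a minimal sketch in Python. It is not NSD's code: the function name wait_for_old_main and the one-byte DONE marker are hypothetical, made up for this example. It only shows the general idea that a reader should drain the pipe before treating a hangup as an error, since a "done" message and the close of the pipe can be reported together.

```python
import os
import select

# Hypothetical marker for "old-main is done"; NOT the value NSD really sends.
DONE = b"D"


def wait_for_old_main(read_fd, timeout_ms=1000):
    """Wait on the read end of a pipe for the (hypothetical) DONE marker.

    Returns "done" if the marker arrived, "exited-early" if the writer
    closed the pipe without sending it, and "timeout" if nothing happened.
    """
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    while True:
        if not poller.poll(timeout_ms):
            return "timeout"
        # Read first, whatever event was reported: pending data and a
        # hangup can be signalled at the same time.
        data = os.read(read_fd, 1)
        if data == DONE:
            return "done"
        if data == b"":
            # End of file and no marker seen: the writer really left early.
            return "exited-early"
        # Any other byte is ignored in this sketch.
```

With this approach, a writer that sends the marker and exits right away is still reported as "done", and only a pipe that closes without the marker is reported as an early exit.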

Thank you for the fantastic explanation! The graphic is also very helpful for understanding the logic flow.

So I consider this "reporting of an exited old-main" at this point in the code a bug, and changed it into a debugging warning level message here: https://github.com/NLnetLabs/nsd/pull/421

PS. For completeness, a strip of a successful NSD reload is below. Your issue would occur if old-main(5) exited before the load(2) process received the NSD_RELOAD in picture 6.

[Image: NSD successful reload]

That makes sense given the previous information you provided. I look forward to running a future NSD version with that patch :-).

Cheers,
Otto


_______________________________________________
nsd-users mailing list
nsd-users@lists.nlnetlabs.nl
https://lists.nlnetlabs.nl/mailman/listinfo/nsd-users
