In message <prayer.1.3.2.0909262248400.24...@hermes-1.csi.cam.ac.uk>, Chris Thompson writes:
> Back in August there was a thread on bind-users about messages
> of the shape
>
>   validating @[hex]: [name].dlv.isc.org DS: must be secure failure
>
> (these are category "dnssec" severity "warning") and on 31 August I wrote:
>
> >We have been running two production recursive nameservers validating against
> >dlv.isc.org since 9 June, and first saw a batch of messages (for both servers)
> >like this on 20 July. We reported them to ISC and got suggestions along the
> >lines of Mark's above, along with an admission that current versions of BIND
> >give up on EDNS too easily in situations they maybe shouldn't, which may be
> >fixed in future releases.
> >
> >Since then we have had a trickle of such warning messages in the logs. We
> >assume that they are the result of temporary network glitches somewhere,
> >but their frequency appears to be increasing, which is somewhat worrying.
> >It's also not clear whether any client queries are actually failing as a
> >result, or whether BIND is simply trying another dlv.isc.org nameserver
> >with better luck.
>
> I have been looking at this again, and in fact there was a step function
> on 21 August when the messages rose from almost nil to 15-20 per day, and
> then fell back to almost nil after 15 September (we've seen just one since
> then). We have been running BIND 9.6.1-P1 throughout.
>
> I would be very interested to know whether other recursive nameserver
> operators validating via dlv.isc.org have seen a similar pattern. I am
> prepared to believe that the frequency is related to transient network
> errors or delays, but I have no idea whether they are likely to be local
> or at the dlv.isc.org server end.
One gets these or similar messages when named falls back to plain DNS
as a result of multiple timeouts. Named tries EDNS advertising a
4096 byte UDP buffer, then after multiple timeouts it tries EDNS
advertising a 512 byte UDP buffer, then after multiple timeouts it
tries plain DNS.

Named also had a bug where it would take an EDNS fallback step when it
didn't need to (for example, when retrying over TCP). This made DNSSEC
behind middleware that was dropping fragments difficult.

2564.   [bug]           Only take EDNS fallback steps when processing timeouts.
                        [RT #19405]

Some (perhaps not all) of the causes of timeouts are listed below. This
list is not specific to DLV.

(Apparent) non-responses to UDP queries can have many causes:

*+  Firewalls/middleware that block DNS responses > 512 bytes
*+  Firewalls/middleware that block fragments
*+  Lack of support for out-of-order responses in NAT
*+  Responses that require fragmentation but have DF set. Most of
    these will be 1481-1500 bytes in size (IP in IP tunnels).
    Larger responses are usually fragmented by the sending OS and
    don't have DF set. Smaller responses make it through a single
    layer of encapsulation.
*+# Bad nameserver software that fails to respond to EDNS requests
*+# Firewalls/proxies that block EDNS queries or queries/responses
    with one or more of DO, CD or AD set
*   Congestion
*   Packet corruption
*   Responses that appear lost due to long round-trip times
    - load balancing probes taking too long
    - multiple satellite links
    - significant congestion causing long delays

+ indicates broken software
# indicates fallback to plain DNS will be required

A handful a day would suggest packet corruption/congestion as the
likely cause.

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: ma...@isc.org
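
The fallback sequence described above (EDNS with a 4096 byte buffer, then
EDNS with a 512 byte buffer, then plain DNS) can be sketched as a small
Python simulation. This is an illustration only and not named's code:
QueryMode, resolve_with_fallback, the send callback and tries_per_step are
all hypothetical names, and since the message only says "multiple timeouts"
per step, the value 2 is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class QueryMode:
    """One step of the fallback ladder (hypothetical type, for illustration)."""
    name: str
    edns: bool
    udp_size: int  # advertised UDP buffer size; plain DNS is limited to 512


# Order taken from the description above.
FALLBACK_STEPS = (
    QueryMode("EDNS, 4096-byte buffer", True, 4096),
    QueryMode("EDNS, 512-byte buffer", True, 512),
    QueryMode("plain DNS", False, 512),
)


def resolve_with_fallback(
    send: Callable[[QueryMode], Optional[str]],
    tries_per_step: int = 2,  # assumption: the real number is not stated
) -> Tuple[Optional[QueryMode], Optional[str]]:
    """Try each mode in order; move to the next only after repeated timeouts.

    `send` stands in for "send one UDP query and wait": it returns the
    response, or None to signal a timeout.
    """
    for mode in FALLBACK_STEPS:
        for _ in range(tries_per_step):
            response = send(mode)
            if response is not None:
                return mode, response
    return None, None  # every step timed out


if __name__ == "__main__":
    # Simulated path that drops every EDNS query (the "*+#" cases above).
    def drops_edns(mode: QueryMode) -> Optional[str]:
        return None if mode.edns else "answer"

    mode, _ = resolve_with_fallback(drops_edns)
    print(mode.name)
```

On a path that drops every EDNS query the ladder ends in plain DNS. The DO
bit is carried in the EDNS OPT record, so a plain DNS query cannot ask for
DNSSEC records, which is the situation in which a "must be secure" failure
would be expected.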