We have a couple of recursive servers running 9.9.5 which are persistently unable to validate answers.ssh.com, returning SERVFAIL. With debug logging turned on we get (amongst lots of other things):

24-Apr-2014 16:41:23.087 client 131.111.56.28#35569 (answers.ssh.com): query (cache) 'answers.ssh.com/A/IN' approved
24-Apr-2014 16:41:23.087 client 131.111.56.28#35569 (answers.ssh.com): replace
24-Apr-2014 16:41:23.127 validating @2e4e75b8: answers.ssh.com A: starting
24-Apr-2014 16:41:23.127 validating @2e4e75b8: answers.ssh.com A: attempting insecurity proof
24-Apr-2014 16:41:23.127 validating @2e4e75b8: answers.ssh.com A: checking existence of DS at 'com'
24-Apr-2014 16:41:23.127 validating @2e4e75b8: answers.ssh.com A: checking existence of DS at 'ssh.com'
24-Apr-2014 16:41:24.114 validating @252fd3f0: ssh.com DS: starting
24-Apr-2014 16:41:24.114 validating @252fd3f0: ssh.com DS: attempting positive response validation
24-Apr-2014 16:41:24.114 validating @252fd3f0: ssh.com DS: keyset with trust secure
24-Apr-2014 16:41:24.114 validating @252fd3f0: ssh.com DS: verify rdataset (keyid=56657): success
24-Apr-2014 16:41:24.114 validating @252fd3f0: ssh.com DS: marking as secure, noqname proof not needed
24-Apr-2014 16:41:24.115 validating @2e4e75b8: answers.ssh.com A: in dsfetched2: success
24-Apr-2014 16:41:24.115 validating @2e4e75b8: answers.ssh.com A: resuming proveunsecure
24-Apr-2014 16:41:24.115 validating @2e4e75b8: answers.ssh.com A: checking existence of DS at 'answers.ssh.com'
24-Apr-2014 16:41:24.115 validating @2e4e75b8: answers.ssh.com A: bad cache hit (answers.ssh.com/DS)
24-Apr-2014 16:41:24.115 error (broken trust chain) resolving 'answers.ssh.com/A/IN': 208.109.255.50#53
24-Apr-2014 16:41:24.117 client 131.111.56.28#35569 (answers.ssh.com): query failed (SERVFAIL) for answers.ssh.com/IN/A at query.c:7005
24-Apr-2014 16:41:24.117 fetch completed at resolver.c:4173 for answers.ssh.com/A in 1.028114: broken trust chain/broken trust chain [domain:ssh.com,referral:1,restart:1,qrysent:1,timeout:0,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:1]

Questions:

Why is it attempting an insecurity proof?

Why is there a bad cache hit for one of the DS queries?

With a bit more debugging turned on we see that named is getting a response from the authoritative server without EDNS and without DNSSEC (see below). Is it omitting EDNS from its query, and if so why?

rndc flushname on answers.ssh.com and ssh.com and all the name servers for ssh.com doesn't fix it. (If I understand it correctly, in 9.9 flushname should clear an entry from the bad cache but flushtree does not. The latter is improved in 9.10.)

It might be nice at this debugging level to log queries as well as responses, and the source and destination addresses of packets.

24-Apr-2014 17:55:31.395 resquery 126e5060 (fctx 18262460(answers.ssh.com/A)): response
24-Apr-2014 17:55:31.395 received packet:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62966
;; flags: qr aa; QUESTION: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 2
;; QUESTION SECTION:
;answers.ssh.com.               IN      A
;; ANSWER SECTION:
answers.ssh.com.        3600    IN      A       194.137.52.201
;; AUTHORITY SECTION:
ssh.com.                3600    IN      NS      pdns02.domaincontrol.com.
ssh.com.                3600    IN      NS      pdns01.domaincontrol.com.
ssh.com.                3600    IN      NS      ns2.ssh.com.
ssh.com.                3600    IN      NS      ns1.ssh.com.
;; ADDITIONAL SECTION:
ns2.ssh.com.            600     IN      A       208.109.255.50
ns1.ssh.com.            600     IN      A       216.69.185.50

Tony.
-- 
f.anthony.n.finch  <d...@dotat.at>  http://dotat.at/
Lundy: Variable 4, becoming southeast 5 or 6. Slight or moderate. Showers.
Good, occasionally moderate.
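
P.S. One way to look at the EDNS question independently of named would be to send the authoritative server a query that carries an OPT record with the DO bit set, and see whether the reply carries an OPT record too. Below is a minimal Python sketch of that idea using only the standard library. The function names (build_query, has_opt, ask) and the fixed message ID are my own illustrative choices, not anything from BIND, and the parser assumes a well-formed message. It only shows how this one server treats an EDNS query, not what named actually sent.

```python
import socket
import struct

TYPE_A = 1
TYPE_OPT = 41
CLASS_IN = 1
DO_BIT = 0x8000  # "DNSSEC OK" flag carried in the OPT record's TTL field


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=TYPE_A, msg_id=0x1234, edns=True, do=True):
    """Build a non-recursive query, optionally with an EDNS0 OPT record."""
    arcount = 1 if edns else 0
    # Header: id, flags (RD clear), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    msg = struct.pack("!HHHHHH", msg_id, 0, 1, 0, 0, arcount)
    msg += encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    if edns:
        ttl = DO_BIT if do else 0
        # OPT: root owner name, type 41, class = UDP payload size (4096),
        # TTL = extended rcode/version/flags, empty RDATA
        msg += b"\x00" + struct.pack("!HHIH", TYPE_OPT, 4096, ttl, 0)
    return msg


def skip_name(msg, pos):
    """Return the offset just past the (possibly compressed) name at pos."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # a compression pointer ends the name
            return pos + 2
        pos += 1 + length


def has_opt(msg):
    """Report whether a DNS message carries an OPT record in any RR section."""
    _, _, qd, an, ns, ar = struct.unpack("!HHHHHH", msg[:12])
    pos = 12
    for _ in range(qd):
        pos = skip_name(msg, pos) + 4  # skip QTYPE and QCLASS
    for _ in range(an + ns + ar):
        pos = skip_name(msg, pos)
        rtype, _, _, rdlen = struct.unpack("!HHIH", msg[pos:pos + 10])
        if rtype == TYPE_OPT:
            return True
        pos += 10 + rdlen
    return False


def ask(server, name, timeout=3.0, **kwargs):
    """Send one UDP query to port 53 and return the raw response bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, **kwargs), (server, 53))
        return sock.recvfrom(4096)[0]
```

For example, has_opt(ask("208.109.255.50", "answers.ssh.com")) would report whether that server includes an OPT record in its reply to an EDNS query, and passing edns=False gives the plain-DNS case for comparison.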