Re: Getting a formerr 'invalid response' for winqual.microsoft.com. but dig +trace works.
It seems like multiple things are wrong, but I'm still trying to understand which part of the breakage is causing BIND to throw out the response with the formerr 'invalid response'. Is this broken for everyone using BIND 9.7 or later? I can just forward this zone to HonestDNS, which happily serves up the data, and lodge a complaint with Microsoft to fix their servers, but I want to make sure there isn't something wrong somewhere in my network that is causing this problem.

thanks,
--Matt

On Wed, Feb 8, 2012 at 8:05 PM, David Miller dmil...@tiggee.com wrote:

On 2/8/2012 10:32 PM, Matt Doughty wrote:

I have spent the afternoon trying to figure this out. The response I get back from their nameserver looks fine to me, and dig +trace works fine, but a regular dig returns a servfail. I have looked at the code for invalid response, but I don't quite follow what is going on there, and the comment 'responder is insane' leaves something to be desired. Any help would be appreciated here. I have included the dig +trace output below:

dig +trace winqual.partners.extranet.microsoft.com.

; <<>> DiG 9.7.0-P1 <<>> +trace winqual.partners.extranet.microsoft.com.
;; global options: +cmd
. 518004 IN NS j.root-servers.net.
. 518004 IN NS e.root-servers.net.
. 518004 IN NS l.root-servers.net.
. 518004 IN NS c.root-servers.net.
. 518004 IN NS m.root-servers.net.
. 518004 IN NS d.root-servers.net.
. 518004 IN NS b.root-servers.net.
. 518004 IN NS h.root-servers.net.
. 518004 IN NS k.root-servers.net.
. 518004 IN NS a.root-servers.net.
. 518004 IN NS g.root-servers.net.
. 518004 IN NS i.root-servers.net.
. 518004 IN NS f.root-servers.net.
;; Received 228 bytes from 172.16.255.1#53(172.16.255.1) in 1 ms

com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
;; Received 497 bytes from 192.33.4.12#53(c.root-servers.net) in 18 ms

microsoft.com. 172800 IN NS ns3.msft.net.
microsoft.com. 172800 IN NS ns1.msft.net.
microsoft.com. 172800 IN NS ns5.msft.net.
microsoft.com. 172800 IN NS ns2.msft.net.
microsoft.com. 172800 IN NS ns4.msft.net.
;; Received 235 bytes from 192.43.172.30#53(i.gtld-servers.net) in 67 ms

partners.extranet.microsoft.com. 3600 IN NS dns10.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns13.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns11.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns12.one.microsoft.com.
;; Received 236 bytes from 64.4.59.173#53(ns2.msft.net) in 3 ms

winqual.partners.extranet.microsoft.com. 10 IN A 131.107.97.31
;; Received 112 bytes from 131.107.125.65#53(dns10.one.microsoft.com) in 23 ms

If I just dig at their servers for NS, I get a truncated response and a retry over TCP that times out. If I signal a bufsize, I get back a 777-byte response with NS records that don't match the parent and an additional section full of private 10/8 addresses:

# dig +norecurse +bufsize=1024 ns partners.extranet.microsoft.com @dns10.one.microsoft.com.

; <<>> DiG 9.8.1 <<>> +norecurse +bufsize=1024 ns partners.extranet.microsoft.com @dns10.one.microsoft.com.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10678
;; flags: qr ra; QUERY: 1, ANSWER: 16, AUTHORITY: 0, ADDITIONAL: 17

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;partners.extranet.microsoft.com. IN NS

;; ANSWER SECTION:
partners.extranet.microsoft.com. 1076 IN NS tk5-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS kaw-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com.
RE: Getting a formerr 'invalid response' for winqual.microsoft.com. but dig +trace works.
It's because a few load balancer vendors don't read freely available specifications but instead appear to reverse engineer the protocol and get it wrong. BIND 9.7.0 fixed a long-standing issue of accepting glue promoted to answer by parent nameservers. Once we did that there was no need to accept non-AA answers from servers that have been listed as being authoritative for the zone. This allowed the resolver to ignore broken authoritative servers. This got relaxed in a later release of BIND 9.7.

It's fairly easy for me to deploy a VM and build a particular version of BIND. Below is your query run on 9.7.0-P1 and 9.7.4-P1. It fails on the former and succeeds on the latter, as suggested by Mark Andrews above. Are you in a position to upgrade BIND?

Jeff.

; <<>> DiG 9.7.0-P1 <<>> @localhost winqual.partners.extranet.microsoft.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 28201
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;winqual.partners.extranet.microsoft.com. IN A

;; Query time: 1744 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Feb 9 19:36:51 2012
;; MSG SIZE rcvd: 57

; <<>> DiG 9.7.4-P1 <<>> @localhost winqual.partners.extranet.microsoft.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47557
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;winqual.partners.extranet.microsoft.com. IN A

;; ANSWER SECTION:
winqual.partners.extranet.microsoft.com. 10 IN A 131.107.97.31

;; AUTHORITY SECTION:
partners.extranet.microsoft.com. 3600 IN NS dns11.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns12.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns10.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns13.one.microsoft.com.

;; Query time: 668 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Feb 9 19:15:58 2012
;; MSG SIZE rcvd: 157

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
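The check described above (a resolver refusing answers that lack the AA bit when they come from a server listed as authoritative) can be illustrated with a small header parser. This is only a sketch of the idea, not BIND's actual code: the names `DnsHeader`, `parse_header`, and `is_non_authoritative_answer` are made up for illustration, and the sample header is hand-built to mirror the "flags: qr ra" line in the dig output against dns10.one.microsoft.com quoted earlier in the thread.

```python
import struct
from dataclasses import dataclass


@dataclass
class DnsHeader:
    """Subset of the 12-byte DNS message header (RFC 1035, section 4.1.1)."""
    msg_id: int
    qr: bool      # message is a response
    aa: bool      # authoritative answer
    tc: bool      # truncated
    rd: bool      # recursion desired
    ra: bool      # recursion available
    rcode: int
    ancount: int  # number of records in the answer section


def parse_header(message: bytes) -> DnsHeader:
    """Decode the fixed header at the start of a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    msg_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", message[:12])
    return DnsHeader(
        msg_id=msg_id,
        qr=bool(flags & 0x8000),
        aa=bool(flags & 0x0400),
        tc=bool(flags & 0x0200),
        rd=bool(flags & 0x0100),
        ra=bool(flags & 0x0080),
        rcode=flags & 0x000F,
        ancount=ancount,
    )


def is_non_authoritative_answer(header: DnsHeader) -> bool:
    """Hypothetical policy: a response that carries answers but no AA bit.

    A strict resolver could treat this as suspect when the responding
    server is supposed to be authoritative for the zone.
    """
    return header.qr and header.ancount > 0 and not header.aa


if __name__ == "__main__":
    # Hand-built header: id 10678, flags "qr ra" (0x8000 | 0x0080),
    # QUERY: 1, ANSWER: 16, AUTHORITY: 0, ADDITIONAL: 17.
    sample = struct.pack("!6H", 10678, 0x8080, 1, 16, 0, 17)
    header = parse_header(sample)
    print(header)
    print("non-authoritative answer:", is_non_authoritative_answer(header))
```

In dig output the same information is visible directly: an authoritative response lists "aa" among the flags, and the responses from the Microsoft servers shown in this thread do not.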
Re: Getting a formerr 'invalid response' for winqual.microsoft.com. but dig +trace works.
I would have to backport right now, and I have a workaround that will work until we bump our fleet to a newer version. I was mostly concerned about whether it was something in our network causing the problem.

Thanks for all the help guys,
--Matt

On Thu, Feb 9, 2012 at 4:42 PM, Spain, Dr. Jeffry A. spa...@countryday.net wrote:

It's because a few load balancer vendors don't read freely available specifications but instead appear to reverse engineer the protocol and get it wrong. BIND 9.7.0 fixed a long-standing issue of accepting glue promoted to answer by parent nameservers. Once we did that there was no need to accept non-AA answers from servers that have been listed as being authoritative for the zone. This allowed the resolver to ignore broken authoritative servers. This got relaxed in a later release of BIND 9.7.

It's fairly easy for me to deploy a VM and build a particular version of BIND. Below is your query run on 9.7.0-P1 and 9.7.4-P1. It fails on the former and succeeds on the latter, as suggested by Mark Andrews above. Are you in a position to upgrade BIND?

Jeff.

; <<>> DiG 9.7.0-P1 <<>> @localhost winqual.partners.extranet.microsoft.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 28201
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;winqual.partners.extranet.microsoft.com. IN A

;; Query time: 1744 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Feb 9 19:36:51 2012
;; MSG SIZE rcvd: 57

; <<>> DiG 9.7.4-P1 <<>> @localhost winqual.partners.extranet.microsoft.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47557
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;winqual.partners.extranet.microsoft.com. IN A

;; ANSWER SECTION:
winqual.partners.extranet.microsoft.com. 10 IN A 131.107.97.31

;; AUTHORITY SECTION:
partners.extranet.microsoft.com. 3600 IN NS dns11.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns12.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns10.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns13.one.microsoft.com.

;; Query time: 668 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Feb 9 19:15:58 2012
;; MSG SIZE rcvd: 157

--
--Matt
Re: Getting a formerr 'invalid response' for winqual.microsoft.com. but dig +trace works.
On 2/8/2012 10:32 PM, Matt Doughty wrote:

I have spent the afternoon trying to figure this out. The response I get back from their nameserver looks fine to me, and dig +trace works fine, but a regular dig returns a servfail. I have looked at the code for invalid response, but I don't quite follow what is going on there, and the comment 'responder is insane' leaves something to be desired. Any help would be appreciated here. I have included the dig +trace output below:

dig +trace winqual.partners.extranet.microsoft.com.

; <<>> DiG 9.7.0-P1 <<>> +trace winqual.partners.extranet.microsoft.com.
;; global options: +cmd
. 518004 IN NS j.root-servers.net.
. 518004 IN NS e.root-servers.net.
. 518004 IN NS l.root-servers.net.
. 518004 IN NS c.root-servers.net.
. 518004 IN NS m.root-servers.net.
. 518004 IN NS d.root-servers.net.
. 518004 IN NS b.root-servers.net.
. 518004 IN NS h.root-servers.net.
. 518004 IN NS k.root-servers.net.
. 518004 IN NS a.root-servers.net.
. 518004 IN NS g.root-servers.net.
. 518004 IN NS i.root-servers.net.
. 518004 IN NS f.root-servers.net.
;; Received 228 bytes from 172.16.255.1#53(172.16.255.1) in 1 ms

com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
;; Received 497 bytes from 192.33.4.12#53(c.root-servers.net) in 18 ms

microsoft.com. 172800 IN NS ns3.msft.net.
microsoft.com. 172800 IN NS ns1.msft.net.
microsoft.com. 172800 IN NS ns5.msft.net.
microsoft.com. 172800 IN NS ns2.msft.net.
microsoft.com. 172800 IN NS ns4.msft.net.
;; Received 235 bytes from 192.43.172.30#53(i.gtld-servers.net) in 67 ms

partners.extranet.microsoft.com. 3600 IN NS dns10.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns13.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns11.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns12.one.microsoft.com.
;; Received 236 bytes from 64.4.59.173#53(ns2.msft.net) in 3 ms

winqual.partners.extranet.microsoft.com. 10 IN A 131.107.97.31
;; Received 112 bytes from 131.107.125.65#53(dns10.one.microsoft.com) in 23 ms

If I just dig at their servers for NS, I get a truncated response and a retry over TCP that times out. If I signal a bufsize, I get back a 777-byte response with NS records that don't match the parent and an additional section full of private 10/8 addresses:

# dig +norecurse +bufsize=1024 ns partners.extranet.microsoft.com @dns10.one.microsoft.com.

; <<>> DiG 9.8.1 <<>> +norecurse +bufsize=1024 ns partners.extranet.microsoft.com @dns10.one.microsoft.com.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10678
;; flags: qr ra; QUERY: 1, ANSWER: 16, AUTHORITY: 0, ADDITIONAL: 17

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;partners.extranet.microsoft.com. IN NS

;; ANSWER SECTION:
partners.extranet.microsoft.com. 1076 IN NS tk5-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS kaw-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS co2-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS co2-ptnr-dc-01.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS tk5-ptnr-dc-01.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS db3-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS db3-ptnr-dc-01.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS tk5-ptnr-dc-03.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS