Re: Getting a formerr 'invalid response' for winqual.microsoft.com. but dig +trace works.

2012-02-09 Thread Matt Doughty
It seems like multiple things are wrong, but I'm still trying to
understand what part of the breakage is causing Bind to throw out the
response with the formerr 'invalid response'.  Is this broken for
everyone using bind 9.7 or later?  I can just forward this zone to
HonestDNS, which happily serves up the data, and lodge a complaint
with Microsoft to fix their servers, but I want to make sure there
isn't something wrong somewhere in my network that is causing this
problem.

thanks,

--Matt

On Wed, Feb 8, 2012 at 8:05 PM, David Miller dmil...@tiggee.com wrote:
 On 2/8/2012 10:32 PM, Matt Doughty wrote:

 I have spent the afternoon trying to figure this out. The response I
 get back from their nameserver looks fine to me, and dig +trace works
 fine, but a regular dig returns a servfail. I have looked at the code
 for invalid response, but I don't quite follow what is going on there,
 and the comment 'responder is insane' leaves something to be desired.
 Any help would be appreciated here. I have included the dig +trace
 output below:

 dig +trace winqual.partners.extranet.microsoft.com.

 ; <<>> DiG 9.7.0-P1 <<>> +trace winqual.partners.extranet.microsoft.com.
 ;; global options: +cmd
 .                       518004  IN      NS      j.root-servers.net.
 .                       518004  IN      NS      e.root-servers.net.
 .                       518004  IN      NS      l.root-servers.net.
 .                       518004  IN      NS      c.root-servers.net.
 .                       518004  IN      NS      m.root-servers.net.
 .                       518004  IN      NS      d.root-servers.net.
 .                       518004  IN      NS      b.root-servers.net.
 .                       518004  IN      NS      h.root-servers.net.
 .                       518004  IN      NS      k.root-servers.net.
 .                       518004  IN      NS      a.root-servers.net.
 .                       518004  IN      NS      g.root-servers.net.
 .                       518004  IN      NS      i.root-servers.net.
 .                       518004  IN      NS      f.root-servers.net.
 ;; Received 228 bytes from 172.16.255.1#53(172.16.255.1) in 1 ms

 com.                    172800  IN      NS      h.gtld-servers.net.
 com.                    172800  IN      NS      f.gtld-servers.net.
 com.                    172800  IN      NS      m.gtld-servers.net.
 com.                    172800  IN      NS      g.gtld-servers.net.
 com.                    172800  IN      NS      l.gtld-servers.net.
 com.                    172800  IN      NS      c.gtld-servers.net.
 com.                    172800  IN      NS      d.gtld-servers.net.
 com.                    172800  IN      NS      a.gtld-servers.net.
 com.                    172800  IN      NS      b.gtld-servers.net.
 com.                    172800  IN      NS      i.gtld-servers.net.
 com.                    172800  IN      NS      j.gtld-servers.net.
 com.                    172800  IN      NS      e.gtld-servers.net.
 com.                    172800  IN      NS      k.gtld-servers.net.
 ;; Received 497 bytes from 192.33.4.12#53(c.root-servers.net) in 18 ms

 microsoft.com.          172800  IN      NS      ns3.msft.net.
 microsoft.com.          172800  IN      NS      ns1.msft.net.
 microsoft.com.          172800  IN      NS      ns5.msft.net.
 microsoft.com.          172800  IN      NS      ns2.msft.net.
 microsoft.com.          172800  IN      NS      ns4.msft.net.
 ;; Received 235 bytes from 192.43.172.30#53(i.gtld-servers.net) in 67 ms

 partners.extranet.microsoft.com. 3600 IN NS     dns10.one.microsoft.com.
 partners.extranet.microsoft.com. 3600 IN NS     dns13.one.microsoft.com.
 partners.extranet.microsoft.com. 3600 IN NS     dns11.one.microsoft.com.
 partners.extranet.microsoft.com. 3600 IN NS     dns12.one.microsoft.com.
 ;; Received 236 bytes from 64.4.59.173#53(ns2.msft.net) in 3 ms

 winqual.partners.extranet.microsoft.com. 10 IN A 131.107.97.31
 ;; Received 112 bytes from 131.107.125.65#53(dns10.one.microsoft.com) in 23 ms


 If I just dig at their servers for NS, I get a trunc and retry over TCP that
 times out.

 If I signal a bufsize, I get back a 777 byte response with NS that don't
 match the parent and an additional full of private 10/8 addresses

 # dig +norecurse +bufsize=1024 ns partners.extranet.microsoft.com
 @dns10.one.microsoft.com.

 ; <<>> DiG 9.8.1 <<>> +norecurse +bufsize=1024 ns partners.extranet.microsoft.com @dns10.one.microsoft.com.

 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10678
 ;; flags: qr ra; QUERY: 1, ANSWER: 16, AUTHORITY: 0, ADDITIONAL: 17

 ;; OPT PSEUDOSECTION:
 ; EDNS: version: 0, flags:; udp: 4000
 ;; QUESTION SECTION:
 ;partners.extranet.microsoft.com. IN    NS

 ;; ANSWER SECTION:
 partners.extranet.microsoft.com. 1076 IN NS
 tk5-ptnr-dc-02.partners.extranet.microsoft.com.
 partners.extranet.microsoft.com. 1076 IN NS
 kaw-ptnr-dc-02.partners.extranet.microsoft.com.
 partners.extranet.microsoft.com. 

RE: Getting a formerr 'invalid response' for winqual.microsoft.com. but dig +trace works.

2012-02-09 Thread Spain, Dr. Jeffry A.
 It's because a few load balancer vendors don't read freely available 
 specifications but instead appear to reverse engineer the protocol and get it 
 wrong.

 BIND 9.7.0 fixed a long-standing problem of accepting glue promoted to answer
 by parent nameservers.  Once we did that there was no need to accept non-aa
 answers from servers that have been listed as being authoritative for the
 zone.  This allowed the resolver to ignore broken authoritative servers.

 This got relaxed in a later release of BIND 9.7.

It's fairly easy for me to deploy a VM and build a particular version of bind. 
Below is your query run on 9.7.0-P1 and 9.7.4-P1. It fails on the former and 
succeeds on the latter, as suggested by Mark Andrews above. Are you in a 
position to upgrade bind? Jeff.


; <<>> DiG 9.7.0-P1 <<>> @localhost winqual.partners.extranet.microsoft.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 28201
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;winqual.partners.extranet.microsoft.com. IN A

;; Query time: 1744 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Feb  9 19:36:51 2012
;; MSG SIZE  rcvd: 57


; <<>> DiG 9.7.4-P1 <<>> @localhost winqual.partners.extranet.microsoft.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47557
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:
;winqual.partners.extranet.microsoft.com. IN A

;; ANSWER SECTION:
winqual.partners.extranet.microsoft.com. 10 IN A 131.107.97.31

;; AUTHORITY SECTION:
partners.extranet.microsoft.com. 3600 IN NS dns11.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns12.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns10.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns13.one.microsoft.com.

;; Query time: 668 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Feb  9 19:15:58 2012
;; MSG SIZE  rcvd: 157
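
For readers following the aa discussion above: the "flags:" line that dig prints is a decoding of the 16-bit flags field in the DNS header, and the dns10.one.microsoft.com response posted earlier in this thread shows "qr ra" with no "aa". The Python sketch below decodes that field. It is only an illustration of the header bit in question, not BIND's resolver logic; the function name header_flags and the sample header are made up for the example.

```python
import struct

# Bit masks for the 16-bit flags field of a DNS header (RFC 1035, section 4.1.1).
QR = 0x8000  # message is a response
AA = 0x0400  # authoritative answer
TC = 0x0200  # response was truncated
RD = 0x0100  # recursion desired
RA = 0x0080  # recursion available


def header_flags(message: bytes) -> dict:
    """Return the flag bits and rcode from the first 12 bytes of a DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes; message is too short")
    _id, flags, _qd, _an, _ns, _ar = struct.unpack("!6H", message[:12])
    return {
        "qr": bool(flags & QR),
        "aa": bool(flags & AA),
        "tc": bool(flags & TC),
        "rd": bool(flags & RD),
        "ra": bool(flags & RA),
        "rcode": flags & 0x000F,
    }


# Hypothetical header built to mirror the dig summary quoted in this thread:
# id 10678, "flags: qr ra; QUERY: 1, ANSWER: 16, AUTHORITY: 0, ADDITIONAL: 17".
sample = struct.pack("!6H", 10678, QR | RA, 1, 16, 0, 17)
flags = header_flags(sample)
print(flags)
```

A response like this one, with aa clear even though the server is listed in the delegation, is the kind of answer the quoted explanation says 9.7.0 stopped accepting.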

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: Getting a formerr 'invalid response' for winqual.microsoft.com. but dig +trace works.

2012-02-09 Thread Matt Doughty
I would have to backport right now, and I have a workaround that
will work until we bump our fleet to a newer version. I was mostly
concerned about whether it was something in our network causing the
problem.

Thanks for all the help guys,

--Matt

On Thu, Feb 9, 2012 at 4:42 PM, Spain, Dr. Jeffry A.
spa...@countryday.net wrote:
 It's because a few load balancer vendors don't read freely available 
 specifications but instead appear to reverse engineer the protocol and get 
 it wrong.

 BIND 9.7.0 fixed a long-standing problem of accepting glue promoted to answer
 by parent nameservers.  Once we did that there was no need to accept non-aa
 answers from servers that have been listed as being authoritative for the
 zone.  This allowed the resolver to ignore broken authoritative servers.

 This got relaxed in a later release of BIND 9.7.

 It's fairly easy for me to deploy a VM and build a particular version of 
 bind. Below is your query run on 9.7.0-P1 and 9.7.4-P1. It fails on the 
 former and succeeds on the latter, as suggested by Mark Andrews above. Are 
 you in a position to upgrade bind? Jeff.


 ; <<>> DiG 9.7.0-P1 <<>> @localhost winqual.partners.extranet.microsoft.com.
 ; (1 server found)
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 28201
 ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

 ;; QUESTION SECTION:
 ;winqual.partners.extranet.microsoft.com. IN A

 ;; Query time: 1744 msec
 ;; SERVER: 127.0.0.1#53(127.0.0.1)
 ;; WHEN: Thu Feb  9 19:36:51 2012
 ;; MSG SIZE  rcvd: 57


 ; <<>> DiG 9.7.4-P1 <<>> @localhost winqual.partners.extranet.microsoft.com.
 ; (1 server found)
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47557
 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

 ;; QUESTION SECTION:
 ;winqual.partners.extranet.microsoft.com. IN A

 ;; ANSWER SECTION:
 winqual.partners.extranet.microsoft.com. 10 IN A 131.107.97.31

 ;; AUTHORITY SECTION:
 partners.extranet.microsoft.com. 3600 IN NS     dns11.one.microsoft.com.
 partners.extranet.microsoft.com. 3600 IN NS     dns12.one.microsoft.com.
 partners.extranet.microsoft.com. 3600 IN NS     dns10.one.microsoft.com.
 partners.extranet.microsoft.com. 3600 IN NS     dns13.one.microsoft.com.

 ;; Query time: 668 msec
 ;; SERVER: 127.0.0.1#53(127.0.0.1)
 ;; WHEN: Thu Feb  9 19:15:58 2012
 ;; MSG SIZE  rcvd: 157




-- 
--Matt

Re: Getting a formerr 'invalid response' for winqual.microsoft.com. but dig +trace works.

2012-02-08 Thread David Miller

On 2/8/2012 10:32 PM, Matt Doughty wrote:

I have spent the afternoon trying to figure this out. The response I
get back from their nameserver looks fine to me, and dig +trace works
fine, but a regular dig returns a servfail. I have looked at the code
for invalid response, but I don't quite follow what is going on there,
and the comment 'responder is insane' leaves something to be desired.
Any help would be appreciated here. I have included the dig +trace
output below:

dig +trace winqual.partners.extranet.microsoft.com.

; <<>> DiG 9.7.0-P1 <<>> +trace winqual.partners.extranet.microsoft.com.
;; global options: +cmd
.   518004  IN  NS  j.root-servers.net.
.   518004  IN  NS  e.root-servers.net.
.   518004  IN  NS  l.root-servers.net.
.   518004  IN  NS  c.root-servers.net.
.   518004  IN  NS  m.root-servers.net.
.   518004  IN  NS  d.root-servers.net.
.   518004  IN  NS  b.root-servers.net.
.   518004  IN  NS  h.root-servers.net.
.   518004  IN  NS  k.root-servers.net.
.   518004  IN  NS  a.root-servers.net.
.   518004  IN  NS  g.root-servers.net.
.   518004  IN  NS  i.root-servers.net.
.   518004  IN  NS  f.root-servers.net.
;; Received 228 bytes from 172.16.255.1#53(172.16.255.1) in 1 ms

com.    172800  IN  NS  h.gtld-servers.net.
com.    172800  IN  NS  f.gtld-servers.net.
com.    172800  IN  NS  m.gtld-servers.net.
com.    172800  IN  NS  g.gtld-servers.net.
com.    172800  IN  NS  l.gtld-servers.net.
com.    172800  IN  NS  c.gtld-servers.net.
com.    172800  IN  NS  d.gtld-servers.net.
com.    172800  IN  NS  a.gtld-servers.net.
com.    172800  IN  NS  b.gtld-servers.net.
com.    172800  IN  NS  i.gtld-servers.net.
com.    172800  IN  NS  j.gtld-servers.net.
com.    172800  IN  NS  e.gtld-servers.net.
com.    172800  IN  NS  k.gtld-servers.net.
;; Received 497 bytes from 192.33.4.12#53(c.root-servers.net) in 18 ms

microsoft.com.  172800  IN  NS  ns3.msft.net.
microsoft.com.  172800  IN  NS  ns1.msft.net.
microsoft.com.  172800  IN  NS  ns5.msft.net.
microsoft.com.  172800  IN  NS  ns2.msft.net.
microsoft.com.  172800  IN  NS  ns4.msft.net.
;; Received 235 bytes from 192.43.172.30#53(i.gtld-servers.net) in 67 ms

partners.extranet.microsoft.com. 3600 IN NS dns10.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns13.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns11.one.microsoft.com.
partners.extranet.microsoft.com. 3600 IN NS dns12.one.microsoft.com.
;; Received 236 bytes from 64.4.59.173#53(ns2.msft.net) in 3 ms

winqual.partners.extranet.microsoft.com. 10 IN A 131.107.97.31
;; Received 112 bytes from 131.107.125.65#53(dns10.one.microsoft.com) in 23 ms



If I just dig at their servers for NS, I get a trunc and retry over TCP 
that times out.


If I signal a bufsize, I get back a 777 byte response with NS that don't 
match the parent and an additional full of private 10/8 addresses


# dig +norecurse +bufsize=1024 ns partners.extranet.microsoft.com 
@dns10.one.microsoft.com.
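
For context, +bufsize=1024 makes dig attach an EDNS0 OPT record advertising a 1024-byte UDP payload size, which is why the larger answer can come back over UDP instead of being truncated. The Python sketch below builds such a query by hand to show what that looks like on the wire. It is an illustration under stated assumptions, not how dig is implemented; the helper names encode_name and build_query are invented for the example, and nothing is sent over the network.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in (part for part in name.rstrip(".").split(".") if part):
        raw = label.encode("ascii")
        out += struct.pack("!B", len(raw)) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int, bufsize: int, query_id: int = 0) -> bytes:
    """Build a non-recursive query carrying one EDNS0 OPT record."""
    # Header: id, flags (all zero, so RD is clear as with +norecurse),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (the OPT record).
    header = struct.pack("!6H", query_id, 0x0000, 1, 0, 0, 1)
    # Question: name, QTYPE, QCLASS=1 (IN).
    question = encode_name(name) + struct.pack("!2H", qtype, 1)
    # OPT pseudo-record: root name, TYPE=41, CLASS reused as the UDP payload
    # size, TTL reused for extended rcode/version/flags (zero here), RDLENGTH=0.
    opt = b"\x00" + struct.pack("!HHIH", 41, bufsize, 0, 0)
    return header + question + opt


NS = 2  # QTYPE value for NS records
query = build_query("partners.extranet.microsoft.com", NS, 1024)
print(len(query), query.hex())
```

The resulting bytes could be sent with a UDP socket to port 53 of the server being tested; that step is left out here so the sketch stays self-contained.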


; <<>> DiG 9.8.1 <<>> +norecurse +bufsize=1024 ns partners.extranet.microsoft.com @dns10.one.microsoft.com.

;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10678
;; flags: qr ra; QUERY: 1, ANSWER: 16, AUTHORITY: 0, ADDITIONAL: 17

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;partners.extranet.microsoft.com. IN    NS

;; ANSWER SECTION:
partners.extranet.microsoft.com. 1076 IN NS 
tk5-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS 
kaw-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS 
co2-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS 
co2-ptnr-dc-01.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS 
tk5-ptnr-dc-01.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS 
db3-ptnr-dc-02.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS 
db3-ptnr-dc-01.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS 
tk5-ptnr-dc-03.partners.extranet.microsoft.com.
partners.extranet.microsoft.com. 1076 IN NS