Re: Spotty Lookups on One of Our Networks

2012-10-31 Thread Martin McCormick
The system hung long enough to have timed out on every
possible DNS server that it could have tried, so it should have
gotten through to at least one.

Barry Margolin writes:
> Did the problem coincide with Hurricane Sandy? That would explain
> inability to reach many east coast servers. Resolvers should work around
> this by failing over to other servers (assuming the organization has
> them geographically distributed, as NOAA.GOV does), but dig +trace
> doesn't.

Thank you very much for your suggestions. 
We are more or less in a waiting mode right now while the
network staff on our remote campus check some settings on their
firewall. We now know this is almost certainly not a BIND issue,
as we have discovered many remote networks that seem to have no
TCP/IP connectivity from the remote campus but are perfectly
reachable from here.

We started receiving complaints about a week ago so the
hurricane is not to blame.

I will let the group know what happened as soon as we
find out ourselves.

Martin McCormick
___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: Spotty Lookups on One of Our Networks

2012-10-31 Thread John Miller
Martin, what do you see if you do a packet capture on the host where you're
running dig?  How 'bout at the border of your network?  Obviously traffic's
not making it through, but where?  Any sort of split routing paths that
might be involved?

John

On Wed, Oct 31, 2012 at 8:54 AM, Martin McCormick  wrote:

> I described a case where one of our remote campuses can't
> resolve a number of remote domains. One example is noaa.gov. It
> also successfully resolves random remote domains without
> seemingly any rhyme or reason.
>
> Here is a bad dig trace for noaa.gov
>
>
> ; <<>> DiG 9.7.7 <<>> @localhost +trace noaa.gov
> ; (2 servers found)
> ;; global options: +cmd
> .   453464  IN  NS  b.root-servers.net.
> .   453464  IN  NS  l.root-servers.net.
> .   453464  IN  NS  a.root-servers.net.
> .   453464  IN  NS  i.root-servers.net.
> .   453464  IN  NS  j.root-servers.net.
> .   453464  IN  NS  f.root-servers.net.
> .   453464  IN  NS  g.root-servers.net.
> .   453464  IN  NS  e.root-servers.net.
> .   453464  IN  NS  h.root-servers.net.
> .   453464  IN  NS  d.root-servers.net.
> .   453464  IN  NS  c.root-servers.net.
> .   453464  IN  NS  k.root-servers.net.
> .   453464  IN  NS  m.root-servers.net.
> ;; Received 512 bytes from 127.0.0.1#53(127.0.0.1) in 320 ms
>
> gov.   172800  IN  NS  b.gov-servers.net.
> gov.   172800  IN  NS  a.gov-servers.net.
> ;; Received 133 bytes from 192.58.128.30#53(192.58.128.30) in 210 ms
>
> noaa.gov.   86400   IN  NS  ns-e.noaa.gov.
> noaa.gov.   86400   IN  NS  ns-mw.noaa.gov.
> noaa.gov.   86400   IN  NS  ns-nw.noaa.gov.
>
> This trace took several minutes since no successful
> resolution was made.
>
> Here is a good trace using our DNS.
>
>
> ; <<>> DiG 9.8.1-P1 <<>> +trace @localhost noaa.gov
> ; (2 servers found)
> ;; global options: +cmd
> .   369104  IN  NS  d.root-servers.net.
> .   369104  IN  NS  j.root-servers.net.
> .   369104  IN  NS  b.root-servers.net.
> .   369104  IN  NS  g.root-servers.net.
> .   369104  IN  NS  i.root-servers.net.
> .   369104  IN  NS  e.root-servers.net.
> .   369104  IN  NS  l.root-servers.net.
> .   369104  IN  NS  m.root-servers.net.
> .   369104  IN  NS  h.root-servers.net.
> .   369104  IN  NS  f.root-servers.net.
> .   369104  IN  NS  c.root-servers.net.
> .   369104  IN  NS  a.root-servers.net.
> .   369104  IN  NS  k.root-servers.net.
> ;; Received 512 bytes from 127.0.0.1#53(127.0.0.1) in 497 ms
>
> gov.   172800  IN  NS  a.gov-servers.net.
> gov.   172800  IN  NS  b.gov-servers.net.
> ;; Received 133 bytes from 192.112.36.4#53(192.112.36.4) in 439 ms
>
> noaa.gov.   86400   IN  NS  ns-e.noaa.gov.
> noaa.gov.   86400   IN  NS  ns-mw.noaa.gov.
> noaa.gov.   86400   IN  NS  ns-nw.noaa.gov.
> ;; Received 133 bytes from 69.36.157.30#53(69.36.157.30) in 224 ms
>
> noaa.gov.   86400   IN  A   140.90.200.21
> noaa.gov.   86400   IN  A   140.172.17.21
> noaa.gov.   86400   IN  A   129.15.96.21
> noaa.gov.   86400   IN  NS  ns-e.noaa.gov.
> noaa.gov.   86400   IN  NS  ns-mw.noaa.gov.
> noaa.gov.   86400   IN  NS  ns-nw.noaa.gov.
> ;; Received 181 bytes from 140.90.33.237#53(140.90.33.237) in 37 ms
>
> Barry Margolin writes:
> > I'm not sure what you mean by that sentence about getting authoritative
> > DNSs from X when it should be from Y. Can you post the actual dig?
> >
> > BTW, @servername doesn't mean much when using +trace, since +trace
> > queries the servers listed in NS records, not a resolver.



-- 
John Miller
Systems Engineer
Brandeis University
johnm...@brandeis.edu
(781) 736-4619

Re: Spotty Lookups on One of Our Networks

2012-10-31 Thread Barry Margolin
Carsten Strotmann wrote:

> Hello Martin,
> 
> Martin McCormick  writes:
> 
> > I described a case where one of our remote campuses can't
> > resolve a number of remote domains. One example is noaa.gov. It
> > also successfully resolves random remote domains without
> > seemingly any rhyme or reason.
> >
> > Here is a bad dig trace for noaa.gov
> >
> [...]
> 
>  shows that
> nameserver ns-e.noaa.gov is not responding
> 
> The dig +trace might "hang" if that authoritative DNS server is selected
> for the query. 
> 
> "ns-mw.noaa.gov" and "ns-nw.noaa.gov" operate fine. "ns-e" could mean
> "east coast".

Did the problem coincide with Hurricane Sandy? That would explain 
inability to reach many east coast servers. Resolvers should work around 
this by failing over to other servers (assuming the organization has 
them geographically distributed, as NOAA.GOV does), but dig +trace 
doesn't.
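
To make the failover idea concrete, here is a minimal Python sketch of
the behaviour described above: try each listed nameserver in turn and
move on when one times out. The function name, the pluggable query
callable and the timeout value are assumptions for illustration only;
real resolvers use more elaborate server selection than this.

```python
def resolve_with_failover(servers, query, timeout=2.0):
    """Return (server, answer) from the first server that answers.

    `query` is any callable taking (server, timeout) that returns an
    answer or raises OSError (which includes socket timeouts).
    """
    failed = []
    for server in servers:
        try:
            return server, query(server, timeout)
        except OSError:
            # This server did not answer; fall through to the next one.
            failed.append(server)
    raise TimeoutError("no server answered: " + ", ".join(failed))
```

With ns-e.noaa.gov unreachable, a loop like this would still get an
answer from ns-mw.noaa.gov or ns-nw.noaa.gov, whereas a tool that
picks one server and waits on it can appear to hang.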

-- 
Barry Margolin
Arlington, MA


Re: Spotty Lookups on One of Our Networks

2012-10-31 Thread Carsten Strotmann

Hello Martin,

Martin McCormick  writes:

> I described a case where one of our remote campuses can't
> resolve a number of remote domains. One example is noaa.gov. It
> also successfully resolves random remote domains without
> seemingly any rhyme or reason.
>
>   Here is a bad dig trace for noaa.gov
>
[...]

 shows that
nameserver ns-e.noaa.gov is not responding

The dig +trace might "hang" if that authoritative DNS server is selected
for the query. 

"ns-mw.noaa.gov" and "ns-nw.noaa.gov" operate fine. "ns-e" could mean
"east coast".

-- Carsten


Re: Spotty Lookups on One of Our Networks

2012-10-31 Thread Martin McCormick
I described a case where one of our remote campuses can't
resolve a number of remote domains. One example is noaa.gov. It
also successfully resolves random remote domains without
seemingly any rhyme or reason.

Here is a bad dig trace for noaa.gov


; <<>> DiG 9.7.7 <<>> @localhost +trace noaa.gov
; (2 servers found)
;; global options: +cmd
.   453464  IN  NS  b.root-servers.net.
.   453464  IN  NS  l.root-servers.net.
.   453464  IN  NS  a.root-servers.net.
.   453464  IN  NS  i.root-servers.net.
.   453464  IN  NS  j.root-servers.net.
.   453464  IN  NS  f.root-servers.net.
.   453464  IN  NS  g.root-servers.net.
.   453464  IN  NS  e.root-servers.net.
.   453464  IN  NS  h.root-servers.net.
.   453464  IN  NS  d.root-servers.net.
.   453464  IN  NS  c.root-servers.net.
.   453464  IN  NS  k.root-servers.net.
.   453464  IN  NS  m.root-servers.net.
;; Received 512 bytes from 127.0.0.1#53(127.0.0.1) in 320 ms

gov.   172800  IN  NS  b.gov-servers.net.
gov.   172800  IN  NS  a.gov-servers.net.
;; Received 133 bytes from 192.58.128.30#53(192.58.128.30) in 210 ms

noaa.gov.   86400   IN  NS  ns-e.noaa.gov.
noaa.gov.   86400   IN  NS  ns-mw.noaa.gov.
noaa.gov.   86400   IN  NS  ns-nw.noaa.gov.

This trace took several minutes since no successful
resolution was made.

Here is a good trace using our DNS.


; <<>> DiG 9.8.1-P1 <<>> +trace @localhost noaa.gov
; (2 servers found)
;; global options: +cmd
.   369104  IN  NS  d.root-servers.net.
.   369104  IN  NS  j.root-servers.net.
.   369104  IN  NS  b.root-servers.net.
.   369104  IN  NS  g.root-servers.net.
.   369104  IN  NS  i.root-servers.net.
.   369104  IN  NS  e.root-servers.net.
.   369104  IN  NS  l.root-servers.net.
.   369104  IN  NS  m.root-servers.net.
.   369104  IN  NS  h.root-servers.net.
.   369104  IN  NS  f.root-servers.net.
.   369104  IN  NS  c.root-servers.net.
.   369104  IN  NS  a.root-servers.net.
.   369104  IN  NS  k.root-servers.net.
;; Received 512 bytes from 127.0.0.1#53(127.0.0.1) in 497 ms

gov.   172800  IN  NS  a.gov-servers.net.
gov.   172800  IN  NS  b.gov-servers.net.
;; Received 133 bytes from 192.112.36.4#53(192.112.36.4) in 439 ms

noaa.gov.   86400   IN  NS  ns-e.noaa.gov.
noaa.gov.   86400   IN  NS  ns-mw.noaa.gov.
noaa.gov.   86400   IN  NS  ns-nw.noaa.gov.
;; Received 133 bytes from 69.36.157.30#53(69.36.157.30) in 224 ms

noaa.gov.   86400   IN  A   140.90.200.21
noaa.gov.   86400   IN  A   140.172.17.21
noaa.gov.   86400   IN  A   129.15.96.21
noaa.gov.   86400   IN  NS  ns-e.noaa.gov.
noaa.gov.   86400   IN  NS  ns-mw.noaa.gov.
noaa.gov.   86400   IN  NS  ns-nw.noaa.gov.
;; Received 181 bytes from 140.90.33.237#53(140.90.33.237) in 37 ms

Barry Margolin writes:
> I'm not sure what you mean by that sentence about getting authoritative
> DNSs from X when it should be from Y. Can you post the actual dig?
> 
> BTW, @servername doesn't mean much when using +trace, since +trace
> queries the servers listed in NS records, not a resolver.


Re: Spotty Lookups on One of Our Networks

2012-10-30 Thread Mark Andrews

In message <201210302010.q9ukaytl064...@x.it.okstate.edu>, Martin McCormick writes:
>   I don't think this is a bind problem because this
> particular network has some Microsoft DNS's that are doing
> exactly the same thing.
> 
>   There are several domain names that are broken in this
> network and the symptom is always the same:
> 
>   dig +trace @localhost one.bad.domain.com.
>
>   We see all the root name servers listed. We get the TLD
> servers next and, from one of them, we get authoritative DNS's
> from bad.domain.com where we should get the IP address from
> one.bad.domain.com. That is where it breaks. It times out with
> no authoritative servers that will talk to us. One site is
> noaa.gov which is the National Oceanic and Atmospheric
> Administration in the United States.

Newer versions of dig turn on +dnssec with +trace (you can do +trace
+nodnssec +noedns to get the old behaviour back) as that better
reflects what the nameserver does.  The nameserver will retry with
a lower EDNS UDP buffer size, dig won't.

They are most probably dropping IP fragments at the firewall.  Fixing
the 512 byte limit (below) is only the first step.
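
As a rough illustration of the difference described above, the Python
sketch below builds a DNS query with or without an EDNS0 OPT record
and steps down through smaller advertised UDP sizes before falling
back to plain DNS. The size ladder (4096, 1232, 512, then no EDNS),
the function names and the timeout are assumptions made for the
example; this is not BIND's actual retry logic.

```python
import socket
import struct


def build_query(name, qtype=1, edns_size=None, dnssec_ok=False, query_id=0x1234):
    """Build a DNS query; append an EDNS0 OPT record when edns_size is set."""
    arcount = 1 if edns_size is not None else 0
    # Header: ID, flags (all zero, i.e. a non-recursive query), QDCOUNT=1,
    # ANCOUNT=0, NSCOUNT=0, ARCOUNT.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, arcount)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
        if label
    ) + b"\x00"
    packet = header + qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    if edns_size is not None:
        # The DO ("DNSSEC OK") bit sits in the flags half of the OPT TTL field.
        ttl = 0x00008000 if dnssec_ok else 0
        # OPT RR: root name, TYPE=41, CLASS=advertised UDP size, TTL, RDLEN=0.
        packet += b"\x00" + struct.pack("!HHIH", 41, edns_size, ttl, 0)
    return packet


def query_with_fallback(server, name, sizes=(4096, 1232, 512, None),
                        port=53, timeout=2.0):
    """Try each EDNS size in turn; None means a plain query without EDNS."""
    for size in sizes:
        packet = build_query(name, edns_size=size, dnssec_ok=size is not None)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendto(packet, (server, port))
                reply, _ = sock.recvfrom(65535)
                return size, reply
            except OSError:
                # Timed out or unreachable at this size; try a smaller one.
                continue
    return None, None
```

If large replies only get through once the advertised size drops to
512 or EDNS is switched off, that would be consistent with fragments
being dropped somewhere along the path.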

>   We have no trouble at all resolving them from our
> network so I filled in the missing information for the
> authoritative domain name servers and hard-coded one or two of
> them in lookups on the problem network and, no surprise, the
> lookup still times out.
> 
>   One really good lead evaporated when I discovered that
> this network still had a 512-byte limit on its firewall so we
> thought this might be the problem but no such luck. The firewall
> now passes edns packets just fine, but nothing has really
> changed.
> 
>   Any ideas as to what prevents some lookups from
> resolving. Others do resolve.
> 
>   We have been kicking this problem around for about a
> week and the customers, there, are getting a bit restless. They
> are connected to the same ISP we are and we are not having any
> problems like this. 
> 
>   There seems to be no reason  why some remote domains
> work and others don't. I am asking on this list in hopes that
> somebody has seen something like this somewhere else and found
> the cause.
> 
> Thank you.
> 
> Martin McCormick WB5AGZ  Stillwater, OK 
> Systems Engineer
> OSU Information Technology Department Telecommunications Services Group
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org


Re: Spotty Lookups on One of Our Networks

2012-10-30 Thread Martin McCormick
John Miller writes:
> Just to clarify, how many domain names are doing this for you? Are they
> all remote domains, or are some of them okstate.edu domains?

They are all remote as far as I can tell. 

I will have some answers for Barry Margolin's questions a bit
later. It seems that the tier of DNS servers closest to the failed
lookup is the one that cannot be reached.

My own theory is that it is specific to port 53.

Thanks to both.


Re: Spotty Lookups on One of Our Networks

2012-10-30 Thread John Miller

Hi Martin,

Just to clarify, how many domain names are doing this for you?  Are they 
all remote domains, or are some of them okstate.edu domains?


John
--
John Miller
Systems Engineer
Brandeis University
johnm...@brandeis.edu

On 10/30/2012 04:10 PM, Martin McCormick wrote:

I don't think this is a bind problem because this
particular network has some Microsoft DNS's that are doing
exactly the same thing.

There are several domain names that are broken in this
network and the symptom is always the same:

dig +trace @localhost one.bad.domain.com.

We see all the root name servers listed. We get the TLD
servers next and, from one of them, we get authoritative DNS's
from bad.domain.com where we should get the IP address from
one.bad.domain.com. That is where it breaks. It times out with
no authoritative servers that will talk to us. One site is
noaa.gov which is the National Oceanic and Atmospheric
Administration in the United States.

We have no trouble at all resolving them from our
network so I filled in the missing information for the
authoritative domain name servers and hard-coded one or two of
them in lookups on the problem network and, no surprise, the
lookup still times out.

One really good lead evaporated when I discovered that
this network still had a 512-byte limit on its firewall so we
thought this might be the problem but no such luck. The firewall
now passes edns packets just fine, but nothing has really
changed.

Any ideas as to what prevents some lookups from
resolving. Others do resolve.

We have been kicking this problem around for about a
week and the customers, there, are getting a bit restless. They
are connected to the same ISP we are and we are not having any
problems like this.

There seems to be no reason  why some remote domains
work and others don't. I am asking on this list in hopes that
somebody has seen something like this somewhere else and found
the cause.

Thank you.

Martin McCormick WB5AGZ  Stillwater, OK
Systems Engineer
OSU Information Technology Department Telecommunications Services Group


Re: Spotty Lookups on One of Our Networks

2012-10-30 Thread Barry Margolin
Martin McCormick wrote:

>   I don't think this is a bind problem because this
> particular network has some Microsoft DNS's that are doing
> exactly the same thing.
> 
>   There are several domain names that are broken in this
> network and the symptom is always the same:
> 
>   dig +trace @localhost one.bad.domain.com.
> 
>   We see all the root name servers listed. We get the TLD
> servers next and, from one of them, we get authoritative DNS's
> from bad.domain.com where we should get the IP address from
> one.bad.domain.com. That is where it breaks. It times out with

I'm not sure what you mean by that sentence about getting authoritative 
DNSs from X when it should be from Y. Can you post the actual dig?

BTW, @servername doesn't mean much when using +trace, since +trace 
queries the servers listed in NS records, not a resolver.

> no authoritative servers that will talk to us. One site is
> noaa.gov which is the National Oceanic and Atmospheric
> Administration in the United States.
> 
>   We have no trouble at all resolving them from our
> network so I filled in the missing information for the
> authoritative domain name servers and hard-coded one or two of
> them in lookups on the problem network and, no surprise, the
> lookup still times out.

What happens if you try to telnet to port 53 on the auth nameservers 
from your local resolvers? What about traceroute?
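
If telnet is not available on the resolver host, the same TCP check
can be sketched in a few lines of Python. The function name and the
default timeout are assumptions for illustration. A success only
shows that a TCP connection to port 53 can be opened; it says nothing
about UDP or about fragmented replies.

```python
import socket


def tcp_port_open(host, port=53, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False
```

Running tcp_port_open("ns-e.noaa.gov") from both the working network
and the problem network would show whether the two behave differently
for the same authoritative server.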

> 
>   One really good lead evaporated when I discovered that
> this network still had a 512-byte limit on its firewall so we
> thought this might be the problem but no such luck. The firewall
> now passes edns packets just fine, but nothing has really
> changed.
> 
>   Any ideas as to what prevents some lookups from
> resolving. Others do resolve.
> 
>   We have been kicking this problem around for about a
> week and the customers, there, are getting a bit restless. They
> are connected to the same ISP we are and we are not having any
> problems like this. 
> 
>   There seems to be no reason  why some remote domains
> work and others don't. I am asking on this list in hopes that
> somebody has seen something like this somewhere else and found
> the cause.
> 
> Thank you.
> 
> Martin McCormick WB5AGZ  Stillwater, OK 
> Systems Engineer
> OSU Information Technology Department Telecommunications Services Group

-- 
Barry Margolin
Arlington, MA


Spotty Lookups on One of Our Networks

2012-10-30 Thread Martin McCormick
I don't think this is a bind problem because this
particular network has some Microsoft DNS's that are doing
exactly the same thing.

There are several domain names that are broken in this
network and the symptom is always the same:

dig +trace @localhost one.bad.domain.com.

We see all the root name servers listed. We get the TLD
servers next and, from one of them, we get authoritative DNS's
from bad.domain.com where we should get the IP address from
one.bad.domain.com. That is where it breaks. It times out with
no authoritative servers that will talk to us. One site is
noaa.gov which is the National Oceanic and Atmospheric
Administration in the United States.

We have no trouble at all resolving them from our
network so I filled in the missing information for the
authoritative domain name servers and hard-coded one or two of
them in lookups on the problem network and, no surprise, the
lookup still times out.

One really good lead evaporated when I discovered that
this network still had a 512-byte limit on its firewall so we
thought this might be the problem but no such luck. The firewall
now passes edns packets just fine, but nothing has really
changed.

Any ideas as to what prevents some lookups from
resolving. Others do resolve.

We have been kicking this problem around for about a
week and the customers, there, are getting a bit restless. They
are connected to the same ISP we are and we are not having any
problems like this. 

There seems to be no reason  why some remote domains
work and others don't. I am asking on this list in hopes that
somebody has seen something like this somewhere else and found
the cause.

Thank you.

Martin McCormick WB5AGZ  Stillwater, OK 
Systems Engineer
OSU Information Technology Department Telecommunications Services Group