Re: Name-server redundancy

2014-06-10 Thread Kevin Darcy
You're right: I misinterpreted "no name-server" as "no such host" (aka 
NXDOMAIN), but actually your explanation makes more sense.


- Kevin
On 6/9/2014 6:07 PM, Barry Margolin wrote:

In article mailman.401.1402350461.26362.bind-us...@lists.isc.org,
  Kevin Darcy k...@chrysler.com wrote:


That scenario still shouldn't have led to an NXDOMAIN. If none of the
delegated nameservers are responding, you'd get a timeout or SERVFAIL.
So I think there's still some investigation to be done. But using dig
instead of nslookup at least makes things clearer :-)

Where did he say he got NXDOMAIN? He said he got "no name-server", which
is not a dig error I've ever seen. Maybe he was abbreviating from "No
nameserver could be reached", which is what dig says when it times out
waiting for a reply.



___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: Name-server redundancy

2014-06-10 Thread Blake Hudson
If you want to be sure that failover works well, you must, at some point, 
test it. Even better, test it regularly (check out Netflix's Chaos 
Monkey).


One way to run a simulation would be to use a firewall rule or static 
route to block access between your test client/recursive server and one 
or more of the authoritative DNS servers. However, this is no substitute 
for an actual test to determine how different client applications will 
behave.
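
The firewall-rule simulation above could be scripted along these lines. This is only a sketch: it assumes a Linux test client or recursive server that uses iptables, it builds the command lines without running them, the addresses are documentation placeholders, and block_rules/unblock_rules are hypothetical helper names.

```python
from typing import List

DNS_PORT = 53


def _rules(action: str, server_ips: List[str]) -> List[List[str]]:
    """Build iptables argv lists that add (-A) or delete (-D) DROP rules
    for outbound DNS traffic to each listed authoritative server."""
    rules = []
    for ip in server_ips:
        for proto in ("udp", "tcp"):  # DNS can use either transport
            rules.append([
                "iptables", action, "OUTPUT",
                "-d", ip, "-p", proto, "--dport", str(DNS_PORT),
                "-j", "DROP",
            ])
    return rules


def block_rules(server_ips: List[str]) -> List[List[str]]:
    """Commands that make the listed servers look unreachable."""
    return _rules("-A", server_ips)


def unblock_rules(server_ips: List[str]) -> List[List[str]]:
    """Commands that remove the rules added by block_rules()."""
    return _rules("-D", server_ips)


if __name__ == "__main__":
    # Placeholder addresses standing in for one region's two servers.
    for argv in block_rules(["192.0.2.1", "192.0.2.2"]):
        print(" ".join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; on a real test host they would need root, and the matching unblock_rules() output should be applied once the test is done.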


--Blake


Sid Shapiro wrote the following on 6/9/2014 4:56 PM:
Again - thanks for the quick response - that'll teach me to post 
without all the facts. I simply don't remember what the specific error 
was, darn it. It might have been NXDOMAIN or SERVFAIL - I didn't write 
it down.
The test I was running was on a barely (if ever) used domain, so I was 
pretty sure it wasn't cached anywhere.


I'm trying to figure out ways to test this without taking name servers 
offline :-)


--
Sid Shapiro sid_shap...@bio-rad.com
Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343




Name-server redundancy

2014-06-09 Thread Sid Shapiro
Hello,
I've got 6 name-servers, 2 in each of 3 global regions. Each name-server
has a net connection. Each name-server is authoritative, and the domains
they serve have all six NS records.

My question has to do with redundancy. If one of my regions goes down, I
would have expected that a query against a domain would reach one of the
other regions' name-servers. However, during a maintenance window when one
region was off the air, I did some simple queries. I did not have a lot of
time to do detailed testing and tracing. I was simply trying to see if I
could get a query resolved.

What I got was a "no name-server" error. I do not have the exact message,
nor the timings. I could see (somehow) that there might be some time-out
issue on the client, but the "no name-servers" response came pretty quickly.

This doesn't seem like a configuration problem, although I suppose it might
be. It seems more like a misunderstanding of how redundancy works at the
domain level.

Have I totally misunderstood a concept here?
Thanks
--
Sid Shapiro
sid_shap...@bio-rad.com
Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343

Re: Name-server redundancy

2014-06-09 Thread Sid Shapiro
Thanks, Kevin, for your quick reply. In the last few minutes, I've come to
realize that my problem is likely that the domain is only registered with
two name servers - the ones that were offline. Even though the zone has 6
NS records, the .com servers probably only know of the ones in the
registration. So registration and DNS are not in sync. Silly mistake.
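
A mismatch like this can be spotted by comparing the NS names the parent zone hands out with the NS names published in the zone itself. The sketch below assumes the two lists have already been collected (for example, by hand from dig output); compare_ns_sets and the ns1 to ns6 host names are hypothetical, used only for illustration.

```python
from typing import Dict, Iterable, List, Union


def normalize(name: str) -> str:
    """Lower-case a host name and drop any trailing dot so that
    'NS1.Example.com.' and 'ns1.example.com' compare as equal."""
    return name.strip().lower().rstrip(".")


def compare_ns_sets(
    delegation_ns: Iterable[str], apex_ns: Iterable[str]
) -> Dict[str, Union[List[str], bool]]:
    """Compare the NS set in the registration (parent side) with the
    NS set published at the zone apex (child side)."""
    parent = {normalize(n) for n in delegation_ns}
    child = {normalize(n) for n in apex_ns}
    return {
        "only_in_delegation": sorted(parent - child),
        "only_in_zone": sorted(child - parent),
        "in_sync": parent == child,
    }


if __name__ == "__main__":
    # Hypothetical names: two servers registered, six in the zone.
    registered = ["ns1.example.com.", "ns2.example.com."]
    in_zone = [f"ns{i}.example.com." for i in range(1, 7)]
    print(compare_ns_sets(registered, in_zone))
```

In the situation described above, only_in_zone would list the four servers that the .com servers never refer anyone to.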

(And FWIW, I *was* using dig, not nslookup)

--
Sid Shapiro
sid_shap...@bio-rad.com
Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343


On Mon, Jun 9, 2014 at 2:32 PM, Kevin Darcy k...@chrysler.com wrote:

  Well, you shouldn't be getting an NXDOMAIN just because some of your
 auth servers are off-line, but you could get some query timeouts if
 performance to your failover servers is really bad (or blocked, due to
 firewall rules, bad routes, etc.), or, if your expire times are *really*
 low, and the master's been down a while, it's possible the zone may have
 expired on the slaves.

 In any of those cases, I'm suspecting you're using nslookup, and you might
 be suffering from its horrible misfeature where it searchlists on a query
 failure, and then reports the *last* RCODE it received as the result of the
 entire lookup. So, for example, if your query is www.example.com and your
 searchlist ends in the domain department1.example.com, if the first query
 fails (e.g. with a timeout or a SERVFAIL), nslookup might work through the
 searchlist, ultimately querying www.example.com.department1.example.com,
 which returns NXDOMAIN, and that's what nslookup (mis-)reports as the
 result of the query.

 You can avoid this by dot-terminating the original query (thus inhibiting
 nslookup's searchlist behavior), or even better, using a real DNS
 troubleshooting tool like dig or host. If you want to continue to use
 nslookup, at the very least add the -debug flag so you can see what it's
 really doing under the covers.
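
The search-list behaviour described above can be modelled in a few lines. This is a simplified illustration, not nslookup's actual implementation: lookup_with_search_list and stub_resolve are hypothetical names, and the stub stands in for real DNS queries.

```python
from typing import Callable, List


def lookup_with_search_list(
    name: str, search_list: List[str], resolve: Callable[[str], str]
) -> str:
    """Try the name as given, then with each search domain appended,
    and return the RCODE of the *last* query that was sent."""
    if name.endswith("."):
        # A dot-terminated name is fully qualified: no search list.
        candidates = [name]
    else:
        candidates = [name + "."] + [f"{name}.{d}." for d in search_list]

    rcode = ""
    for qname in candidates:
        rcode = resolve(qname)
        if rcode == "NOERROR":
            break
    return rcode


def stub_resolve(qname: str) -> str:
    """Stand-in resolver: the real name fails with SERVFAIL (servers
    unreachable); anything else does not exist."""
    answers = {"www.example.com.": "SERVFAIL"}
    return answers.get(qname, "NXDOMAIN")


if __name__ == "__main__":
    search = ["department1.example.com"]
    print(lookup_with_search_list("www.example.com", search, stub_resolve))
    print(lookup_with_search_list("www.example.com.", search, stub_resolve))
```

Under this model the undotted query reports NXDOMAIN, because the last name tried was www.example.com.department1.example.com, while the dot-terminated query reports the SERVFAIL that actually matters.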


 - Kevin



Re: Name-server redundancy

2014-06-09 Thread Kevin Darcy
That scenario still shouldn't have led to an NXDOMAIN. If none of the 
delegated nameservers are responding, you'd get a timeout or SERVFAIL. 
So I think there's still some investigation to be done. But using dig 
instead of nslookup at least makes things clearer :-)


Of course, caching may complicate things here. The NS records published 
at the apex (which I assume were all 6 of them) take precedence over the 
delegation NS'es, so for a period of time, some resolvers would be able 
to resolve names in the zone, and some would not. Eventually, depending 
on your TTLs, everyone would expire the cached NS records and the zone 
would be completely unresolvable.
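
The caching effect described above can be illustrated with a toy model. It is deliberately simplified and makes several assumptions: a resolver uses its cached apex NS set until the TTL runs out and falls back to the delegation NS set afterwards; real resolvers are more involved. The function names, host names and TTL are hypothetical.

```python
from typing import Iterable, Set


def nameservers_in_use(
    delegation_ns: Iterable[str],
    cached_apex_ns: Iterable[str],
    cached_at: int,
    ttl: int,
    now: int,
) -> Set[str]:
    """Return the NS set a resolver would try: the cached apex set
    while it is still fresh, otherwise the delegation set."""
    cached = set(cached_apex_ns)
    if cached and now < cached_at + ttl:
        return cached
    return set(delegation_ns)


def can_resolve(
    delegation_ns: Iterable[str],
    cached_apex_ns: Iterable[str],
    cached_at: int,
    ttl: int,
    now: int,
    offline: Iterable[str],
) -> bool:
    """True if at least one name server the resolver would try is up."""
    in_use = nameservers_in_use(delegation_ns, cached_apex_ns, cached_at, ttl, now)
    return bool(in_use - set(offline))


if __name__ == "__main__":
    delegated = ["ns1", "ns2"]                  # what the registration lists
    apex = ["ns1", "ns2", "ns3", "ns4", "ns5", "ns6"]
    down = ["ns1", "ns2"]                       # the region under maintenance
    # Resolver that cached the apex NS set at t=0 with a 3600 s TTL.
    print(can_resolve(delegated, apex, 0, 3600, 1800, down))
    print(can_resolve(delegated, apex, 0, 3600, 7200, down))
    # Resolver with nothing cached sees only the delegation.
    print(can_resolve(delegated, [], 0, 3600, 1800, down))
```

A resolver with a fresh cache still reaches ns3 to ns6; once the cache expires, or if nothing was cached to begin with, it is left with the two offline servers.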


- Kevin



Re: Name-server redundancy

2014-06-09 Thread Sid Shapiro
Again - thanks for the quick response - that'll teach me to post without
all the facts. I simply don't remember what the specific error was, darn
it. It might have been NXDOMAIN or SERVFAIL - I didn't write it down.
The test I was running was on a barely (if ever) used domain, so I was
pretty sure it wasn't cached anywhere.

I'm trying to figure out ways to test this without taking name servers
offline :-)

--
Sid Shapiro
sid_shap...@bio-rad.com
Bio-Rad Corporate IT  - Desk: (510) 741-6846   Mobile: (510) 224-4343





Re: Name-server redundancy

2014-06-09 Thread Barry Margolin
In article mailman.401.1402350461.26362.bind-us...@lists.isc.org,
 Kevin Darcy k...@chrysler.com wrote:

 That scenario still shouldn't have led to an NXDOMAIN. If none of the 
 delegated nameservers are responding, you'd get a timeout or SERVFAIL. 
 So I think there's still some investigation to be done. But using dig 
 instead of nslookup at least makes things clearer :-)

Where did he say he got NXDOMAIN? He said he got "no name-server", which 
is not a dig error I've ever seen. Maybe he was abbreviating from "No 
nameserver could be reached", which is what dig says when it times out 
waiting for a reply.

-- 
Barry Margolin
Arlington, MA