Re: Name-server redundancy
You're right: I misinterpreted "no name-server" as "no such host" (aka NXDOMAIN), but actually your explanation makes more sense.

- Kevin

On 6/9/2014 6:07 PM, Barry Margolin wrote:
> In article mailman.401.1402350461.26362.bind-us...@lists.isc.org, Kevin Darcy k...@chrysler.com wrote:
>> That scenario still shouldn't have led to an NXDOMAIN. If none of the delegated nameservers are responding, you'd get a timeout or SERVFAIL. So I think there's still some investigation to be done. But using dig instead of nslookup at least makes things clearer :-)
>
> Where did he say he got NXDOMAIN? He said he got "no name-server", which is not a dig error I've ever seen. Maybe he was abbreviating from "No nameserver could be reached", which is what dig says when it times out waiting for a reply.

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
Re: Name-server redundancy
If you want to be sure that failover works well you must, at some point, test it. Even better, test it regularly (check out Netflix's Chaos Monkey).

One way to run a simulation is to use a firewall rule or static route to block access between your test client/recursive server and one or more of the authoritative DNS servers. However, this is no substitute for an actual test to determine how different client applications will behave.

--Blake

Sid Shapiro wrote the following on 6/9/2014 4:56 PM:
> Again - thanks for the quick response - that'll teach me to post without all the facts. I simply don't remember what the specific error was, darn it. It might have been NXDOMAIN or SERVFAIL - I didn't write it down. The test I was running was on a barely, if ever used, domain, so I was pretty sure it wasn't cached anywhere.
>
> I'm trying to figure out ways to test this without taking name servers offline :-)
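To complement the firewall-rule simulation, each authoritative server can also be probed on its own, so an unreachable server shows up directly instead of being hidden behind a resolver's retry logic. The following is a minimal sketch under stated assumptions, not a replacement for dig: it uses only the Python standard library, sends a single UDP NS query with recursion disabled, and does no validation of the reply beyond reading the RCODE bits. The function names `build_query` and `probe` are made up for this illustration.

```python
import socket
import struct


def build_query(name: str, qtype: int = 2, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (qtype 2 = NS, class 1 = IN)."""
    # Header: ID, flags (0 = standard query, recursion not desired),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        encoded = label.encode("ascii")
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"  # root label terminates the name
    return header + qname + struct.pack("!HH", qtype, 1)


def probe(server_ip: str, name: str, timeout: float = 2.0) -> str:
    """Send one UDP query straight to a single server and summarise the result."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-network errors
            return "no reply"
    if len(reply) < 12:
        return "short reply"
    # The RCODE is the low four bits of the fourth header byte.
    return f"rcode {reply[3] & 0x0F}"
```

Calling `probe("192.0.2.1", "example.com")` once per name server (the address and zone here are placeholders) gives one line of evidence per server; a server cut off by the firewall rule should report "no reply" while the others keep answering.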
Name-server redundancy
Hello,

I've got 6 name-servers, 2 in each of 3 global regions. Each name-server has a net connection. Each name-server is authoritative. The domains it serves have all six NS records.

My question has to do with redundancy. If one of my regions goes down, I would have expected that a query against a domain would reach one of the other regions' name-servers. However, during a maintenance window when one region was off the air, I did some simple queries. I did not have time to do a lot of detailed testing and tracing. I was simply trying to see if I could get a query resolved. What I got was a "no name-server" error. I do not have the exact message, nor the timings. I could see (somehow) that there might be some time-out issue on the client, but the "no name-servers" response came pretty quickly.

This doesn't seem like a configuration problem, although I suppose it might be. It seems more like a misunderstanding of how redundancy works at the domain level. Have I totally misunderstood a concept here?

Thanks

--
Sid Shapiro sid_shap...@bio-rad.com
Bio-Rad Corporate IT - Desk: (510) 741-6846 Mobile: (510) 224-4343
Re: Name-server redundancy
Thanks, Kevin, for your quick reply.

In the last few minutes, I've come to realize that my problem is likely that the domain is only registered with two name servers - the ones which were offline. Even though the zone has 6 NS records, the .com servers probably only know of the ones in the registration. So registration and DNS are not in sync. Silly mistake.

(And FWIW, I *was* using dig, not nslookup)

--
Sid Shapiro sid_shap...@bio-rad.com
Bio-Rad Corporate IT - Desk: (510) 741-6846 Mobile: (510) 224-4343

On Mon, Jun 9, 2014 at 2:32 PM, Kevin Darcy k...@chrysler.com wrote:
> Well, you shouldn't be getting an NXDOMAIN just because some of your auth servers are off-line, but you could get some query timeouts if performance to your failover servers is really bad (or blocked, due to firewall rules, bad routes, etc.), or, if your expire times are *really* low, and the master's been down a while, it's possible the zone may have expired on the slaves.
>
> In any of those cases, I'm suspecting you're using nslookup, and you might be suffering from its horrible misfeature where it searchlists on a query failure, and then reports the *last* RCODE it received as the result of the entire lookup. So, for example, if your query is www.example.com and your searchlist ends in the domain department1.example.com, if the first query fails (e.g. with a timeout or a SERVFAIL), nslookup might work through the searchlist, ultimately querying www.example.com.department1.example.com, which returns NXDOMAIN, and that's what nslookup (mis-)reports as the result of the query.
>
> You can avoid this by dot-terminating the original query (thus inhibiting nslookup's searchlist behavior), or even better, using a real DNS troubleshooting tool like dig or host. If you want to continue to use nslookup, at the very least add the -debug flag so you can see what it's really doing under the covers.
>
> - Kevin
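The search-list behaviour Kevin describes can be modelled in a few lines. This is only a toy model of the behaviour as described in the quoted message, not nslookup's actual implementation: `candidate_names`, `naive_lookup`, and `fake_resolve` are invented names, and the "resolver" is a stub that pretends the real name times out while the search-list name does not exist.

```python
from typing import Callable, List, Tuple


def candidate_names(query: str, searchlist: List[str]) -> List[str]:
    """Names tried in order: the query itself, then query + each search domain.

    A trailing dot marks the query as fully qualified, so the search list
    is skipped entirely.
    """
    if query.endswith("."):
        return [query]
    return [query + "."] + [f"{query}.{domain}." for domain in searchlist]


def naive_lookup(
    query: str, searchlist: List[str], resolve: Callable[[str], str]
) -> Tuple[str, str]:
    """Return (last name tried, last result), i.e. 'report the last RCODE'."""
    last = ("", "")
    for name in candidate_names(query, searchlist):
        result = resolve(name)
        last = (name, result)
        if result == "NOERROR":
            break  # a successful answer stops the search
    return last


def fake_resolve(name: str) -> str:
    # Hypothetical outage: the real name times out, and the name built
    # from the search list simply does not exist.
    return "TIMEOUT" if name == "www.example.com." else "NXDOMAIN"


searchlist = ["department1.example.com"]
print(naive_lookup("www.example.com", searchlist, fake_resolve))
print(naive_lookup("www.example.com.", searchlist, fake_resolve))
```

In this model the undotted query ends up reporting NXDOMAIN for www.example.com.department1.example.com., while the dot-terminated query reports the timeout that actually happened, which is the distinction the message is drawing.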
Re: Name-server redundancy
That scenario still shouldn't have led to an NXDOMAIN. If none of the delegated nameservers are responding, you'd get a timeout or SERVFAIL. So I think there's still some investigation to be done. But using dig instead of nslookup at least makes things clearer :-)

Of course, caching may complicate things here. The NS records published at the apex (which I assume were all 6 of them) take precedence over the delegation NS'es, so for a period of time, some resolvers would be able to resolve names in the zone, and some would not. Eventually, depending on your TTLs, everyone would expire the cached NS records and the zone would be completely unresolvable.

- Kevin

On 6/9/2014 5:38 PM, Sid Shapiro wrote:
> Thanks, Kevin, for your quick reply.
>
> In the last few minutes, I've come to realize that my problem is likely that the domain is only registered with two name servers - the one which were offline. Even though the zone has 6 NS records, the .com servers probably only know of the ones in the registration. So registration and DNS not in sync. Silly mistake.
>
> (And FWIW, I *was* using dig, not nslookup)
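The mismatch discussed in this message (two NS records at the registry, six at the zone apex) and its effect on resolvers can be sketched with plain set arithmetic. This is an illustration under assumptions: the host names ns1 through ns6.example.com. are hypothetical, the NS sets are typed in by hand rather than looked up, and `ns_mismatch` and `can_resolve` are invented helper names. A resolver with nothing cached is modelled as knowing only the delegation set; a resolver that has already cached the apex NS records is modelled as knowing all six.

```python
from typing import Dict, Set


def ns_mismatch(delegation: Set[str], apex: Set[str]) -> Dict[str, Set[str]]:
    """Report name servers that appear on only one side of the zone cut."""
    return {
        "only_in_zone": apex - delegation,
        "only_in_delegation": delegation - apex,
    }


def can_resolve(known_ns: Set[str], down: Set[str]) -> bool:
    """A resolver can make progress if any server it knows about is up."""
    return bool(known_ns - down)


# Hypothetical data matching the situation in the thread.
apex = {f"ns{i}.example.com." for i in range(1, 7)}
delegation = {"ns1.example.com.", "ns2.example.com."}
down = {"ns1.example.com.", "ns2.example.com."}  # the region in maintenance

print(sorted(ns_mismatch(delegation, apex)["only_in_zone"]))
print("cold resolver (delegation only):", can_resolve(delegation, down))
print("warm resolver (cached apex NS): ", can_resolve(apex, down))
```

With these inputs the cold resolver cannot resolve the zone at all, while the warm one still can until its cached NS records expire, which matches the "some resolvers would, some would not" period described above. Registering all six servers makes the two sets equal and removes the difference.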
Re: Name-server redundancy
Again - thanks for the quick response - that'll teach me to post without all the facts. I simply don't remember what the specific error was, darn it. It might have been NXDOMAIN or SERVFAIL - I didn't write it down. The test I was running was on a barely, if ever used, domain, so I was pretty sure it wasn't cached anywhere.

I'm trying to figure out ways to test this without taking name servers offline :-)

--
Sid Shapiro sid_shap...@bio-rad.com
Bio-Rad Corporate IT - Desk: (510) 741-6846 Mobile: (510) 224-4343

On Mon, Jun 9, 2014 at 2:47 PM, Kevin Darcy k...@chrysler.com wrote:
> That scenario still shouldn't have led to an NXDOMAIN. If none of the delegated nameservers are responding, you'd get a timeout or SERVFAIL. So I think there's still some investigation to be done. But using dig instead of nslookup at least makes things clearer :-)
>
> Of course, caching may complicate things here. The NS records published at the apex (which I assume were all 6 of them) take precedence over the delegation NS'es, so for a period of time, some resolvers would be able to resolve names in the zone, and some would not. Eventually, depending on your TTLs, everyone would expire the cached NS records and the zone would be completely unresolvable.
>
> - Kevin
Re: Name-server redundancy
In article mailman.401.1402350461.26362.bind-us...@lists.isc.org, Kevin Darcy k...@chrysler.com wrote:
> That scenario still shouldn't have led to an NXDOMAIN. If none of the delegated nameservers are responding, you'd get a timeout or SERVFAIL. So I think there's still some investigation to be done. But using dig instead of nslookup at least makes things clearer :-)

Where did he say he got NXDOMAIN? He said he got "no name-server", which is not a dig error I've ever seen. Maybe he was abbreviating from "No nameserver could be reached", which is what dig says when it times out waiting for a reply.

--
Barry Margolin
Arlington, MA
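Since much of this thread turns on telling a timeout apart from SERVFAIL and NXDOMAIN, here is a small sketch of how the three differ at the protocol level. A timeout means no reply packet arrived at all, whereas SERVFAIL and NXDOMAIN are values of the 4-bit RCODE field in the header of a reply that did arrive. The helper name `describe_outcome` is made up, and the wording of the returned strings is this sketch's own, not dig's output.

```python
from typing import Optional

# RCODE values from the DNS message header (RFC 1035).
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def describe_outcome(reply: Optional[bytes]) -> str:
    """Classify a raw DNS reply, or the absence of one (None)."""
    if reply is None:
        return "timeout (no server replied)"
    if len(reply) < 12:
        return "malformed reply"
    rcode = reply[3] & 0x0F  # low four bits of the second flags byte
    return RCODE_NAMES.get(rcode, f"RCODE {rcode}")


# Hand-built 12-byte headers: ID 0x1234, QR=1, RD=1, RA=1, then the RCODE.
nxdomain_header = bytes([0x12, 0x34, 0x81, 0x83]) + bytes(8)
servfail_header = bytes([0x12, 0x34, 0x81, 0x82]) + bytes(8)

print(describe_outcome(None))
print(describe_outcome(servfail_header))
print(describe_outcome(nxdomain_header))
```

Writing down which of these three outcomes a test produced is what makes a failed lookup like the one in this thread diagnosable afterwards.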