Re: BIND 9.7.1 + DLZ + DNSSEC: Possible?
> My name is Kevin and I'm working with the Argentina ccTLD team to upgrade
> our local NS systems and our goal is to load the .ar, .com.ar and
> subsequent zones using DLZ. Our other task was to deploy DNSSEC here and
> start signing our TLDs, but according to the e-mails I've read (dated
> 2006 mostly) it's not very clear if it's already been possible (it's been
> 4 years since those e-mails were written).

As far as I know, DLZ has not yet been taught to understand DNSSEC. I haven't confirmed this personally, but I'll hazard a guess based on what I've seen of the code.

DLZ might be able to provide normal answers and RRSIGs when the name exists, but for NXDOMAIN and NOERROR/NODATA answers, I wouldn't expect it to provide NSEC records correctly in all cases, and I'm sure it would fail with NSEC3.

If you're planning to use this for a hidden zone master or some such, where it would only be answering AXFRs, I think it could probably do that.

Incidentally, BIND 10 can serve authoritative data from a database back-end; it currently supports SQLite3 and we're planning to add a MySQL data source driver. But it won't be ready for production use for another year or so.

--
Evan Hunt -- e...@isc.org
Internet Systems Consortium, Inc.
___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
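Editor's note: the NXDOMAIN point above can be illustrated with a small sketch. To prove that a name does not exist, a server must return the NSEC record of the name that sorts immediately before it in DNSSEC canonical order, which is an ordered search rather than an exact-match lookup. The function names and the sample zone below are hypothetical and are not taken from BIND or DLZ.

```python
from bisect import bisect_right


def canonical_key(name):
    """Sort key for DNSSEC canonical ordering: labels are compared
    right to left, as lowercase byte strings."""
    labels = name.rstrip(".").lower().split(".")
    return tuple(label.encode("ascii") for label in reversed(labels))


def covering_nsec(zone_names, qname):
    """Return (owner, next_name) of the NSEC record that would prove
    qname does not exist. zone_names must include the zone apex."""
    ordered = sorted(zone_names, key=canonical_key)
    keys = [canonical_key(n) for n in ordered]
    qkey = canonical_key(qname)
    if qkey in keys:
        raise ValueError("name exists; no NXDOMAIN proof is needed")
    idx = bisect_right(keys, qkey)
    # The predecessor owns the NSEC; the chain wraps back to the apex.
    return ordered[idx - 1], ordered[idx % len(ordered)]


# Hypothetical zone contents, for illustration only.
names = ["example.ar.", "a.example.ar.", "c.example.ar.", "www.example.ar."]
print(covering_nsec(names, "b.example.ar."))
```

A back-end that can only answer "give me the records for exactly this name" has no direct way to find that predecessor, which is one plausible reading of why negative answers are the hard case.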
Re: Verizon Users Can't See Site
"Hauke Lampe" wrote:

> On 14.09.2010 19:32, cybers...@comcast.net wrote:
>
> > Today I was given access to a Linux box on the Verizon network that
> > is using their DNS server 71.252.0.12, which is affected by this
> > problem.
>
> Your nameserver software is case-sensitive where it should not be:
>
> dig +norec www-mbclive.mbc.irides.com. @216.250.250.136
> -> correct answer
>
> dig +norec www-mbclive.mbc.irides.COM. @216.250.250.136
> -> NODATA answer
>
> If Verizon's DNS resolvers use 0x20[1] or modify the character case in
> any way, they cannot find the right answer.
>
> You should complain to your DNS LB vendor. Their implementation appears
> to be too minimalistic.
>
> dig +norec version.bind txt ch @216.250.250.136
> ;; Question section mismatch: got version.bind/TXT/IN
> ;; connection timed out; no servers could be reached

Hauke, my hat's off to you... I think you nailed it... many, many thanks. I will buy you a beer and a trip around Nurburgring the next time I'm in Germany! :^)
Re: Verizon Users Can't See Site
On 14.09.2010 19:32, cybers...@comcast.net wrote:

> Today I was given access to a Linux box on the Verizon network that is using
> their DNS server 71.252.0.12, which is affected by this problem.

Your nameserver software is case-sensitive where it should not be:

dig +norec www-mbclive.mbc.irides.com. @216.250.250.136
-> correct answer

dig +norec www-mbclive.mbc.irides.COM. @216.250.250.136
-> NODATA answer

If Verizon's DNS resolvers use 0x20[1] or modify the character case in any way, they cannot find the right answer.

You should complain to your DNS LB vendor. Their implementation appears to be too minimalistic.

dig +norec version.bind txt ch @216.250.250.136
;; Question section mismatch: got version.bind/TXT/IN
;; connection timed out; no servers could be reached

Hauke.

[1] Use of Bit 0x20 in DNS Labels to Improve Transaction Identity
    http://tools.ietf.org/html/draft-vixie-dnsext-dns0x20
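Editor's note: a minimal sketch of the 0x20 behaviour described above, assuming a resolver that randomises the case of the query name and expects the reply to echo it exactly. The function names are hypothetical; this is an illustration of the idea in the draft, not code from any resolver.

```python
import random


def randomize_case(qname, rng):
    """Flip each letter to upper or lower case at random (the 0x20 bit).
    Digits, dots and hyphens are unaffected."""
    return "".join(
        ch.upper() if rng.random() < 0.5 else ch.lower() for ch in qname
    )


def names_equal(a, b):
    """DNS name matching is case-insensitive (ASCII names assumed)."""
    return a.rstrip(".").lower() == b.rstrip(".").lower()


def reply_acceptable(sent_qname, echoed_qname):
    """A 0x20 resolver only trusts a reply whose question section
    repeats the query name with exactly the same case pattern."""
    return sent_qname == echoed_qname


rng = random.Random()
mixed = randomize_case("www-mbclive.mbc.irides.com.", rng)
# An authoritative server must treat the mixed-case name as the same name...
print(mixed, names_equal(mixed, "www-mbclive.mbc.irides.com."))
# ...and echo it back unchanged so the resolver accepts the reply.
print(reply_acceptable(mixed, mixed))
```

A server that compares names case-sensitively fails the first check and returns NODATA for the mixed-case query, which matches the dig results shown in the message.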
Re: BIND 9.7.1 + DLZ + DNSSEC: Possible?
Sign them offline or out of band, using a database trigger to initiate the signing. Your schema might need to change a little, though. For a ccTLD, your private key should probably be secure and offline anyway. Zone updates should be reasonably automatable using either the BIND DNSSEC tools or any of the other toolsets out there. OpenDNSSEC is another I'm familiar with, but there are more.

Getting that signed zone back into your database is the trick. I'm not that familiar with the DLZ backend, but if it can slave a zone from any DNS server, then you set it up as a secondary to whatever is signing your zones. If that can't be done, you'll need to parse the zone file to insert the records into your live domain table.

On 14/09/10 9:46 PM, "Kevin Mai" wrote:

> We have an average of around 11 QPS but we update zones daily (our servers
> store NS delegations mostly and government sites) so it's a daily task to
> approve new domains and update/reload zones.
>
> We have a good DB infrastructure built in and the fact of having a MySQL
> server that can replicate is a good reason to have DLZ as the backend.
>
> The other issue we face is signing the zone files, as we are looking forward
> to harden security and sign the .ar ccTLD and the other TLDs (.com.ar,
> .mil.ar, .gov.ar, .net.ar, etc). We can sign zone files, but how do we sign
> database entries?
>
> From: "Scott Haneda"
> To: "Kevin Mai"
> Cc: bind-users@lists.isc.org
> Sent: Tuesday, 14 September 2010 16:40:05
> Subject: Re: BIND 9.7.1 + DLZ + DNSSEC: Possible?
>
> On Sep 14, 2010, at 12:15 PM, Kevin Mai wrote:
>
> >> My name is Kevin and I'm working with the Argentina ccTLD team to upgrade our
> >> local NS systems and our goal is to load the .ar, .com.ar and subsequent
> >> zones using DLZ. Our other task was to deploy DNSSEC here and start signing
> >> our TLDs, but according to the e-mails I've read (dated 2006 mostly) it's not
> >> very clear if it's already been possible (it's been 4 years since those
> >> e-mails were written).
> >>
> >> For that reason, I'd need to know if anyone has deployed DNSSEC and signed
> >> zones and then stored those RRSIG, NSEC and DNSKEY records on a MySQL backend
> >> using DLZ as a way to get those entries dynamically.
> >>
> >> I'd really appreciate your replies :)
> >
> > I've been dealing with DLZ systems for the better part of a few years now.
> > Unless something has changed I am not aware of in the last 12 months, I can
> > offer a few suggestions.
> >
> > Make sure you test load. Find the fastest reading DB backend you are
> > comfortable with. Then performance test it. The load of a medium to heavy
> > system on the database is significant.
> >
> > Doing 1000's of DNS lookups per second on a non DLZ system is generally not
> > too hard to build out. Doing 1000's of selects on a database, DLZ or not, is
> > significantly more challenging.
> >
> > Keep in mind, 1 lookup generally is not 1 database lookup in DLZ, but will
> > take a few to get the final answer.
> >
> > I find DLZ really shines when you are adding and removing domains often and
> > need instant access to those changes. If you are not making many changes to
> > your records, the performance hit is not worth the ease of records management
> > you gained.
> >
> > If reloading named starts to take too long, DLZ will come into play. You will
> > more than likely want to look at ways of distributing multiple DLZ systems.
> >
> > There is a competing product with which I have no experience. I'm sure you
> > can find it in Google. I would explore the pros and cons of any alternative
> > system as well as BIND/named standalone, and of course a DLZ backed method.
> >
> > I have never had to implement signed zones before. If that data is within the
> > zone, I see no reason why DLZ would not be able to return the correct
> > response.

--
Kal Feher
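Editor's note: the last step suggested above (parse the signed zone and insert the records into a database table) might look roughly like the sketch below. It assumes the signed zone has first been flattened to one record per line with explicit owner name, TTL and class (a tool such as named-compilezone is one way to try this; check its output on your own zones). The table name, column names and the use of SQLite as a stand-in for MySQL are all hypothetical.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical records table (not the real DLZ schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dns_records ("
        "zone TEXT, name TEXT, ttl INTEGER, "
        "rrclass TEXT, rrtype TEXT, rdata TEXT)"
    )


def parse_flat_zone(text):
    """Yield (name, ttl, rrclass, rrtype, rdata) from zone text that has
    exactly one complete record per line; ';' comment lines are skipped."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        name, ttl, rrclass, rrtype, rdata = line.split(None, 4)
        yield name.lower(), int(ttl), rrclass.upper(), rrtype.upper(), rdata


def load_zone(conn, zone, text):
    """Replace every row of one zone inside a single transaction, so that
    readers never see a half-loaded, partly signed zone."""
    rows = [(zone,) + record for record in parse_flat_zone(text)]
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM dns_records WHERE zone = ?", (zone,))
        conn.executemany(
            "INSERT INTO dns_records "
            "(zone, name, ttl, rrclass, rrtype, rdata) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)
```

Replacing the whole zone in one transaction is a deliberate choice here: RRSIG and NSEC records are only valid as a consistent set, so mixing rows from two signing runs would likely produce validation failures.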
RE: DNSSEC, views & trusted keys...
This is getting very involved - or I'm getting confused. Maybe both :-)

I've tried to work out how this can work, but each solution seems to uncover another question. I don't want to experiment to get to "seems to work", only to find the next problem much later...

There doesn't seem to be much description of stub zones in the ARM. I take it that a stub zone will fetch data from the zone using non-recursive queries, but the view can provide recursive service to queries for zones served elsewhere?

I gather that they contain just an SOA and NS records. Presumably this means I have to create a new set of zone files for the master, e.g. grep for SOA and self (but not delegating) NS records. How are these maintained? It wouldn't be too bad if the master stub server would grab SOA & NS changes from the full zone and propagate them to the primary copy of the stub zone. But the full zone is in a different view from the stub... If this is to work, these queries would have to be non-recursive for the match-recursive view selection to support it.

Since we know that the server is authoritative for each zone, it would seem that the stub should always have a 'masters' clause that points to itself (even if the non-stub zone is in fact a slave). Otherwise there's a good chance that resolving a query would go across the wire to some other server, ignoring the local data. But then update-forwarding won't work, will it?

It would be helpful if someone expert in all the interactions could trace out the flows (where each starts, where it goes, how the destination/view is selected) for:

o Initializing the stub zones on the master and their replication to the slaves
o Adding or removing a nameserver for the full zone (specifically, how this propagates to the stub)
o A client's recursive query
o Dynamic update
o Zone notifies/refreshes (full and stub)

Sorry if I'm being opaque -- though if we expect DNSSEC to be used, I won't be the only person trying to get this to work!

---------------------------------------------------------
This communication may not represent the ACM or my employer's views, if any, on the matters discussed.

-----Original Message-----
From: Chris Buxton [mailto:chris.p.bux...@gmail.com]
Sent: Saturday, September 11, 2010 22:41
To: Phil Mayers
Cc: bind-users@lists.isc.org
Subject: Re: DNSSEC, views & trusted keys...

On Sep 11, 2010, at 2:34 AM, Phil Mayers wrote:

> On 09/10/2010 11:12 PM, Timothe Litt wrote:
>>
>> So it looks like the new (r-internal) view is starting at the root when it
>> resolves -- ignoring what it has data for locally. It sorta works for
>
> You'll need a:
>
> zone "name" {
>     type forward;
>     forward only;
>     forwarders {
>         ips;
>     };
> };
>
> It won't automatically detect that another view contains the zone and
> redirect it; you have to tell it.

Use a stub zone instead of a forward zone, so that the query will actually reach the authoritative view. With a forward zone, the query is recursive, so will be picked up by the recursive view - the view will query itself and not receive an answer.

zone "zone.name" {
    type stub;
    file "/path/to/recursive-view-data/zone.name";
    masters { 127.0.0.1; }; // or whatever the correct IP is to reach the internal view
};

Chris Buxton
BlueCat Networks
Re: BIND 9.7.1 + DLZ + DNSSEC: Possible?
We have an average of around 11 QPS but we update zones daily (our servers store NS delegations mostly and government sites) so it's a daily task to approve new domains and update/reload zones.

We have a good DB infrastructure built in and the fact of having a MySQL server that can replicate is a good reason to have DLZ as the backend.

The other issue we face is signing the zone files, as we are looking to harden security and sign the .ar ccTLD and the other TLDs (.com.ar, .mil.ar, .gov.ar, .net.ar, etc). We can sign zone files, but how do we sign database entries?

From: "Scott Haneda"
To: "Kevin Mai"
Cc: bind-users@lists.isc.org
Sent: Tuesday, 14 September 2010 16:40:05
Subject: Re: BIND 9.7.1 + DLZ + DNSSEC: Possible?

> On Sep 14, 2010, at 12:15 PM, Kevin Mai <k...@mrecic.gov.ar> wrote:
>
> > My name is Kevin and I'm working with the Argentina ccTLD team to upgrade
> > our local NS systems and our goal is to load the .ar, .com.ar and
> > subsequent zones using DLZ. Our other task was to deploy DNSSEC here and
> > start signing our TLDs, but according to the e-mails I've read (dated 2006
> > mostly) it's not very clear if it's already been possible (it's been 4
> > years since those e-mails were written).
> >
> > For that reason, I'd need to know if anyone has deployed DNSSEC and signed
> > zones and then stored those RRSIG, NSEC and DNSKEY records on a MySQL
> > backend using DLZ as a way to get those entries dynamically.
> >
> > I'd really appreciate your replies :)
>
> I've been dealing with DLZ systems for the better part of a few years now.
> Unless something has changed I am not aware of in the last 12 months, I can
> offer a few suggestions.
>
> Make sure you test load. Find the fastest reading DB backend you are
> comfortable with. Then performance test it. The load of a medium to heavy
> system on the database is significant.
>
> Doing 1000's of DNS lookups per second on a non DLZ system is generally not
> too hard to build out. Doing 1000's of selects on a database, DLZ or not, is
> significantly more challenging.
>
> Keep in mind, 1 lookup generally is not 1 database lookup in DLZ, but will
> take a few to get the final answer.
>
> I find DLZ really shines when you are adding and removing domains often and
> need instant access to those changes. If you are not making many changes to
> your records, the performance hit is not worth the ease of records
> management you gained.
>
> If reloading named starts to take too long, DLZ will come into play. You
> will more than likely want to look at ways of distributing multiple DLZ
> systems.
>
> There is a competing product with which I have no experience. I'm sure you
> can find it in Google. I would explore the pros and cons of any alternative
> system as well as BIND/named standalone, and of course a DLZ backed method.
>
> I have never had to implement signed zones before. If that data is within
> the zone, I see no reason why DLZ would not be able to return the correct
> response.
>
> --
> Scott * If you contact me off list replace talklists@ with scott@ *
Re: BIND 9.7.1 + DLZ + DNSSEC: Possible?
On Sep 14, 2010, at 12:15 PM, Kevin Mai wrote:

> My name is Kevin and I'm working with the Argentina ccTLD team to upgrade our
> local NS systems and our goal is to load the .ar, .com.ar and subsequent
> zones using DLZ. Our other task was to deploy DNSSEC here and start signing
> our TLDs, but according to the e-mails I've read (dated 2006 mostly) it's not
> very clear if it's already been possible (it's been 4 years since those
> e-mails were written).
>
> For that reason, I'd need to know if anyone has deployed DNSSEC and signed
> zones and then stored those RRSIG, NSEC and DNSKEY records on a MySQL backend
> using DLZ as a way to get those entries dynamically.
>
> I'd really appreciate your replies :)

I've been dealing with DLZ systems for the better part of a few years now. Unless something has changed I am not aware of in the last 12 months, I can offer a few suggestions.

Make sure you test load. Find the fastest reading DB backend you are comfortable with. Then performance test it. The load of a medium to heavy system on the database is significant.

Doing 1000's of DNS lookups per second on a non DLZ system is generally not too hard to build out. Doing 1000's of selects on a database, DLZ or not, is significantly more challenging.

Keep in mind, 1 lookup generally is not 1 database lookup in DLZ, but will take a few to get the final answer.

I find DLZ really shines when you are adding and removing domains often and need instant access to those changes. If you are not making many changes to your records, the performance hit is not worth the ease of records management you gained.

If reloading named starts to take too long, DLZ will come into play. You will more than likely want to look at ways of distributing multiple DLZ systems.

There is a competing product with which I have no experience. I'm sure you can find it in Google. I would explore the pros and cons of any alternative system as well as BIND/named standalone, and of course a DLZ backed method.

I have never had to implement signed zones before. If that data is within the zone, I see no reason why DLZ would not be able to return the correct response.

--
Scott * If you contact me off list replace talklists@ with scott@ *
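Editor's note: the remark that one DNS lookup is usually several database lookups can be made concrete with a toy model. The sketch assumes a driver that first probes progressively shorter suffixes of the query name to find the zone, then fetches the records. That is a simplified picture; a real DLZ driver may issue further queries (for example for authority data), and every class and method name here is hypothetical.

```python
class CountingStore:
    """In-memory stand-in for the database that counts round trips."""

    def __init__(self, zones, records):
        self.zones = set(zones)
        self.records = dict(records)
        self.queries = 0

    def zone_exists(self, zone):
        self.queries += 1  # one SELECT per candidate zone name
        return zone in self.zones

    def fetch(self, zone, host):
        self.queries += 1  # one SELECT for the records themselves
        return self.records.get((zone, host), [])


def answer(store, qname):
    """Find the closest enclosing zone, then look up the host in it."""
    labels = qname.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        zone = ".".join(labels[i:])
        if store.zone_exists(zone):
            host = ".".join(labels[:i]) or "@"
            return store.fetch(zone, host)
    return None  # no zone in the database covers this name


# Hypothetical delegation stored under the com.ar zone.
store = CountingStore(
    zones={"com.ar"},
    records={("com.ar", "www.example"): ["NS ns1.example.net."]},
)
print(answer(store, "www.example.com.ar."), store.queries)
```

In this model a single query for a four-label name under com.ar costs four round trips, which is why load testing against the database matters more than the raw DNS query rate.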
BIND 9.7.1 + DLZ + DNSSEC: Possible?
Hi,

My name is Kevin and I'm working with the Argentina ccTLD team to upgrade our local NS systems and our goal is to load the .ar, .com.ar and subsequent zones using DLZ. Our other task was to deploy DNSSEC here and start signing our TLDs, but according to the e-mails I've read (dated 2006 mostly) it's not very clear if it's already been possible (it's been 4 years since those e-mails were written).

For that reason, I'd need to know if anyone has deployed DNSSEC and signed zones and then stored those RRSIG, NSEC and DNSKEY records on a MySQL backend using DLZ as a way to get those entries dynamically.

I'd really appreciate your replies :)

Many thanks and have a great afternoon!

Best Regards,
Kevin Mai
Re: Verizon Users Can't See Site
"Torsten" wrote:

> On Tue, 14 Sep 2010 08:23:03 +0200, Torsten wrote:
>
> > On Tue, 14 Sep 2010 05:15:16 +0000 (UTC), cybers...@comcast.net wrote:
> >
> > > Hello List,
> > >
> > > I've run into an issue that has me stumped for the time being. I'm
> > > working on a website that is hosted on a delegated subdomain. The
> > > site is www-mbclive.mbc.irides.com. The mbc.irides.com subdomain is
> > > delegated to two Barracuda load balancers known as
> > > dns1.mbc.irides.com and dns2.mbc.irides.com.
> > >
> > > DNS seems to work fine for the majority of our users, however, in
> > > the past week we've heard from many Verizon FIOS users that they are
> > > unable to visit the site due to resolution issues. One sent in a dig
> > > from his home computer and I was wondering why he doesn't receive an
> > > answer:
> > >
> > > scott$ dig @71.252.0.12 www-mbclive.mbc.irides.com
> > >
> > > ; <<>> DiG 9.6.0-APPLE-P2 <<>> @71.252.0.12 www-mbclive.mbc.irides.com
> > > ; (1 server found)
> > > ;; global options: +cmd
> > > ;; Got answer:
> > > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62184
> > > ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
> > >
> > > ;; QUESTION SECTION:
> > > ;www-mbclive.mbc.irides.com. IN A
> > >
> > > ;; AUTHORITY SECTION:
> > > www-mbclive.mbc.irides.com. 10 IN SOA dns1.mbc.irides.com. 1. 3600 3600 3600 3600 3600
> > >
> > > ;; Query time: 20 msec
> > > ;; SERVER: 71.252.0.12#53(71.252.0.12)
> > > ;; WHEN: Mon Sep 13 21:31:08 2010
> > > ;; MSG SIZE rcvd: 86
> > >
> > > Can anyone tell if there is a DNS issue on our end that may cause us
> > > to not play nice w/ Verizon? This issue just popped up in the last
> > > two weeks. Prior to that time visitors were not complaining. Any
> > > assistance is greatly appreciated.
> >
> > I'm having troubles getting an answer from both dns1.mbc.irides.com
> > and dns2.mbc.irides.com for www-mbclive.mbc.irides.com.
> >
> > A dig query freezes for about 12 seconds before returning an answer.
> > Maybe there's a problem with a misconfigured firewall.
> >
> > [...@localhost ~]$ traceroute -q 1 dns2.mbc.irides.com
> > traceroute to dns2.mbc.irides.com (209.252.251.240), 30 hops max, 60 byte packets
> >  1 10.43.64.254 (10.43.64.254) 0.336 ms
> >  2 vl67.cr30.isham.de.easynet.net (194.64.6.252) 0.927 ms
> >  3 ge1-5.br2.isham.de.easynet.net (194.64.4.126) 0.695 ms
> >  4 ge3-0-2.gr10.isham.de.easynet.net (87.86.71.244) 0.632 ms
> >  5 te2-0-0.gr10.ixfra.de.easynet.net (87.86.77.95) 9.862 ms
> >  6 ge-5-1-4.edge3.frankfurt1.level3.net (212.162.40.77) 9.964 ms
> >  7 vlan79.csw2.Frankfurt1.Level3.net (4.68.23.126) 18.392 ms
> >  8 ae-72-72.ebr2.Frankfurt1.Level3.net (4.69.140.21) 10.387 ms
> >  9 ae-41-41.ebr2.washington1.level3.net (4.69.137.50) 98.620 ms
> > 10 ae-5-5.ebr2.washington12.level3.net (4.69.143.222) 101.159 ms
> > 11 ae-6-6.ebr2.chicago2.level3.net (4.69.148.146) 113.618 ms
> > 12 ae-22-52.car2.chicago2.level3.net (4.69.138.165) 115.322 ms
> > 13 paetec-comm.car2.chicago2.level3.net (4.71.250.34) 115.955 ms
> > 14 gi-3-1-0.core01.chcgil01.paetec.net (66.155.191.97) 139.525 ms
> > 15 po-4-0-0.core02.rochny01.paetec.net (64.80.253.217) 137.915 ms
> > 16 gi-6-0-0.edge02.rochny01.paetec.net (66.155.216.183) 140.368 ms
> > 17 *
> > 18 *
> > 19 *
> > 20 *
> > 21 *
> > 22 *
> > 23 *
> > 24 *
> > 25 *
> > 26 *
> > 27 *
> > 28 *
> > 29 *
> > 30 *
>
> I just noticed that the problem might as well be the very short TTL of
> the NS A Records of 10 seconds.

Thanks Torsten, the low TTLs have to do with us using the LBs in a failover environment between two locations.

Today I was given access to a Linux box on the Verizon network that is using their DNS server 71.252.0.12, which is affected by this problem. Digs and pings to www-mbclive.mbc.irides.com from this device fail. What can I do to better test and pinpoint the cause of the failure?
Re: Timeouts and retries on high speed Lans
So the cache servers are HA behind something (F5 LTM, Cisco Local Director, something else). Are the authoritative servers? It would seem sensible to do the same with them. That way a timeout only occurs if the whole HA cluster is unavailable. You can alleviate even that situation by seeding the cache servers every (TTL minus some value) minutes, or by slaving the domain on the cache servers.

On 14/09/10 11:34 AM, "Howard Wilkinson" wrote:

> I have been working on building out a couple of large data centres and
> have been struggling with how to set up the systems so that we get a high
> resilience, highly responsive DNS service in the presence of failing
> equipment.
>
> The configuration we have adopted includes a layer of BIND 9.6.x servers
> that act as pure name server caches. We have six of these servers in each
> data centre paired to provide service on VIPs so that if one of the pair
> fails the other cache takes over.
>
> Our resolv.conf is of the following form.
>
> search xxx.com yyy.com
> nameserver 10.1.1.1
> nameserver 10.1.2.1
> nameserver 10.1.3.1
> options timeout:1 attempts:15 no-check-names rotate
>
> The name servers are thus on different networks within the DCs.
>
> Our first problem arises because the timeouts seem to be taken serially on
> each server rather than the rotate applying between each name server
> request. Is this what I should have expected, i.e. a 15 second timeout
> before the next server is tried in sequence?
>
> The second problem we face is that even if we could get a one second
> timeout this is orders of magnitude too slow for names that should be
> resolved within our local name space. In other words for lookups within
> the xxx.com and yyy.com domains I would like to see timeouts in the
> micro-second range.
>
> Thinking further about this problem I have been considering whether the
> resolver should be multi-threaded or parallelised in some way so that it
> tries all of the servers at once and accepts the first to respond. I have
> come to the conclusion that this would be too difficult to make resilient
> in the general use of the resolver code, but would make sense if the
> lwresd layer is added to the equation.
>
> Which brings me on to the use of lwresd: this would reduce the incidence
> of problems with non-responsive servers in that it would detect and switch
> to an alternative server on the first failed attempt. However, this still
> means that if lwresd has not detected the down server then we get a stall
> in response within the data centre.
>
> So my questions are:
>
> 1. Does anybody have any experience in building such systems and
> suggestions on how we should tune the clients and servers to make the
> system less fragile in the presence of hardware, software and network
> failures?
>
> 2. Is it possible with lwresd as it is written today to get the effect of
> precognition - i.e. can I get lwresd to notice that a server has gone down
> or has come back up without it needing to be triggered by a resolv
> request?
>
> 3. Does anybody know if I can configure lwresd to expect particular zones
> to be resolved within very small windows and use this to fail over to the
> next server?
>
> And for discussion I wonder if there would be room to add to the resolver
> code and/or lwresd additional options of the form
>
> options zone-timeout: xxx.com:1usec
>
> or something similar, whereby the resolver could be told that if the cache
> does not respond within this time about that particular zone then it can
> be assumed that the server is misbehaving.
>
> Thank you for your attention
>
> Regards, Howard.

--
Kal Feher
RE: Verizon Users Can't See Site
From our AT&T based network it works, but the individual server digs (dns1 & dns2) were significantly slower than the dig in which I didn't specify a server.

$ dig @dns2.mbc.irides.com www-mbclive.mbc.irides.com

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 <<>> @dns2.mbc.irides.com www-mbclive.mbc.irides.com
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62957
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www-mbclive.mbc.irides.com. IN A

;; ANSWER SECTION:
www-mbclive.mbc.irides.com. 10 IN A 216.250.250.131

;; Query time: 52 msec
;; SERVER: 209.252.251.240#53(209.252.251.240)
;; WHEN: Tue Sep 14 07:14:02 2010
;; MSG SIZE rcvd: 60

(arg: 2) dig @dns1.mbc.irides.com www-mbclive.mbc.irides.com
$ dig @dns1.mbc.irides.com www-mbclive.mbc.irides.com

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 <<>> @dns1.mbc.irides.com www-mbclive.mbc.irides.com
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5716
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www-mbclive.mbc.irides.com. IN A

;; ANSWER SECTION:
www-mbclive.mbc.irides.com. 10 IN A 216.250.250.131

;; Query time: 24 msec
;; SERVER: 216.250.250.136#53(216.250.250.136)
;; WHEN: Tue Sep 14 07:14:31 2010
;; MSG SIZE rcvd: 60

$ dig www-mbclive.mbc.irides.com

; <<>> DiG 9.3.6-P1-RedHat-9.3.6-4.P1.el5_4.2 <<>> www-mbclive.mbc.irides.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30953
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;www-mbclive.mbc.irides.com. IN A

;; ANSWER SECTION:
www-mbclive.mbc.irides.com. 10 IN A 216.250.250.131

;; AUTHORITY SECTION:
mbc.irides.com. 86000 IN NS dns2.mbc.irides.com.
mbc.irides.com. 86000 IN NS dns1.mbc.irides.com.

;; Query time: 29 msec
;; SERVER: 10.0.4.99#53(10.0.4.99)
;; WHEN: Tue Sep 14 07:14:41 2010
;; MSG SIZE rcvd: 98

-----Original Message-----
From: bind-users-bounces+jlightner=water@lists.isc.org [mailto:bind-users-bounces+jlightner=water@lists.isc.org] On Behalf Of Torsten
Sent: Tuesday, September 14, 2010 2:23 AM
To: cybers...@comcast.net
Cc: bind-users@lists.isc.org
Subject: Re: Verizon Users Can't See Site

On Tue, 14 Sep 2010 05:15:16 +0000 (UTC), cybers...@comcast.net wrote:

> Hello List,
>
> I've run into an issue that has me stumped for the time being. I'm
> working on a website that is hosted on a delegated subdomain. The
> site is www-mbclive.mbc.irides.com. The mbc.irides.com subdomain is
> delegated to two Barracuda load balancers known as
> dns1.mbc.irides.com and dns2.mbc.irides.com.
>
> DNS seems to work fine for the majority of our users, however, in the
> past week we've heard from many Verizon FIOS users that they are
> unable to visit the site due to resolution issues. One sent in a dig
> from his home computer and I was wondering why he doesn't receive an
> answer:
>
> scott$ dig @71.252.0.12 www-mbclive.mbc.irides.com
>
> ; <<>> DiG 9.6.0-APPLE-P2 <<>> @71.252.0.12 www-mbclive.mbc.irides.com
> ; (1 server found)
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62184
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
>
> ;; QUESTION SECTION:
> ;www-mbclive.mbc.irides.com. IN A
>
> ;; AUTHORITY SECTION:
> www-mbclive.mbc.irides.com. 10 IN SOA dns1.mbc.irides.com. 1. 3600 3600 3600 3600 3600
>
> ;; Query time: 20 msec
> ;; SERVER: 71.252.0.12#53(71.252.0.12)
> ;; WHEN: Mon Sep 13 21:31:08 2010
> ;; MSG SIZE rcvd: 86
>
> Can anyone tell if there is a DNS issue on our end that may cause us
> to not play nice w/ Verizon? This issue just popped up in the last
> two weeks. Prior to that time visitors were not complaining. Any
> assistance is greatly appreciated.

I'm having troubles getting an answer from both dns1.mbc.irides.com and dns2.mbc.irides.com for www-mbclive.mbc.irides.com.

A dig query freezes for about 12 seconds before returning an answer. Maybe there's a problem with a misconfigured firewall.

[...@localhost ~]$ traceroute -q 1 dns2.mbc.irides.com
traceroute to dns2.mbc.irides.com (209.252.251.240), 30 hops max, 60 byte packets
 1 10.43.64.254 (10.43.64.254) 0.336 ms
 2 vl67.cr30.isham.de.easynet.net (194.64.6.252) 0.927 ms
 3 ge1-5.br2.isham.de.easynet.net (194.64.4.126) 0.695 ms
 4 ge3-0-2.gr10.isham.de.easynet.net (87.86.71.244) 0.632 ms
 5 te2-0-0.gr10.ixfra.de.easynet.net (87.86.77.95) 9.862 ms
 6 ge-5-1-4.edge3.frankfurt1.level3.net (212.162.40.77) 9.964 ms
 7 vlan79.csw2.Frankfurt1.Level3.net (4.68.23.126) 18.392 ms
 8 ae-72-72.ebr2.Frankfurt1.Level3.net (4.69.140.21) 10.387 ms
Timeouts and retries on high speed Lans
I have been working on building out a couple of large data centres and have been struggling with how to set up the systems so that we get a high resilience, highly responsive DNS service in the presence of failing equipment.

The configuration we have adopted includes a layer of BIND 9.6.x servers that act as pure name server caches. We have six of these servers in each data centre paired to provide service on VIPs so that if one of the pair fails the other cache takes over.

Our resolv.conf is of the following form.

search xxx.com yyy.com
nameserver 10.1.1.1
nameserver 10.1.2.1
nameserver 10.1.3.1
options timeout:1 attempts:15 no-check-names rotate

The name servers are thus on different networks within the DCs.

Our first problem arises because the timeouts seem to be taken serially on each server rather than the rotate applying between each name server request. Is this what I should have expected, i.e. a 15 second timeout before the next server is tried in sequence?

The second problem we face is that even if we could get a one second timeout this is orders of magnitude too slow for names that should be resolved within our local name space. In other words for lookups within the xxx.com and yyy.com domains I would like to see timeouts in the micro-second range.

Thinking further about this problem I have been considering whether the resolver should be multi-threaded or parallelised in some way so that it tries all of the servers at once and accepts the first to respond. I have come to the conclusion that this would be too difficult to make resilient in the general use of the resolver code, but would make sense if the lwresd layer is added to the equation.

Which brings me on to the use of lwresd: this would reduce the incidence of problems with non-responsive servers in that it would detect and switch to an alternative server on the first failed attempt. However, this still means that if lwresd has not detected the down server then we get a stall in response within the data centre.

So my questions are:

1. Does anybody have any experience in building such systems and suggestions on how we should tune the clients and servers to make the system less fragile in the presence of hardware, software and network failures?

2. Is it possible with lwresd as it is written today to get the effect of precognition - i.e. can I get lwresd to notice that a server has gone down or has come back up without it needing to be triggered by a resolv request?

3. Does anybody know if I can configure lwresd to expect particular zones to be resolved within very small windows and use this to fail over to the next server?

And for discussion I wonder if there would be room to add to the resolver code and/or lwresd additional options of the form

options zone-timeout: xxx.com:1usec

or something similar, whereby the resolver could be told that if the cache does not respond within this time about that particular zone then it can be assumed that the server is misbehaving.

Thank you for your attention

Regards, Howard.
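Editor's note: the "try all of the servers at once and accept the first to respond" idea from the message above can be sketched as follows. This is not how the stub resolver or lwresd behaves; it only illustrates the proposal. The function `first_answer` and its `query_fn` parameter are hypothetical, and `query_fn` stands for whatever actually sends the DNS query to one server (it should enforce its own socket timeout).

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed


def first_answer(servers, query_fn, timeout):
    """Query every server in parallel and return (server, result) for the
    first one that answers without error, or None if none answers within
    `timeout` seconds."""
    pool = ThreadPoolExecutor(max_workers=len(servers))
    futures = {pool.submit(query_fn, server): server for server in servers}
    try:
        for future in as_completed(futures, timeout=timeout):
            try:
                return futures[future], future.result()
            except Exception:
                continue  # this server failed; keep waiting for another
    except TimeoutError:
        pass  # nobody answered inside the window
    finally:
        # Do not block on slow servers; their threads finish on their own.
        pool.shutdown(wait=False)
    return None
```

With this shape a dead server costs nothing as long as one healthy server answers, at the price of multiplying query load by the number of servers, which may matter on busy caches.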