Aaron S. Joyner wrote:

Jon Carnes wrote:

On Tue, 2006-05-16 at 23:57, Aaron S. Joyner wrote:

Friendly public service announcement (I'm sure Jon knows, but I can't let a statement like the above go by without responding). Assuming you have some semblance of control over the DNS records themselves, you should lower the TTL before you change the IP (or name) associated with that record, and then raise the TTL again after the change has stabilized. Let's consider a hypothetical scenario. You run a web server, www.example.com. You're going to change providers, and thus change the IP of the machine serving www.example.com. The steps to follow go something like this:

1: Examine the current record, determine how long the TTL is (we'll say it's 3 days, or 259200 seconds).
2: At least one current-TTL-interval (3 days) before you intend to make the change, update the TTL for that record (and all other potentially affected records) to be very low, for example 5 mins (300 seconds).
3: Test the new setup on the new IP, then 'throw the switch' by changing the DNS record.
4: Establish that everything is working as expected, perhaps wait 1 day to be sure.
5: Make a final DNS update to return the TTL to its previous long / stable value.
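
To make the timing in those steps concrete, here is a minimal Python sketch. The function name, the dictionary keys and the cutover date are made up for illustration; only the 3 day TTL, the 5 minute TTL and the 1 day wait come from the steps above.

```python
from datetime import datetime, timedelta

def ttl_change_schedule(cutover, current_ttl_s, low_ttl_s, soak=timedelta(days=1)):
    """Return the key times for the lower-TTL / switch / restore-TTL sequence."""
    return {
        # Step 2: lower the TTL at least one full current TTL before the switch,
        # so every cached long-TTL answer has expired by the time you cut over.
        "lower_ttl_by": cutover - timedelta(seconds=current_ttl_s),
        # Step 3: change the record itself.
        "switch_at": cutover,
        # Caches holding the old address give it up within one low TTL.
        "old_ip_gone_by": cutover + timedelta(seconds=low_ttl_s),
        # Steps 4 and 5: watch things for a while, then restore the long TTL.
        "restore_ttl_after": cutover + soak,
    }

# Hypothetical cutover time; 3 days and 5 minutes expressed in seconds.
schedule = ttl_change_schedule(datetime(2006, 5, 20, 12, 0), 3 * 24 * 3600, 5 * 60)
for name, when in schedule.items():
    print(f"{name:>18}: {when:%Y-%m-%d %H:%M}")
```

The point of the arithmetic is simply that the lead time is set by the old TTL, while the length of the window in which clients may still see the old address is set by the new, low one.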

This way, your DNS records can normally have nice long cache times, which lowers your bandwidth bill and your users' latency, while still giving you the ability to cut service over quickly, and it makes the Internet a healthier place. This makes everyone happy. :)

As an exercise for the reader, how would you handle migrating your DNS server(s) from one IP address (or one subnet) to another, using similar techniques? Do you need to talk to someone outside your organization, or can you do it all in-house? Are you sure of your answer to that last question? How would you find out for sure... :) A Google T-shirt(*) to the person who comes up with the best / most complete answer(+).

Aaron S. Joyner

* - Size of your choice, in white or black:
http://www.googlestore.com/product.asp?catid=5&code=GO0108
http://www.googlestore.com/product.asp?catid=5&code=GO13022

+ - Final decision about answer quality is at my sole discretion, although I promise to be as fair as possible. Credit for information posted will come on a first-come, first-served basis, i.e. if someone posts a 90% complete answer, and you rephrase their answer plus 10% more, unless that 10% is really critical they'll probably be considered to have the better answer. Hence, posting sooner is better, but I'll probably wait either until every angle has been exhausted or at most 5 days. Time differences of less than roughly 2-3 mins in time sent are not considered noteworthy.



Well who could resist that offer... especially since I move folks' DNS
servers over to our ISP all the time (and we've never lost a look-up
yet!).

1) On the old servers, set the TTL to 4 hours (14400) or less. Set the
SOA Refresh interval to 20 minutes (1200) if you expect to keep some of
the current secondary NS servers up and running. This tells the
secondaries to check in every 20 minutes for updates.

2) On the new servers, set up the Name info for the domain. Be sure the
SOA is set up properly to reflect the new server. Make sure you list your
new Name servers as DNS entries.

3) Once the new servers are set up and running you can simply go to your
domain registrar (e.g. GoDaddy.com) and change your Name servers. The change
will take a while, so you need to get this done a few days to one week
prior to when you want to make the move. We find that 48 hours pretty
much does the trick. A check of the logs indicates if any traffic is
still going to the old servers.

... and that is pretty much it unless you are also changing IP ranges.


Check your Name server setup by visiting:
 http://www.dnsreport.com
<Trilug does fairly well here - only having one red mark - It's an open
DNS server and these days the Black hat guys can exploit that>


Use the "whois" command to see what your current Name servers are set to
at the Internic:
 Name Server:NS.WAYFARER.ORG
 Name Server:NS2.TRILUG.ORG

Use the command "host -t ns <domain name>" to see what your primary name
server *thinks* your Name servers are... these should agree.
  host -t ns trilug.org
    trilug.org name server ns.wayfarer.org.
    trilug.org name server ns2.trilug.org.


Jon Carnes


I'm really surprised no one else has picked up this thread and run with it. :) Both Jon and Tanner had good answers, so I'll send them both a T-shirt (let me know your size and color preference, privately if you prefer). I'll point out some common misconceptions from their answers, and pose some additional thinking points. Jon and Tanner, give it a day or so before you respond, if you'd like to. :) If things aren't completely fleshed out by Monday evening, I'll try to remember to hit this thread again and tidy up the loose ends.

So here I am again, it's Thursday before I found the time, but that time has come. I'll probably actually make a few replies to individual messages in this thread, but I'll start by breaking with good etiquette and replying to my own questions, to set out what I was initially thinking. The other replies will be specific comments on side issues raised by Rick and Tanner. Get your coffee, this one's long, but hopefully it'll be useful. :)

1) whois is not used by DNS in any manner whatsoever. It's used by humans, as a database maintained by the registrars, of contact information for a given domain. If I want to look up the name servers for a domain, I should use host or dig (no, you really shouldn't use nslookup, but it would work :) ). If I want to look up who to contact about that domain, I should use whois. Well, I'd probably trust the email contact listed in the SOA more, but if I needed more traditional contact methods, à la name, phone, address, whois provides that. How would I use host or dig to find out what the delegating entity believes my name servers are, i.e. instead of the whois command Jon suggested: `whois trilug.org`?

Ian later pointed out that Jon did mention "host -t ns <domain>", but that's not really a complete replacement for whois. The command as cited uses the local resolv.conf on the computer to determine which name server to talk to, which may give you a different view than whois. So the replacement for whois which, using DNS instead of the whois protocol, determines what the registrar is providing as NS records for your domain, is something like this:
dig -t ns <domain> @<registrar's NS>

So continuing to use trilug.org as the example, we would do this:

$ dig -t ns trilug.org @tld3.ultradns.org

Which gives us this:
; <<>> DiG 9.2.1 <<>> -t ns trilug.org @tld3.ultradns.org
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44012
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;trilug.org.                    IN      NS

;; AUTHORITY SECTION:
trilug.org.             86400   IN      NS      ns2.trilug.org.
trilug.org.             86400   IN      NS      ns.wayfarer.org.

;; ADDITIONAL SECTION:
ns2.trilug.org.         86400   IN      A       64.244.27.142
ns.wayfarer.org.        86400   IN      A       66.139.75.19

;; Query time: 7 msec
;; SERVER: 199.7.66.1#53(tld3.ultradns.org)
;; WHEN: Thu May 25 22:42:02 2006
;; MSG SIZE  rcvd: 104

From here we can see that tld3.ultradns.org returns two NS records for trilug.org, ns.wayfarer.org and ns2.trilug.org, and it returns two glue records (the respective A records) to help us talk to those servers. This is particularly important in the case of ns2.trilug.org, as without the glue we would have a chicken-and-the-egg problem of needing to ask ns2.trilug.org what its IP address is. :)
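
If you'd rather check this mechanically than by eye, a rough Python sketch along these lines pulls the NS targets and glue out of saved dig output. It works on pasted text rather than running dig itself, the function name is my own invention, and the sample is just the authority and additional sections from the output above.

```python
def parse_delegation(dig_output):
    """Split dig output into NS targets and glue A records."""
    nameservers, glue = [], {}
    for line in dig_output.splitlines():
        fields = line.split()
        # Record lines have five fields: name, TTL, class, type, data.
        # Comment lines start with ';' and are skipped.
        if len(fields) != 5 or line.startswith(";"):
            continue
        name, _ttl, _cls, rtype, data = fields
        if rtype == "NS":
            nameservers.append(data)
        elif rtype == "A":
            glue[name] = data
    return nameservers, glue

SAMPLE = """\
;; AUTHORITY SECTION:
trilug.org.             86400   IN      NS      ns2.trilug.org.
trilug.org.             86400   IN      NS      ns.wayfarer.org.

;; ADDITIONAL SECTION:
ns2.trilug.org.         86400   IN      A       64.244.27.142
ns.wayfarer.org.        86400   IN      A       66.139.75.19
"""

nameservers, glue = parse_delegation(SAMPLE)
zone = "trilug.org."
for ns in nameservers:
    # A name server whose own name lives inside the delegated zone cannot be
    # found without glue: that is the chicken-and-egg case.
    in_zone = ns.endswith("." + zone)
    print(ns, glue.get(ns, "no glue"), "(glue required)" if in_zone else "")
```

Running the same parse over what your own name server returns and comparing the two sets of names is one way to spot a disagreement between the delegation and the zone.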

2) Neither Tanner nor Jon touched on who you actually need to contact to update the information in the "whois" record. There's a good buzzword name for that company or entity, which I'm sure they both know, but neglected to mention directly.

So the word "Registrar" was eventually thrown out, and there was some discussion about registry vs registrar, but all that is quite irrelevant. :) You must use some contact method (used to be a form email, these days it's almost always a website) to provide new information to the registrar, who then updates the name servers *and* the whois registry information.


3) Nobody touched on this fun and interesting angle: Can you do it without talking to that entity, and what interesting things happen if you try? (hint: this more often happens by accident)

So this was perhaps the original interesting angle I hoped to bring up with all this, which really requires a thorough exploration of how DNS delegations work. Consider the lame delegation scenario that was mentioned in this thread. This is actually a very common problem, and when I joined the steering committee for TriLUG (or perhaps it was shortly before) I found this to be an actual problem with the trilug.org domain. First, you specify ns1.trilug.org and ns1.example.org to your registrar. Then, 2 years go by, and no one pays attention. By then, ns1.example.org has thoroughly forgotten that they were supposed to be a secondary name server for trilug.org. They reinstall the server, take it off line, its building burns down, whatever. Now, when the registrar hands out NS delegations for your domain, 50% of the time (roughly) people will try to contact ns1.example.org. That server then replies that it's not authoritative for the trilug.org domain, and refers you back to something like a.root-servers.net or perhaps the .org. domain servers. This is a "lame" delegation, because you caused the client to do extra work by providing an invalid delegation, and making it chase a dead end. The client won't totally give up, it will then try the other server returned by the .org. name server. In the current example, that's ns1.trilug.org, which will then give it a valid response for whatever trilug.org name it was looking for.

So, now how can we leverage this to do interesting things? Let's consider the above example again, but assume ns1.example.org is still your friend, and hasn't forgotten about you. You want to move ns1.trilug.org from 3.3.3.3 (your old provider, General Electric) to 4.4.4.4 (your new provider, Level 3). Let's assume you used Cheap Ed's domain registry, and they only accept change requests delivered via carrier camel to their headquarters in the middle of the Gobi desert. That's time prohibitive (your contract with GE is up), so you need to do this before the registry can be updated.

You set up your new name server at 4.4.4.4, change the A record for ns1.trilug.org in the zone file on 4.4.4.4 to point to the new IP (don't forget to bump the serial), then give your friends at example.org a call and ask them to update the named.conf masters entry to point to 4.4.4.4. Now, shut off the name server at 3.3.3.3, and things will work dandily. You might incur a few tens or at worst hundreds of milliseconds of latency on the first lookup anyone does to your site (and every time the TTL on your NS records expires), but it will generally work.

Here's why. When the random client attempts to look up www.trilug.org for the first time, it asks the root servers, which direct it to the .org. servers, which return ns1.trilug.org (3.3.3.3) and ns1.example.org (1.1.1.1). Then 50% of the time that client tries the first address and fails to contact it. Then that client (and also the other 50% on the first try) contacts ns1.example.org. It returns *new* authority records (and the glue to go with them) which will have a higher or equal TTL, and thus refresh the existing record, for ns1.trilug.org's A record. All future queries will be split 50/50 between the two name servers, and the resolver will have the correct IPs cached for them. These are also likely to stay cached, as every time you look up a record in the trilug.org zone, you get new authority records, and the glue to go with them, so everything stays current on a frequently-used name server.

Tada, life with Cheap Ed's domain registry has been made a little more bearable. Once the camel arrives in the Gobi desert, and Cheap Ed updates the A record for ns1.trilug.org in the .org. name server, then the ridiculousness is resolved, and all queries flow normally and quickly.

If we change the scenario a little, to where you have only two name servers, and both are on the subnet you want to migrate away from (bad bad bad, diversity is good and important), then this whole idea breaks down, and you *have* to change it at the registrar, there is no other way. So the correct answer to my original questions ("Do you need to talk to someone outside your organization, or can you do it all in-house? Are you sure of your answer to that last question? How would you find out for sure...") is this: you have to look and see what the registrar is handing out. If you have other name servers that will continue to be accessible, it's possible (though not at all advisable) to skirt the registrar, either by accident or on purpose. If you don't have any other name servers, it's not possible, and you must talk to the registrar and coordinate with them. Of course, your first step is to find someone to be your secondary, and get them added at the registrar.

For the record, doing this for any length of time is futile, pointless, and bad for the health of the Internet. It's also highly misleading to people who might look at whois, the registrar's data, or at your domain in general. It may be particularly confusing and brain-hurting for people who don't live, eat, and drink DNS on a daily basis. It should only be considered useful for making migrations in strange circumstances, or as an amusing thought exercise, and should help to enlighten the original case, where you have accidentally set up a lame delegation.
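
For the thought-exercise crowd, here is a toy Python model of the Cheap Ed scenario. It is nothing like a real resolver: the dictionaries, the function name and the 3.3.3.4 address for a second server on the old subnet are all invented for illustration, and it always tries the delegated servers in list order instead of picking one at random.

```python
# What the .org servers still hand out (stale glue for ns1.trilug.org).
PARENT_DELEGATION = [("ns1.trilug.org.", "3.3.3.3"), ("ns1.example.org.", "1.1.1.1")]

# Servers that actually answer for trilug.org after the move, and the
# name server addresses each one reports in its own authority data.
AUTHORITATIVE = {
    "1.1.1.1": {"ns1.trilug.org.": "4.4.4.4", "ns1.example.org.": "1.1.1.1"},
    "4.4.4.4": {"ns1.trilug.org.": "4.4.4.4", "ns1.example.org.": "1.1.1.1"},
}

def resolve(delegation):
    """Try each delegated server in order; return (tries, refreshed NS cache)."""
    for tries, (_name, addr) in enumerate(delegation, start=1):
        zone_view = AUTHORITATIVE.get(addr)
        if zone_view is None:
            continue  # nothing listening there any more (3.3.3.3 is switched off)
        # The answer carries fresh NS data, which replaces the stale glue.
        return tries, dict(zone_view)
    raise LookupError("no delegated server answered")

tries, cache = resolve(PARENT_DELEGATION)
print(tries, cache)  # one wasted attempt, then the corrected addresses

# With both delegated servers on the retired subnet there is no way in.
try:
    resolve([("ns1.trilug.org.", "3.3.3.3"), ("ns2.trilug.org.", "3.3.3.4")])
except LookupError as err:
    print("unresolvable:", err)
```

The wasted first attempt is the extra latency mentioned above, and the second case is why a surviving, reachable secondary is what makes the trick possible at all.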

4) Neither of them mentioned whether any updates would be required on the secondary servers.

So as previously mentioned, you need to update the masters {}; statement on the slave, in order to change the IP address of the master. I believe this was eventually covered in the thread.

5) Much attention was given to the SOA, the authoritative name server mentioned in it, and its TTL. What role does each of these parts play? What do slaves use to determine who to pull the zone from? How do they decide to get a new copy of the zone? What role does the SOA play for persons other than the secondary?

Hmm... some fun questions here. Let's dissect a SOA record. First, the name. This is the "Start of Authority" record, meaning this describes who is responsible for the domain, and how communication should be handled with relation to humans and other DNS servers. Let's also cover a very important and often misunderstood point right up front. Almost nothing a client resolver does involves the SOA record. It mostly comes into play for humans and other BIND DNS servers. I say "almost" and "mostly" because resolvers (as per RFC 2308) do use the 'minimum' value from the SOA to determine how long they're allowed to cache negative answers (NXDOMAIN). The primary value for this record comes into play for BIND slaves, which use some of the timers we'll talk about in a moment. Let's take trilug.org's SOA as an example:

[EMAIL PROTECTED]:~$ dig +noall +answer soa trilug.org
trilug.org. 86400 IN SOA ns2.trilug.org. hostmaster.trilug.org. 2006051700 7200 3600 2419200 7200

or as it's written in the zone file (/etc/bind/trilug.org) on talon:
@ 1D IN SOA ns2.trilug.org. hostmaster.trilug.org. ( 2006051700 ; serial (YYYYMMDDNN)
                                       2H              ; refresh
                                       1H              ; retry
                                       4W              ; expiry
                                       2H )            ; minimum
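
The short notation in the zone file and the seconds in dig's output are the same numbers. A small Python sketch of the conversion, assuming only the usual S/M/H/D/W suffixes (the helper name is mine, not anything BIND provides):

```python
import re

UNIT_SECONDS = {"S": 1, "M": 60, "H": 3600, "D": 86400, "W": 604800}

def to_seconds(value):
    """Convert durations such as '2H', '4W' or plain '7200' to seconds."""
    if value.isdigit():
        return int(value)
    text = value.upper()
    parts = re.findall(r"(\d+)([SMHDW])", text)
    # Reject anything that is not entirely made of number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unrecognised duration: {value!r}")
    return sum(int(n) * UNIT_SECONDS[u] for n, u in parts)

# The values from the trilug.org zone file shown above.
for label, text in [("ttl", "1D"), ("refresh", "2H"), ("retry", "1H"),
                    ("expiry", "4W"), ("minimum", "2H")]:
    print(f"{label:>8} {text:>3} = {to_seconds(text)}")
```

Those results line up with the 86400, 7200, 3600, 2419200 and 7200 in the dig output.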

The first value (expressed as @ in the file itself, interpreted by BIND) is the domain to which this SOA record applies. The second value is the TTL; in the file it's expressed in short notation, while in dig's output (and the packet itself) it's expressed in seconds. Next, we have the class, IN, for an INternet type record, as opposed to a Hesiod or ChaosNet record (more info is left as an exercise to the reader). Next, the record type, SOA. That one's pretty self-explanatory.

Then, the MNAME, or the master name, the server that originally supplied this data, i.e. the name of the server on which the zone file is stored. Then, the RNAME (the responsible party, you could say). This should be a mailbox where someone responsible for the domain can be reached (used by humans or occasionally automatic-notification scripts). RFC 1035 originally suggested that a hostmaster alias be used; it was later formalized as a recommendation in RFC 1912.

The serial comes next, which is used by humans to roughly gauge the last time the zone was updated, and by slave DNS servers to determine whether the zone needs to be transferred again. The next value is 'refresh', which is used by slaves to determine how often they should check the SOA again to see if the serial has changed (and thus if they need to transfer a new copy of the zone data). Next comes 'retry', which defines how long they should wait to retry if a 'refresh' operation should fail. Then, 'expiry'. This covers how long a slave should continue to serve authoritative information for the zone, if it still can't contact the master server.

Last, we have 'minimum'. This was originally defined in RFC 1035 to be the minimum TTL for resource records in a given zone. In practice, it was always just used as the default, with overrides put in place as necessary. Later, RFC 2308 redefined it to mean the negative caching time of a record. So if your name server provides an NXDOMAIN response (indicating that foo123.trilug.org doesn't exist, for example), then the name server must also include the SOA in the authority section, and the TTL for the NXDOMAIN should be based on the 'minimum' value in the provided SOA. An example:

[EMAIL PROTECTED]:~$ dig +noall +answer +authority foo123.trilug.org @ns2.trilug.org
trilug.org. 7200 IN SOA ns2.trilug.org. hostmaster.trilug.org. 2006051700 7200 3600 2419200 7200

Note the lack of an answer (thus, NXDOMAIN), and the provision of the SOA, so we can cache the NXDOMAIN for 7200 seconds (the provided 'minimum', or last column of the SOA).
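
To tie the fields together, here is a rough Python sketch that splits the one-line SOA from dig's output into named fields. The class and function names are mine. The negative-cache calculation follows RFC 2308, which takes the lesser of the SOA's own TTL and its 'minimum' field; the serial check is a plain comparison and ignores the wrap-around arithmetic a real slave has to handle.

```python
from typing import NamedTuple

class SOA(NamedTuple):
    name: str
    ttl: int
    mname: str
    rname: str
    serial: int
    refresh: int
    retry: int
    expire: int
    minimum: int

def parse_soa(line):
    """Parse a one-line SOA record as printed by dig."""
    name, ttl, _cls, rtype, mname, rname, *timers = line.split()
    if rtype != "SOA" or len(timers) != 5:
        raise ValueError("not a one-line SOA record")
    return SOA(name, int(ttl), mname, rname, *map(int, timers))

def negative_cache_ttl(soa):
    # RFC 2308: cache an NXDOMAIN for min(TTL of the SOA, SOA minimum field).
    return min(soa.ttl, soa.minimum)

def slave_needs_transfer(local_serial, master_soa):
    # Simplification: assumes serials only ever increase and never wrap.
    return master_soa.serial > local_serial

def rname_to_mailbox(rname):
    # The first dot stands in for '@' (assumes no escaped dots in the local part).
    return rname.rstrip(".").replace(".", "@", 1)

soa = parse_soa("trilug.org. 86400 IN SOA ns2.trilug.org. hostmaster.trilug.org. "
                "2006051700 7200 3600 2419200 7200")
print(rname_to_mailbox(soa.rname), negative_cache_ttl(soa),
      slave_needs_transfer(2006051600, soa))
```

Note that nothing in there tells a slave where to fetch the zone from; the MNAME is just another field.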

Yay, now we've bored everyone to death with what the SOA really does. Note that it is not used by slaves to figure out who to talk to (that's only done in their named.conf, via the masters {}; statement), nor is it used for anything else magical that I haven't listed above (or at least not in any sensible implementations).

If a good answer comes along to all of these, I might feel compelled to toss out another T-shirt. If not, I'll be sure to eventually answer all my own questions for the curious minded folks. :)

So Rick provided some good, interesting and insightful discussion. He gets a T shirt too. I still haven't gotten a size from Tanner, so Tanner and Rick - get me some size and color preference info. :)

Aaron S. Joyner

--
TriLUG mailing list        : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ  : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
