On Oct 22, 2014, at 23:03, Mark Allman <[email protected]> wrote:

> 
> 
> The paper quantifies this cost for .com.  We find that something like 1%
> of the records change each week.  So, while increasing the TTL from the
> current two days to one week certainly sacrifices some possible
> flexibility, in practical terms the flexibility isn't being used.

I think your definition of “used” is flawed.  1% of the records in .com is 
actually over a million delegations.  That seems like quite a lot of use, to 
me.  
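
For scale, a quick back-of-envelope (the ~112 million .com delegations is my 
rough figure for 2014, not a number from the paper):

    # Rough scale of "1% of the .com zone changes each week".
    com_delegations = 112_000_000   # approximate 2014 zone size (my estimate)
    changed_per_week = int(0.01 * com_delegations)
    print(f"{changed_per_week:,} delegations change per week")  # ~1,120,000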

But, let’s consider the actual effects of changing TTLs in a delegation-centric 
zone.

Today, the TTLs of NS records at the parent are, by and large, ignored by 
resolvers.  The parent NS set is non-authoritative data, and it’s my 
understanding that most resolvers overwrite its TTL with the TTL of the 
authoritative NS records in the child zone.  Under current normal operation, 
changing the TTL in the TLD would therefore have little or no effect on the 
frequency of queries to the TLD.
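
(A quick way to see the two TTLs side by side, sketched with Python’s 
dnspython; the server addresses are illustrative rather than load-bearing: 
192.5.6.30 should be a.gtld-servers.net, and 199.43.135.53 should be 
a.iana-servers.net, one of example.com’s authoritative servers.)

    import dns.message
    import dns.query

    q = dns.message.make_query("example.com.", "NS")

    # The .com server answers non-authoritatively: the NS set arrives in
    # the AUTHORITY section, carrying the TLD's two-day TTL.
    parent = dns.query.udp(q, "192.5.6.30", timeout=5)
    for rrset in parent.authority:
        print("parent:", rrset.ttl, rrset)

    # The child's own server answers authoritatively: the NS set is in
    # the ANSWER section with whatever TTL the zone operator chose, and
    # that is the TTL most resolvers end up caching.
    child = dns.query.udp(q, "199.43.135.53", timeout=5)
    for rrset in child.answer:
        print("child: ", rrset.ttl, rrset)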

If resolvers were to begin paying more attention to the parent TTL, then 
raising the TTL in the TLD zone would force that 1% of customers (over a 
million a week) to wait up to a week for the old records to expire every time 
they update their NS set.  That’s a significant operational change for zone 
operators.  

It’s even worse for DNSSEC, where the DS record in the parent zone is actually 
authoritative data.  Forcing people to take a minimum of a week to do a 
security roll isn’t going to go over well.  Even the standard one or two days 
that most TLDs use today is too long for some people in this case.
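
(To make the timing concrete: in a double-DS KSK roll you can’t retire the 
old key until caches can no longer hold the old DS, so the parent’s DS TTL 
puts a floor under how fast the roll can go.  A toy calculation, ignoring 
publication and propagation delays:)

    DAY = 86400
    # Minimum safe wait after changing the DS set is bounded below by the
    # parent's DS TTL (RFC 6781-style reasoning; other delays ignored).
    for ds_ttl_days in (2, 7):
        wait = ds_ttl_days * DAY
        print(f"DS TTL of {ds_ttl_days} days -> old DS can linger in "
              f"caches for up to {wait // 3600} hours")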

> 
>  - As noted in the paper 93% of the zones see no increase in our
>    trace-driven simulations.  That is, they are accessed by at most one
>    end host per TTL and therefore see no benefit from the shared cache
>    and hence will see the same load regardless of whether it is an end
>    host or a shared resolver asking the questions.

How does this compare to resolvers with one or two (or four) orders of 
magnitude more clients behind them?  You were watching a network with roughly 
100 clients behind a resolver; this doesn’t seem to be representative of the 
Internet at large, where a very large number of the clients are served by a 
very small number of recursive servers.  Have a look at recent work from Geoff 
Huston.  While I can’t put my hands on the reference at the moment, I seem to 
recall him having data suggesting that ~25% of the clients sit behind ~1% of 
the resolvers (I can find the reference[1] that puts 16% of the Internet 
behind Google alone).  That’s a very different world from the one extrapolated 
from 100 users behind each resolver.
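
(Why the client count matters so much: under a toy model with Poisson query 
arrivals, a shared cache’s hit ratio for a single name works out to 
R*TTL / (R*TTL + 1), where R is the aggregate query rate behind the cache.  
The per-client rate below is an assumption I made up purely for illustration:)

    # Toy shared-cache model: Poisson arrivals at aggregate rate R.  A
    # miss starts a TTL window (every query inside it hits); the first
    # query after expiry misses again, so hit ratio = R*ttl/(R*ttl + 1).
    def hit_ratio(n_clients: int, per_client_rate: float, ttl: float) -> float:
        r = n_clients * per_client_rate
        return (r * ttl) / (r * ttl + 1.0)

    TTL = 2 * 86400               # two days, today's .com NS TTL
    RATE = 1.0 / (30 * 86400)     # assume one lookup/client/month (made up)
    for n in (1, 100, 1_000_000):
        print(f"{n:>9} clients: hit ratio ~ {hit_ratio(n, RATE, TTL):.2%}")
    # ~6% for 1 client, ~87% for 100, effectively 100% for a million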

> 
>  - Or, put differently ... We are not pretending that there is no
>    additional cost at some auth servers.  But, this additional cost
>    does buy us things.  So, it is simply a different tradeoff than we
>    are making now.

It’s externalizing costs, not a trade-off.  One entity is not making a change 
and then gaining some things and losing others.  One entity is making a change 
(e.g. an ISP shutting down its resolver) and gaining a reduction in expenses.  
Meanwhile other entities (e.g. their end users) lose some processor cycles to 
their new private resolvers, and some time to increased RTT due to cache misses 
(see above for why I don’t accept that this is not a problem).  A third set of 
entities (authoritative operators) lose quite a lot to a significant increase 
in operational costs.  

> - There is also a philosophical-yet-practical argument here.  That is,
>    if I want to bypass all the shared resolver junk between my laptop
>    and the auth servers I can do that now.  And, it seems to me that
>    even given all the arguments against bypassing a shared resolver
>    that should be viewed as at least a rational choice.  So, in this
>    case the auth zones just have to cope with what shows up.  So, do we
>    believe that it is incumbent upon (say) AT&T to provide shared
>    resolvers to shield (say) Google from a portion of the DNS load?
>    Or, put differently, the results in the paper suggest that there
>    really isn't much for AT&T to gain from providing those resolvers,
>    so why should it?  One argument here could be that AT&T is trying to
>    provide its customers better performance.  But, the paper shows this
>    is really not happening (which is largely a function of pervasive
>    DNS prefetching).  So, if I am AT&T I'd be thinking "hey, what am I
>    or my customers actually gaining from this complexity I have in my
>    network?!".  And, if the answer is little-to-nothing then it seems
>    rational to not provide this service.  Or, so it seems to me.

It doesn’t look to me like your paper has done anything to capture what it 
looks like behind AT&T’s resolvers, so I’m not sure how you can come to that 
sort of conclusion.  In 2012, AT&T had around 107 million mobile users alone (I 
found that number[2] more easily than their home Internet users, so I’m using 
that).  I can guarantee you AT&T isn’t running a million separate recursive 
resolvers.  It’s easily in the very low thousands of servers (likely fewer 
than 1,000 country-wide).  The cache hit/miss ratios in that environment are 
entirely 
different from your study, but far more representative of the average user’s 
experience.

I actually support end-users having their own iterative resolver, as it fixes 
all of the last-mile problems in DNSSEC validation.  However, the benefit of 
the shared cache cannot be denied; or, at least, it hasn’t been denied by 
this study.


[1]: Measuring DNSSEC, Geoff Huston 
     <http://www.potaroo.net/presentations/2014-06-03-dns-measurements.pdf>

[2]: Grading the top 10 US carriers in the fourth quarter of 2012 
     <http://www.fiercewireless.com/special-reports/grading-top-10-us-carriers-fourth-quarter-2012>


