>James Craig Burley wrote:
>
>> Do the local caches depend on locality of reference to achieve high
>> hit rates?
>>
>> If so, can you be *sure* that all SPF lookups for incoming emails will
>> exhibit sufficient locality of reference to ensure those hit rates
>> remain high?
>
>You are looking at DNS from the "consumer" point of view rather than the
>"provider" point of view (which is where I am coming from). By setting
>a reasonable TTL on my DNS records, I am very confident that _my_
>authoritative servers will not be hit by every yahoo with a resolver
>more than X percent of the time. The precise value of X depends on both
>how popular my domains are and how long a TTL I have assigned. I can
>ensure that _my_ e-mail won't be delayed due to timeouts for SPF
>records.

Which email won't be delayed -- your outgoing or your incoming email?
Your outgoing email certainly can be delayed by SPF, if external servers
(over which, and over whose DNS caches, you have no control) rely on SPF
to indicate whether your email is forged. Your incoming email can be
delayed if you rely upon SPF to tag or discard it, unless you are
certain that your DNS lookup latencies all the way to the Root servers
are low enough to avoid those problems.

(Also, I'm not sure about this, but it seems to me that TTLs for DNS
records pertaining to SPF can only be so long before they actually
prevent users from taking advantage of the flexibility and mobility
offered by today's and tomorrow's technologies. As an extreme example,
check the TTL for jcb-sc.com; since I'm dynIP-hosted, my dynIP DNS
provider has to be prepared for a short-term change. You might have an
extremely static user base; not everyone does, and progress promises
*more* mobility, not less, over time.)
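(If you want to check such TTLs for yourself, here's a minimal sketch --
assuming Perl with the Net::DNS module installed; the domain and record
type defaults are just examples:)

  #!/usr/bin/perl
  # Print the TTL on each record returned for a domain.
  # A rough sketch; assumes the Net::DNS module is available.
  use strict;
  use warnings;
  use Net::DNS;

  my $domain = shift || 'jcb-sc.com';
  my $type   = shift || 'A';       # try 'TXT' for SPF-style records
  my $res    = Net::DNS::Resolver->new;
  my $packet = $res->query($domain, $type)
      or die "$type lookup for $domain failed: ", $res->errorstring, "\n";
  printf "%s %s: TTL %d seconds\n", $_->name, $_->type, $_->ttl
      for $packet->answer;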
>> If not, can you be *sure* that latencies resulting from cache misses
>> will not, due to overloaded upstream DNS servers, result in email
>> deliveries being delayed or even failing due to persistent delays?
>
>Any SPF query for a second level .COM domain should normally take 3
>queries to different servers to resolve:
>
>> $ dnstracer -q txt -os . rowman.com
>> Tracing to rowman.com via A.ROOT-SERVERS.NET, timeout 15 seconds
>> A.ROOT-SERVERS.NET [.] (198.41.0.4)
>>  |\___ M.GTLD-SERVERS.NET [com] (192.55.83.30)
>>  |      |\___ b.ns.rowman.com [rowman.com] (12.151.2.99) Got authoritative answer
>>  |       \___ a.ns.rowman.com [rowman.com] (12.38.22.1) Got authoritative answer
>
>(More queries would be involved only for subdomains, or for CC TLDs,
>which might have intervening servers.) The root servers are pretty
>resilient, and even the COM servers are heavily redundant, so that
>leaves only the last leaf node(s) as the source of any possible
>latency.

So, SPF queries involve only second-level (or higher) domain names??

>> If not, what prevents spammers from targeting this weakness with an
>> attack that culminates in a victim site disabling SPF, just as
>> they've attacked system(s) offering C/R for popular mailhosting
>> sites (like hotmail.com)?
>
>I don't understand what weakness you see, so I cannot argue against it.

A spammer uses a large number of zombie machines to inject emails into
your system. Each email is forged such that your system has to
continuously perform SPF lookups for nonexistent or irrelevant domain
names. The spammer thus attacks your DNS cache and/or lookup latencies.
A simple countermeasure is to disable SPF lookups for the duration of
such an attack, of course -- which is exactly the outcome the attacker
is after.
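(You can measure what each such forced lookup costs on your own resolver
with something along these lines -- again a rough sketch, assuming Perl
with the Net::DNS and Time::HiRes modules; rowman.com is just the name
from the trace above, and a cold cache shows the full root-to-leaf
cost:)

  #!/usr/bin/perl
  # Time a single SPF-style TXT lookup through the local resolver.
  use strict;
  use warnings;
  use Net::DNS;
  use Time::HiRes qw(gettimeofday tv_interval);

  my $name = shift || 'rowman.com';
  my $res  = Net::DNS::Resolver->new;
  my $t0   = [gettimeofday];
  my $packet = $res->query($name, 'TXT');
  printf "%s: %s in %.3f seconds\n", $name,
      ($packet ? 'answered' : 'no answer (' . $res->errorstring . ')'),
      tv_interval($t0);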
>> Nope. No need to do rDNS lookups to accept incoming emails. Since I
>> turned that (and IDENT) off, incoming bounces of joe jobs from, e.g.,
>
>Good for you. ;) You are missing my point in that I (and I'd guess the
>vast majority of the people on this list) am making multiple DNS
>queries for each inbound transaction, to no ill effect (most of the
>time).

As I already pointed out, their lookup keys are presumably IP addresses,
not even-more-arbitrary-and-easily-forged domain names. And I believe
many people have already experienced unacceptable delays processing
incoming email as a result of employing such checks.

>I am only interested in making sure that my outbound mail proceeds
>apace, which I can control by the appropriate use of _my_ SPF records.

Great, except you can't really control how quickly your outbound email
is accepted by an external server that might, itself, be under attack in
order to get it to stop checking SPF records (for your site as well as
others). In other words, even if *you* ignore SPF (other than publishing
records), SPF can hurt your outbound email pretty seriously if it
becomes a target for attack (on external servers accepting your outgoing
email).

(In a sense, SPF is a sort of mild form of Challenge/Response, which has
failings proven to be intrinsic to its nature. Instead of challenging a
given email address, it challenges the domain name of the address to
show, via DNS info, that its actual source is considered to be a
permitted source for outbound email. It is therefore more lightweight
and, yet, less flexible than C/R, at least in principle.)

(I haven't touched on SRS at all yet. Lots of knowledgeable people
consider SPF unusable because of its relationship to SRS.)

>I could care less, in practice, if inbound mail takes 1 second, 10
>seconds, or 10 minutes.

That's not true for everyone; I happen to prefer immediate acceptance of
email with a higher rate of spam over long-delayed email with a lower
rate, so I employ no expensive anti-UBM measures. SPF is therefore
likely to be of little use to me -- deployed as an automated measure for
tagging or discarding inbound email, that is.

(*Publishing* SPF records would be useful to me. Because of a
combination of unsupported TXT records in my DNS server and bug(s) in
BIND caches, I've had some systems reject my legit outgoing email
because they think I'm publishing SPF records that say I do not permit
it from my own site, or something like that.)

>If I had a much higher level of inbound traffic, I could simply add MX
>servers until I reduced the load to acceptable levels of my own
>choosing. The MX servers aren't doing anything else, so I could care
>less if they are spending 90% of their time waiting for DNS queries to
>return, as long as they aren't saturated and dropping connections.

You're combating a potentially exponential problem with mere multiples
of resources. It's like being told that the Traveling Salesman Problem
(TSP) takes longer to solve as it scales up, and answering "not a
problem, I can simply add more processors". SPF isn't NP-complete, but
it does expose a database to an unacceptably high rate of arbitrary,
possibly hostile, queries that are actually made by trusted sources
(SMTP servers for trustworthy hosts), albeit based on input from
untrusted sources.

>> Yes, that's true. So, assuming someone trusts every connection
>> between Root and your particular place in the hierarchy, they can
>> trust you and your SPF records.
>
>The only "trust" involved is the same as every other DNS transaction:
>follow the chain of delegations down from the root. The only server
>which is authoritative for *my* SPF records is *my* server. If a user
>trusts their resolving cache, the cache will follow the chain and get
>the appropriate answer, just as if they looked up the MX record.
>
>If they are running one of the BIND versions (or M$loth's NT4 resolver)
>which was susceptible to a poison cache exploit, they are screwed in
>any case. The SPF is the least of their worries.

Well, yes, but I'm referring to something else anyway.

>> But if they don't trust *any* connection in that chain, they must
>> *not* trust you or your SPF records.
>
>This isn't like a chain of authority in trusted certificates, where one
>untrusted link makes the chain untrustworthy. In DNS, _every_ query
>starts at the root servers and works down; there is always a clear
>chain of delegations, and as long as everything is configured properly,
>the query will be answered only by the server which is authoritative
>for that portion of the address space. If everything is not configured
>correctly, you won't get any answer (lame server). NOTE: here I am
>talking about your caching resolver (which will walk the chain IFF it
>doesn't already have an answer in its cache).

Again, you're basically correct here, except...

>> (That is, one cannot trust the SPF records for a.b.c.d.e if one
>> cannot trust c.d.e to publish proper and trustworthy DNS info for
>> itself.)
>
>Again, you are missing the important feature of DNS that c.d.e must
>proactively delegate subdomains for them to be valid. No matter what
>a.b.c.d.e might want to publish, no one is going to get there to find
>out what that record contains if c.d.e doesn't delegate that range to
>another server.

...which is exactly the point: you have to trust that c.d.e cannot and
will not make delegations to untrustworthy entities.

Since the whole *point* of this exercise is to determine
trustworthiness, we must *assume* that there will be cases of "partial
trust" -- you might trust c.d.e's *own* published DNS information, but
not *all* of the delegations it makes to its own subdomains. Therefore,
for the system to work and communicate trust, it must be capable of
determining the trustworthiness of delegations to each subdomain in the
chain.

Now, that's true in general, of course, but a random user on your system
is unlikely to initiate contact to a.b.c.d.e on his or her own, whereas
your system will, in response to an incoming email specifying a.b.c.d.e
as the originating host, assume that the email is *not* forged if it is
given SPF records for a.b.c.d.e permitting emails from that source. In
other words, SPF will, in cases like this, make email appear to be more
trusted than email normally does today, when in fact it should be *less*
trusted.

A crucial element of usability is at issue here: if you deploy a system
that proposes to distinguish trustworthiness (e.g. whether something is
forged), then false positives (falsely denoting as trustworthy something
that isn't) will have *much* more negative consequences than never
deploying the system in the first place, because your end users will
*assume* the system you deployed is faithful.

>For giggles' sake, try these two queries:
>
>  dig login.oscar.aol.com
>  dig @12.38.22.1 login.oscar.aol.com

I'm in no mood for giggles, dammit! (Just kidding...will try it later.
;-)

>> Put another way: are you sure you will be able to trust *all* SPF
>> records published in the .cn domain? The .ru domain? The .biz
>> domain?
>
>No, but I can make sure that someone claiming to be sending from
>jcb-sc.com is employing a server that the administrator of that domain
>has authorized to send from. SPF is not intended to prevent spam; it is
>intended to prevent forgery. It is just one more tool in the arsenal.
>I could very well trust SPF for 2nd level domains only (third level for
>CCTLDs).

What your explanation tells me is that, for sites close enough to the
Root servers, SPF can be employed over a limited scope (as you say,
2nd/3rd-level domains only) to combat forgeries.
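(Concretely, a receiver enforcing that limited scope might gate SPF
lookups on the depth of the claimed sender domain. A rough sketch in
Perl -- the helper name and the label-counting heuristic are mine,
nothing the SPF drafts actually specify:)

  # Consult SPF only for 2nd-level domains (3rd-level under ccTLDs).
  # Illustrative only: real TLD rules are messier than this.
  sub spf_in_scope {
      my ($domain) = @_;
      my @labels = split /\./, lc $domain;
      my $max = (length($labels[-1]) == 2) ? 3 : 2;  # crude ccTLD guess
      return @labels <= $max;
  }

  # spf_in_scope("rowman.com") is true; spf_in_scope("a.b.c.d.e") is
  # not, so its SPF records would simply never be consulted.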
But that leaves combating forgeries of emails supposedly sent from
within that limited scope (which is, for all practical purposes,
infinite in size anyway) to those who, like you, are lucky enough to be
close enough to Root (and to have enough resources to throw around) not
to be susceptible to the forms of attack I'm concerned about.

Where does this leave the rest (say, the other 95%) of us? Apparently,
with a technology (SPF) that is easily attacked and that does not
necessarily even offer us the same ability to inject emails into this
"web of trust" that involves the limited scope and Root-closeness
enjoyed by the privileged few.

Since you presumably run your own MTAs right now, you can try some
fairly simple and nondisruptive tests to gauge whether my hypothesis
might hold water.

First, do some measurements of your DNS lookup latencies on machines
sharing the pertinent DNS cache(s) with your SMTP servers, and gin up
some numbers representing latencies versus rates of incoming emails or
similar.

Second, make sure you do a (possibly useless) DNS lookup for every
incoming SMTP connection based on the envelope sender's host name (or
whatever SPF uses; I think that's the one). E.g. a TXT lookup, prepended
with "useless.DNS.lookup.", something like that.

Third, redo the first step, and compare the latency-to-rate figures to
see what effect the single additional lookup has.

Fourth, simulate a random-DNS attack. E.g. write and run a perl script
(see the sketch following these steps) to randomly invent host names and
look up each name in your DNS (again, a la SPF) at a frequency roughly
equal to your actual incoming email rates. (Or, perhaps better yet,
prepend a short random string to the domain name being looked up a la
SPF in the second step, above, and do the lookup at the same time as you
do for each incoming email, so it precisely conforms to the incoming
email rate.)

Fifth, redo the first step, and look at how your latency-to-rate figures
have changed.

Sixth, repeat the fourth and fifth steps, each time doubling the rate of
the random lookups (each time a new random name is created), until you
reach a point you believe would represent saturation of your incoming
SMTP pipes anyway.
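(Here's roughly what I have in mind for that fourth step -- a sketch
only, assuming Perl with the Net::DNS and Time::HiRes modules; the rate
and the parent domain are parameters you'd tune to your own traffic:)

  #!/usr/bin/perl
  # Simulate a random-DNS attack: TXT lookups (a la SPF) for randomly
  # invented host names, at a fixed rate. A rough sketch.
  use strict;
  use warnings;
  use Net::DNS;
  use Time::HiRes qw(sleep);

  my $rate   = shift || 1;              # lookups per second
  my $parent = shift || 'example.com';  # stand-in parent domain
  my $res    = Net::DNS::Resolver->new;

  while (1) {
      # Invent a (presumably nonexistent) name, defeating the cache.
      my $name = join('', map { ('a'..'z')[rand 26] } 1..12) . ".$parent";
      my $packet = $res->query($name, 'TXT');
      print "$name: ", ($packet ? 'answered' : $res->errorstring), "\n";
      sleep(1 / $rate);
  }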
Now, if you observe substantial drops in performance, you can see the
problem beginning to manifest itself. If you don't, then either I am
full of baloney, or you haven't reached the tipping point of
exploitation (and you might never do so, if you're close enough to the
Root servers, among other things).

--
James Craig Burley
Software Craftsperson
<http://www.jcb-sc.com>