Re: [Uta] Updated MTA-STS & TLSRPT

Brotman, Alexander Mon, 09 Oct 2017 11:24:54 -0700

Do we think we really need to allow caching?  From what are we trying to 
protect the backend systems?  Feels like it would be easier to disallow caching 
(or recommend against).  I understand the sense that a smaller company could be 
overwhelmed by an attacker or ill-configured legit sender, but I don’t know how 
much caching will really help there.  The caching server would be similarly 
overwhelmed I would imagine, and be unable to serve policies to any other 
requesting systems.

--
Alex Brotman
Sr. Engineer, Anti-Abuse
Comcast

From: Uta [mailto:[email protected]] On Behalf Of Daniel Margolis
Sent: Thursday, October 05, 2017 3:53 AM
To: [email protected]
Subject: Re: [Uta] Updated MTA-STS & TLSRPT

Well, so this isn't that big a change or that weird, and I think we can mostly 
reconcile here. :)

In section 3.3, we already say this:

   Senders may wish to rate-limit the frequency of attempts to fetch the
   HTTPS endpoint even if a valid TXT record for the recipient domain
   exists.  In the case that the HTTPS GET fails, we suggest
   implementations may limit further attempts to a period of five minutes
   or longer per version ID, to avoid overwhelming resource-constrained
   recipients with cascading failures.

So we are already opening the door to short-term caching-*like* behavior in 
some cases, for basically the reasons you describe (throttling, though in the 
section 3.3 case, to avoid repeatedly hitting an endpoint in case of 
*failure*).
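
A rough sketch of the section 3.3 throttling, for illustration only. The 
FetchThrottle class and its method names are hypothetical, not from the 
draft; the only value taken from the text is the five-minute floor, applied 
per (domain, version ID) pair after a failed GET.

```python
import time

# "five minutes or longer per version ID", per the section 3.3 text quoted above
RETRY_AFTER_FAILURE_SECONDS = 300


class FetchThrottle:
    """Hypothetical helper: suppresses policy re-fetches after a failure."""

    def __init__(self, min_interval=RETRY_AFTER_FAILURE_SECONDS, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock
        self._last_failure = {}  # (domain, txt_id) -> time of last failed fetch

    def may_fetch(self, domain, txt_id):
        """True unless a fetch for this (domain, txt_id) failed too recently."""
        failed_at = self._last_failure.get((domain, txt_id))
        if failed_at is None:
            return True
        return self._clock() - failed_at >= self._min_interval

    def record_failure(self, domain, txt_id):
        self._last_failure[(domain, txt_id)] = self._clock()

    def record_success(self, domain, txt_id):
        self._last_failure.pop((domain, txt_id), None)
```

Keying on the version ID means a newly published TXT id is fetched right 
away even while the previous id is still being throttled.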

First, we all agree MTAs should not use HTTP caching, since it's redundant and 
confusing.

If we want to allow intermediate proxies to use HTTP caching, they should do so 
only if the cache-control headers allow it, or else their behavior will be 
opaque to the real server.

The real server, if it allows HTTP caching, should probably not allow caching 
for any meaningful period of time--say, no longer than 1 minute--and should 
probably in practice make the cache lifetime much shorter than the DNS TTL, 
since the caching then becomes mostly immaterial. (I'm fudging a bit here, but it makes 
operational considerations easier: the HTTP cache lifetime is fetch-time + 
cache lifetime, whereas the DNS cache lifetime is resolve-time + TTL; since 
resolve-time and fetch-time are quite close together, making the DNS TTL longer 
than the HTTP cache lifetime removes the need to seriously consider the HTTP 
caching.)

As Leif said, what's the concrete language here?

"HTTP caching MAY be used by reverse proxies if allowed by the Cache-Control 
headers on the HTTPS endpoint. Hosts who are serving a policy that is delegated 
to by other domains SHOULD limit their cache lifetimes to values under one 
minute; further, when rotating deployed policies, they should consider that the 
new HTTP policy may not be visible to MTAs fetching the delegator domain's 
policy until the HTTP cache lifetime has expired."

Something like that?
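
To make the proposed language concrete, here is a minimal sketch of the check 
a reverse proxy might apply. The function names and the 60-second limit 
parameter are assumptions for illustration; the parsing is simplified (it 
ignores s-maxage and treats no-store, no-cache and private as "do not cache").

```python
def max_age_seconds(cache_control):
    """Return max-age from a Cache-Control header value, or None.

    None means the header gives no usable permission to cache.
    """
    max_age = None
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        name = name.lower()
        if name in ("no-store", "no-cache", "private"):
            return None
        if name == "max-age":
            try:
                max_age = int(value.strip('"'))
            except ValueError:
                return None
    return max_age


def proxy_may_cache(cache_control, limit_seconds=60):
    """True if the origin allows caching for a lifetime under the limit."""
    age = max_age_seconds(cache_control)
    return age is not None and 0 < age < limit_seconds
```

For example, proxy_may_cache("max-age=30") is true, while a header of 
"max-age=3600" or "no-cache" leaves the proxy fetching from the origin.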

On Thu, Oct 5, 2017 at 9:40 AM, Viktor Dukhovni 
<[email protected]<mailto:[email protected]>> wrote:
On Wed, Oct 04, 2017 at 05:39:14PM +0200, Daniel Margolis wrote:

> > So I think that cache control is simply not applicable to the MTA, and
> > there's no need to "prohibit" it as such.
>
> I mean, I agree that it's unlikely an MTA would *want* to do this, but I
> think it's useful that we (already have) said "no honoring
> cache-control".

Agreed, we should keep on saying "no honoring cache-control" for
the MTA, however redundant that may be in practice.

> Are you suggesting we just say that cache-control can be honored up to a
> value of 60s? I think that's fine.

I was thinking that larger values should generally not be published,
but I am open to some client-side (reverse-proxy) limits if you think
that's appropriate.

> But note (as I said above) that caching
> that is less than the max-age can still be problematic; it means that
> someone who sees updated DNS but an old cached HTTP endpoint sees the old
> policy. So the interplay between HTTP caching and DNS TTLs is weird, no?

Good point, after updating the HTTPS policy, one MUST wait at least
the duration of the cache-ttl, before making visible DNS changes.
The same also applies when HTTPS changes are made in some content
management system and take a bit of time to propagate out to the
entire server farm.  The idea is to ensure that the most recent
visible change in the DNS id occurs more recently than the most
recent visible change in the underlying policy.
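
The ordering rule above can be written as simple arithmetic. This is a 
sketch under an assumption that goes slightly beyond the message: the 
propagation delay and the cache lifetime are added together, on the 
conservative reasoning that a cache might fetch the old policy from a 
not-yet-updated server just before propagation finishes. The function name 
and parameters are hypothetical.

```python
def earliest_txt_update(policy_published_at, http_cache_ttl, propagation_delay=0.0):
    """Earliest time at which the new TXT id may safely become visible.

    All arguments share one time unit (for example, seconds since the epoch).
    """
    if http_cache_ttl < 0 or propagation_delay < 0:
        raise ValueError("durations must be non-negative")
    # Assumption: worst case is a cache refilled with the old policy at the
    # very end of the propagation window, then held for the full cache TTL.
    return policy_published_at + propagation_delay + http_cache_ttl
```

With a 60-second cache lifetime and no propagation delay, a policy published 
at t=1000 allows the TXT id to change no earlier than t=1060.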

> > The explicit presence of "max_age" makes possible and *invites* the
> > possibility
> > of using a shorter cache-control lifetime for use-cases like reverse
> > proxies.
> >
>
> OK. So to be clear, is your suggestion just to allow cache-control headers *as
> long as the cache lifetime is less than the policy max_age*? Or up to a
> value of 60s? Or something else?

Something on the order of 60s feels about right to me, it could be
even shorter.  Basically, anything that allows a reverse proxy to
consolidate a high-volume stream of closely-spaced requests is
useful, and once it is only doing one upstream request every few
seconds, most of the gain is achieved.  There's little benefit to
pushing it to one upstream request every ten minutes or every hour.

So I guess I'd like to see providers offer a cache-ttl of at least
5s and at most 60s, with the latter limit recommended to be enforced
by the reverse proxy as an upper bound.  The proxy MUST start the
clock from when it *initiates* the upstream connection, not when
it receives the payload, as network delays could otherwise lead to
serving stale data for fresh requests.
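
A minimal sketch of such a proxy-side cache, assuming a hypothetical fetch 
callable that returns the policy body together with the cache lifetime the 
origin published. The 60-second cap is the upper bound suggested above; the 
class and function names are invented for illustration.

```python
import time

MAX_TTL = 60.0  # upper bound the proxy enforces, per the suggestion above


def clamp_ttl(published_ttl):
    """Cap the origin's published cache lifetime at MAX_TTL seconds."""
    return max(0.0, min(published_ttl, MAX_TTL))


class PolicyCache:
    """Hypothetical reverse-proxy cache for policy responses."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # assumed callable: url -> (body, published_ttl_seconds)
        self._clock = clock
        self._entries = {}     # url -> (body, expires_at)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is not None and self._clock() < entry[1]:
            return entry[0]
        # Start the clock when the upstream request is initiated, not when
        # the payload arrives, so network delay cannot extend the lifetime.
        started = self._clock()
        body, ttl = self._fetch(url)
        self._entries[url] = (body, started + clamp_ttl(ttl))
        return body
```

If the upstream fetch takes ten seconds, the entry still expires sixty 
seconds after the request was started, not seventy.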

A related issue arises for MTAs: if two separate threads are doing
policy retrieval, it would be bad if policy data obtained by one
"thread" that took a long time to arrive displaced more recent data
obtained by another "thread".  And worse if the MTA fails to reliably
associate the TXT id that caused each thread to run with the policy
retrieved by that thread.  That is, it would be bad to squirrel
away the latest observed DNS TXT id in some shared state and then
separately obtain a policy, and at that time associate the policy
with a TXT id that may be a later one obtained by some other thread.

There needs to be a proper causal ordering of policy data and TXT
ids, where an MTA never ends up associating a TXT id with a policy
that was already replaced when the TXT id was published.
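
One way to sketch that ordering on the MTA side: each worker carries its own 
TXT id through the whole retrieval and hands the store the id, the policy, 
and the time its lookup began, so a slow fetch cannot displace a newer 
result. The PolicyStore class is hypothetical, and ordering by start time is 
an assumption made for this sketch, not something the draft specifies.

```python
import threading


class PolicyStore:
    """Hypothetical shared store keeping (TXT id, policy) pairs together."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_domain = {}  # domain -> (started_at, txt_id, policy)

    def update(self, domain, txt_id, policy, started_at):
        """Record one worker's result; returns True if it was accepted.

        txt_id must be the id that this worker resolved before it fetched
        the policy, never an id read later from shared state.
        """
        with self._lock:
            current = self._by_domain.get(domain)
            if current is not None and current[0] >= started_at:
                return False  # a retrieval that started later already landed
            self._by_domain[domain] = (started_at, txt_id, policy)
            return True

    def lookup(self, domain):
        """Return (txt_id, policy) for the domain, or None if unknown."""
        with self._lock:
            entry = self._by_domain.get(domain)
            return None if entry is None else (entry[1], entry[2])
```

Because the id and the policy are written in a single locked step, a reader 
never sees one worker's id paired with another worker's policy.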

> I think there is some oddity around cache lifetimes greater than the DNS
> TTL, though...should we worry about that? Or just advise against it on the
> grounds that it can result in confusion, but leave it up to deployers if
> they wish to do it?

I don't think that's an issue, provided the TXT id is never visible
before the new policy is in place, and the old has been flushed
from any HTTP caches.  When TXT TTL expires, the client will just
go fetch the latest.  Seeing a slightly stale TXT is never a problem;
what could be a problem is seeing a stale policy in association with
a fresh TXT.

--
        Viktor.

_______________________________________________
Uta mailing list
[email protected]<mailto:[email protected]>
https://www.ietf.org/mailman/listinfo/uta

