Re: [Uta] review of mta-sts-01

Binu Ramakrishnan Thu, 11 Aug 2016 17:03:07 -0700

Thanks Viktor!
> Keep in mind that polling for fresh policy (synchronous or not)
> will only happen as part of a mail delivery to the destination
> domain.  A quick DNS lookup as part of each delivery works just
> fine.  It is far from clear under what conditions an MTA delivering
> a message would choose to contact the HTTPS policy endpoint.

The first statement is not necessarily true. A DNS lookup is required (as part
of the delivery) only if the policy is not already cached. Once the policy is
in the cache, a separate process can keep the cache in sync, and a direct HTTPS
call is made only for that refresh. A separate refresh process is one option,
but the refresh can also be an async operation in the mail delivery flow.

> Refreshing all cached destinations once a day seems rather wasteful
> and needlessly slow to notice intra-day changes.
>
> Changes in the DNS id can be more timely and are much cheaper to
> detect than changes in the HTTPS resource.  I'm reluctant to
> recommend just HTTPS polling for refresh.

One argument against relying on the DNS TTL is that TTLs can vary from minutes
to days, which affects the refresh time. Let's say we have a policy with
max-age set to 24 hours, and you want to update it. You put the new policy at
the HTTP endpoint and diligently update the policy version id in DNS,
expecting the change to propagate within 24 hours. If the DNS TTL is 7 days, it
takes much longer than 24 hours to propagate. In most cases the policy owner
may not even be aware of this TTL. Given that these systems are often
administered by different teams or organizations, this is one more thing the
policy owner needs to worry about.
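
The arithmetic behind that example can be sketched as follows. This is only
an illustration under a simplifying assumption: a DNS-gated sender refetches
only after it sees a new id in DNS, and a resolver may serve the old TXT
record for a full TTL. The function name and parameters are hypothetical:

```python
from datetime import timedelta


def worst_case_delay(
    max_age: timedelta, dns_ttl: timedelta, dns_gated: bool
) -> timedelta:
    """Upper bound on how long a sender may keep using the old policy."""
    if dns_gated:
        # The sender waits for the new id, which can lag by a full DNS TTL.
        return dns_ttl
    # The sender refetches over HTTPS once the cached policy expires.
    return max_age


day = timedelta(hours=24)
week = timedelta(days=7)
print(worst_case_delay(day, week, dns_gated=True))
print(worst_case_delay(day, week, dns_gated=False))
```

Under these assumptions the DNS-gated sender can lag by up to 7 days, while
the max-age-driven sender lags by at most 24 hours.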

Here is my original writeup on this issue for reference (with a few minor
edits):

==
Use the DNS TXT record for initial discovery, and from then on fetch policies
directly from the HTTP endpoint without relying on DNS.

Why this revision?
   - We no longer have to deal with ‘id’ mismatch or sync issues.
   - It removes the need to update DNS every time we change the policy: less
implementation complexity, and one less thing for the operations team to
worry about.
   - As a best practice, and for more control, caching and refresh should be
handled in the application layer. DNS TTLs are inconsistent and can affect the
policy refresh cycle.
   - Forced policy fetch when policy validation fails: do we depend on the DNS
TXT record or fetch directly from the HTTP endpoint? At the least, this part
is not covered in the current draft.

There are two arguments against directly hitting the HTTP endpoint: (1) it is
more expensive for senders to make calls to HTTPS than to DNS, and (2) it
hammers the HTTPS endpoint. Let us discuss some points related to this:

   - The initial policy discovery is still based on DNS TXT, so the impact on
the sender is minimal.
   - Once we have a policy and update DNS, an HTTPS call to fetch the policy
is unavoidable.
   - Policy refresh: policies are refreshed in the following two cases: (1)
max-age reached, (2) validation failures. For max-age, the policy is re-cached
either just before or just after expiry. Depending on the implementation, this
can be an async job (a cron that refreshes cached policies that are about to
expire or have already expired, at regular intervals). In the case of
validation failures, we need to force a policy fetch to make sure we are using
the latest policy. Relying on DNS alone is not safe, because the policy may
have changed while the DNS record has not. In such cases the sender never
knows until the DNS record is updated. The chance of mistakes is high because
DNS and HTTP endpoints are often administered by different teams or even
different organizations. On top of that, we have to deal with DNS TTLs as well.
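
The two refresh triggers in the last bullet can be sketched as a pair of
small functions. This is an illustration only: the CacheEntry type, the
one-hour lead time before expiry, and both function names are assumptions,
not part of the draft:

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class CacheEntry:
    domain: str
    fetched_at: float  # seconds since the epoch
    max_age: int       # seconds


def needs_refresh(
    entry: CacheEntry,
    now: float,
    validation_failed: bool = False,
    lead_seconds: float = 3600.0,
) -> bool:
    """True on a validation failure, or when max-age is (nearly) reached."""
    if validation_failed:
        return True  # forced fetch, regardless of the entry's age
    return now >= entry.fetched_at + entry.max_age - lead_seconds


def due_for_refresh(
    entries: Iterable[CacheEntry], now: float, lead_seconds: float = 3600.0
) -> List[str]:
    """Domains a periodic (cron-style) job would refetch over HTTPS."""
    return [
        e.domain
        for e in entries
        if needs_refresh(e, now, lead_seconds=lead_seconds)
    ]
```

A periodic job would call due_for_refresh() over the whole cache, while the
delivery path would call needs_refresh(..., validation_failed=True) for the
single domain whose validation just failed.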

One argument against direct HTTPS is that it hammers the HTTPS endpoint. I
think that is not entirely true. If the policy is changed, the new policy must
be fetched. Whether we use DNS or not, ultimately we need to hit HTTPS. In the
case of initial discovery, we still use DNS to discover the existence of a
policy for the recipient domain. I would say relying on DNS beyond initial
discovery is unnecessary.

The main benefit of a DNS record lookup as part of mail delivery is the
ability to discover a policy change well before max-age expires (by polling
DNS frequently). Yes, this avoids hammering the HTTP endpoint. But do we need
to check the policy every time we send email? My opinion is that honoring
max-age is good enough to uphold the policy contract between sender and
recipient. Ideally the policy should be refreshed directly from the HTTP
endpoint at regular intervals. The refresh interval may range from a couple of
times a day to a couple of times a week. If required, we can introduce an
additional field, ‘refresh interval’, in the policy. Note that senders also
discover a policy change almost instantaneously when policy validation fails.
However, this leads to a 'thundering herd' problem, a side effect of a forced
refresh.

Rolling a new policy:
Updating a policy leads to validation failures on the sender's side, which
forces a policy refresh. There will be a period (up to max-age) during which
we expect to see reports based on both the old and the new policy. If a forced
refresh fails, the sender may move mail to a retry queue and send it later
(after max-age?).

(low priority) One way to handle this overlap is to publish the new policy in
advance and distribute it in the same file. We also need to state that the new
policy takes effect from a given UTC date and time onwards. That means we need
to start supporting ‘not-before’ and ‘not-after’ date-time fields that
represent absolute dates and times (in contrast to max-age).
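
A rough sketch of how a sender might pick the effective policy from such a
file, assuming each policy carries absolute UTC ‘not-before’ and ‘not-after’
values. The DatedPolicy type, its field names, and the tie-break rule (the
most recent not-before wins during overlap) are assumptions for illustration:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class DatedPolicy:
    version: str
    not_before: datetime  # absolute UTC time the policy becomes effective
    not_after: datetime   # absolute UTC time the policy stops applying


def effective_policy(
    policies: List[DatedPolicy], at: datetime
) -> Optional[DatedPolicy]:
    """Return the policy in force at `at`, or None if none applies."""
    candidates = [p for p in policies if p.not_before <= at < p.not_after]
    if not candidates:
        return None
    # Assumed tie-break: when validity windows overlap, the newer one wins.
    return max(candidates, key=lambda p: p.not_before)


utc = timezone.utc
old = DatedPolicy(
    "old", datetime(2016, 8, 1, tzinfo=utc), datetime(2016, 9, 1, tzinfo=utc)
)
new = DatedPolicy(
    "new", datetime(2016, 8, 15, tzinfo=utc), datetime(2016, 12, 1, tzinfo=utc)
)
current = effective_policy([old, new], datetime(2016, 8, 20, tzinfo=utc))
```

Here both policies ship in the same file, and every sender switches from
"old" to "new" at the same absolute time instead of at its own cache expiry.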

As a side note, we also need an alert whenever a sender discovers a
policy-related change, such as: (a) a policy change, (b) policy removal (policy
not found, but previously cached), (c) discovery of a new policy. This should
be a separate report, distinct from our regular violation report, meant for
reporting control-plane issues.
==

thanks,
-binu

      From: Viktor Dukhovni <[email protected]>
 To: [email protected] 
 Sent: Thursday, 11 August 2016 4:17 PM
 Subject: Re: [Uta] review of mta-sts-01

On Thu, Aug 11, 2016 at 08:19:06PM +0000, Binu Ramakrishnan wrote:

> We appreciate your time and effort reviewing our draft. Lately we had some
> discussions related to policy cache and refresh in GitHub. One proposal
> was not to depend on DNS beyond initial discovery. We have some flow
> diagrams (#72) in the below links that provide some insights to what I'm
> referring to. 
> https://github.com/mrisher/smtp-sts/issues/62

Keep in mind that polling for fresh policy (synchronous or not)
will only happen as part of a mail delivery to the destination
domain.  A quick DNS lookup as part of each delivery works just
fine.  It is far from clear under what conditions an MTA delivering
a message would choose to contact the HTTPS policy endpoint.

Refreshing all cached destinations once a day seems rather wasteful
and needlessly slow to notice intra-day changes.

Changes in the DNS id can be more timely and are much cheaper to
detect than changes in the HTTPS resource.  I'm reluctant to
recommend just HTTPS polling for refresh.

-- 
    Viktor.

_______________________________________________
Uta mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/uta
