Thanks Viktor!
> Keep in mind that polling for fresh policy (synchronous or not)
> will only happen as part of a mail delivery to the destination
> domain. A quick DNS lookup as part of each delivery works just
> fine. It is far from clear under what conditions an MTA delivering
> a message would choose to contact the HTTPS policy endpoint.
The first statement is not necessarily true. A DNS lookup is required as part
of the delivery only if the policy is not already cached. Once the policy is
in the cache, a separate process can keep the cache in sync, and direct HTTPS
calls are made only for that refresh. A separate refresh process is one
option, but the refresh can also be an async operation in the mail delivery flow.
> Refreshing all cached destinations once a day seems rather wasteful
> and needlessly slow to notice intra-day changes.
> Changes in the DNS id can be more timely and are much cheaper to
> detect than changes in the HTTPS resource. I'm reluctant to
> recommend just HTTPS polling for refresh.
One argument against relying on the DNS TTL is that TTLs can vary from minutes
to days, which affects the refresh time. Let's say we have a policy with
max-age set to 24 hours, and you want to update it. So you put a new policy at
the HTTP endpoint and diligently update the policy version id in DNS,
expecting the policy to propagate within 24 hours. If the DNS TTL is 7 days, it
can take much longer than 24 hours to propagate. In most cases the policy owner
may not even be aware of this TTL. Given that these systems are often
administered by different teams or orgs, this is one more thing the policy
owner needs to worry about.
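To put numbers on the example above, here is a rough sketch of the worst-case delay. It assumes a simplified model in which a sender treats its cached policy as current for as long as the id it sees in DNS is unchanged; the function name and parameters are made up for illustration and are not from the draft.

```python
from datetime import timedelta


def worst_case_propagation(max_age: timedelta,
                           dns_ttl: timedelta,
                           relies_on_dns_id: bool) -> timedelta:
    """Upper bound on how long a sender may keep using the old policy.

    Simplified model (an assumption, not text from the draft).
    """
    if relies_on_dns_id:
        # A resolver may keep serving the old TXT record (old id) for up to
        # dns_ttl, so the sender sees no reason to refetch until then.
        return max(max_age, dns_ttl)
    # The sender refetches over HTTPS when the cached policy reaches max-age.
    return max_age


if __name__ == "__main__":
    print(worst_case_propagation(timedelta(hours=24), timedelta(days=7), True))
    print(worst_case_propagation(timedelta(hours=24), timedelta(days=7), False))
```

Under these assumptions the 24-hour max-age with a 7-day TTL gives a bound of 7 days when the sender relies on the DNS id, and 24 hours when it refreshes directly over HTTPS.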
Here is my original writeup on this issue for reference (with a few minor
edits):
==
Use DNS TXT record for initial discovery, and from there onwards fetch policies
directly from the HTTP endpoint without relying on DNS.
Why this revision?
- We no longer have to deal with ‘id’ mismatch or sync issues.
- Remove the need to update DNS every time we change the policy. Less
implementation complexity, and one less thing to worry about for the
operations team.
- As a best practice, and to bring more control, caching and refresh should
be handled in the application layer. DNS TTLs are inconsistent and can affect
the policy refresh cycle.
- Forced policy fetch when the policy validation fails: Do we depend on DNS
TXT or fetch directly from the HTTP endpoint? At the very least, this part is
not covered in the current draft.
There are two arguments against directly hitting the HTTP endpoint: (1) it is
more expensive for senders to make HTTPS calls than DNS lookups, and (2) it
hammers the HTTPS endpoint. Let us discuss some points related to this:
- The initial policy discovery is still based on DNS TXT, so the impact on
the sender is minimal.
- Once we have a new policy and update DNS, the sender still needs to make an
HTTPS call to fetch it, which is unavoidable.
- Policy refresh: Policies are refreshed in the following two cases: (1)
max-age reached, (2) validation failures. In the first case the policy is
refreshed either just before or just after expiry. Depending on the
implementation, this can be an async job (a cron that refreshes cached
policies that are about to expire or have already expired, at regular
intervals). In the case of validation failures, we need to force a policy
fetch to make sure we are using the latest policy. Relying on DNS alone is not
safe because the policy may have changed while the DNS record has not. In such
cases the sender never knows until you update DNS. The chances of making
mistakes are high because DNS and HTTP endpoints are administered by different
teams or even different organizations. On top of that we have to deal with DNS
TTLs as well.
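The refresh behaviour described in the last point could be sketched roughly as follows. This is only an illustration under stated assumptions: `CachedPolicy`, `PolicyCache` and the injected `fetch` callable (standing in for the HTTPS fetch) are hypothetical names, and the initial DNS TXT discovery is left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CachedPolicy:
    body: str          # raw policy text as fetched
    max_age: int       # seconds, taken from the policy
    fetched_at: float  # timestamp of the fetch

    def expired(self, now: float) -> bool:
        return now >= self.fetched_at + self.max_age


class PolicyCache:
    """Refreshes straight from the policy endpoint, without consulting DNS."""

    def __init__(self, fetch: Callable[[str], CachedPolicy],
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch  # hypothetical HTTPS fetch, injected by the caller
        self._clock = clock
        self._entries: Dict[str, CachedPolicy] = {}

    def get(self, domain: str) -> CachedPolicy:
        # Case 1: max-age reached (or nothing cached yet) -> fetch.
        entry = self._entries.get(domain)
        if entry is None or entry.expired(self._clock()):
            entry = self._refresh(domain)
        return entry

    def on_validation_failure(self, domain: str) -> CachedPolicy:
        # Case 2: validation failed -> force a fetch regardless of max-age.
        return self._refresh(domain)

    def refresh_expiring(self, within: float) -> List[str]:
        # Cron-style job: refresh entries that have expired or will expire
        # within `within` seconds; returns the domains it refreshed.
        now = self._clock()
        refreshed = []
        for domain, entry in list(self._entries.items()):
            if entry.expired(now + within):
                self._refresh(domain)
                refreshed.append(domain)
        return refreshed

    def _refresh(self, domain: str) -> CachedPolicy:
        entry = self._fetch(domain)
        self._entries[domain] = entry
        return entry
```

A delivery path would call `get()` per destination domain and `on_validation_failure()` when validation against the cached policy fails, while `refresh_expiring()` is what a periodic job would run.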
One argument against direct HTTPS is that it hammers the HTTPS endpoint. I
think that is not entirely true. If the policy is changed, the recipient must get a new
policy. Whether we use DNS or not, ultimately we need to hit HTTPS. In the case
of initial discovery, we still use DNS to discover the existence of a policy
for the recipient domain. I would say relying on DNS beyond initial discovery
is unnecessary.
The main benefit of a DNS record lookup as part of mail delivery is the ability
to discover a policy change well before max-age (by polling DNS
frequently). Yes, this avoids hammering the HTTP endpoint. But do we need to
worry about checking the policy whenever we send email? My opinion is that
honoring max-age is good enough to uphold the policy contract between sender
and recipient. Ideally the policy should be refreshed directly from the HTTP
endpoint at regular intervals. The refresh interval may range from a couple of
times a day to a couple of times a week. If required, we can introduce an
additional field, ‘refresh interval’, in the policy. Note that senders discover
a policy change almost instantaneously when policy validation fails. However,
this leads to a 'thundering herd' problem, a side effect of a forced refresh.
Rolling a new policy: Updating a policy leads to validation failures on the
sender’s side, which forces a policy refresh. There will be a period (up to
max-age) where we expect to see reports based on both old and new policies. If
a forced refresh fails, the sender may move mail to a retry queue and send it
later (after max-age?).
(low priority) One way to handle this overlap is to publish the new policy
upfront and distribute it in the same file. We also need to state that the new
policy will be effective from a given UTC date and time onwards. That means we
need to start supporting ‘not-before’ and ‘not-after’ date-time fields that
represent absolute dates and times (in contrast to max-age).
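As a rough illustration of how a sender might pick the effective policy from a file carrying both old and new policies, here is a sketch. The `DatedPolicy` type, the field names, and the rule that the most recently started policy wins during an overlap are all assumptions for illustration; none of this is in the current draft.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class DatedPolicy:
    name: str
    not_before: datetime  # absolute UTC time the policy becomes effective
    not_after: datetime   # absolute UTC time the policy stops being effective


def effective_policy(policies: List[DatedPolicy],
                     now: datetime) -> Optional[DatedPolicy]:
    """Return the policy in force at `now`, or None if none applies."""
    candidates = [p for p in policies if p.not_before <= now < p.not_after]
    if not candidates:
        return None
    # Assumed tie-break: when validity windows overlap, prefer the policy
    # that became effective most recently.
    return max(candidates, key=lambda p: p.not_before)


if __name__ == "__main__":
    utc = timezone.utc
    old = DatedPolicy("old", datetime(2016, 8, 1, tzinfo=utc),
                      datetime(2016, 9, 1, tzinfo=utc))
    new = DatedPolicy("new", datetime(2016, 8, 15, tzinfo=utc),
                      datetime(2016, 10, 1, tzinfo=utc))
    print(effective_policy([old, new], datetime(2016, 8, 20, tzinfo=utc)))
```

With both policies published ahead of time, every sender switches at the same absolute time instead of at a point that depends on when its cached copy expires.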
As a side note, we also need to send an alert whenever a sender discovers a
policy-related change, such as: (a) a policy change, (b) policy removal/policy
not found, but previously cached, (c) discovery of a new policy. This should be
a separate report, different from our regular violation report, basically to
report control plane issues.
==
thanks,
-binu
From: Viktor Dukhovni <[email protected]>
To: [email protected]
Sent: Thursday, 11 August 2016 4:17 PM
Subject: Re: [Uta] review of mta-sts-01
On Thu, Aug 11, 2016 at 08:19:06PM +0000, Binu Ramakrishnan wrote:
> We appreciate your time and effort reviewing our draft. Lately we had some
> discussions related to policy cache and refresh in GitHub. One proposal
> was not to depend on DNS beyond initial discovery. We have some flow
> diagrams (#72) in the below links that provide some insights to what I'm
> referring to.
> https://github.com/mrisher/smtp-sts/issues/62
Keep in mind that polling for fresh policy (synchronous or not)
will only happen as part of a mail delivery to the destination
domain. A quick DNS lookup as part of each delivery works just
fine. It is far from clear under what conditions an MTA delivering
a message would choose to contact the HTTPS policy endpoint.
Refreshing all cached destinations once a day seems rather wasteful
and needlessly slow to notice intra-day changes.
Changes in the DNS id can be more timely and are much cheaper to
detect than changes in the HTTPS resource. I'm reluctant to
recommend just HTTPS polling for refresh.
--
Viktor.
_______________________________________________
Uta mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/uta