I understand the phased-rollout goal, but phased rollout and percentages
do not apply to the evaluator's task.

I start with an assumption that message sources reflect the character of
the individual or organization that controls the source. Malicious
traffic comes from malicious people. Innocuous traffic comes from
non-malicious people. Determining the identifier that indicates
the responsible party can be a little tricky, but once that identifier
is determined, the judgment is binary.

DMARC flags possibly-suspicious senders, not possibly-suspicious messages,
and an evaluator should use it accordingly.

Assume that Example.com is at 70% rollout, which means that of their 10
sources, 7 are DKIM-signing, 2 are SPF-only, and 1 is a non-signing
ESP.

As an evaluator, I will see some Example.com senders as 100% verified and
others as 100% unverified. Applying a 70% rule will neither protect
my network from all of the malicious content nor ensure that all
of the wanted messages are delivered. The intelligent solution is to
investigate why a source is unverified. Some of the possible answers:

   - The messages are from a well-known ESP, but not signed. I trust the
   ESP's process for client enrollment and run-time login, so I don't need to
   worry about impersonation. The message is legitimate.
   - The messages fail SPF with PERMERROR, because the domain owner tried
   to update the policy but duplicated the record instead of modifying it.
   I see the current source's IP address in both records, so I don't need to
   worry about impersonation.
   - The messages have an aligned Mail From address, but they produce SPF
   Neutral. A DKIM signature is always present but can never be verified.
   The host names are from unrecognized domains. The message has odd
   content. I judge the sender to be malicious and block all traffic from
   that IP address and DNS domain.
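
To make the triage concrete, here is a rough Python sketch of those three
outcomes. It is only an illustration under my own assumptions: the Source
fields and the triage() function are hypothetical names I made up, not part
of DMARC or of any filtering product, and a real evaluator would weigh far
more evidence. Anything that matches none of the three cases falls through
to manual investigation.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Facts observed about one unverified source (all fields hypothetical)."""
    trusted_esp: bool = False            # well-known ESP whose enrollment I trust
    spf_result: str = "none"             # e.g. "pass", "neutral", "permerror"
    ip_in_all_spf_records: bool = False  # source IP appears in every duplicated record
    dkim_verifies: bool = False          # has any DKIM signature ever verified?
    hostnames_recognized: bool = True    # host names belong to domains I recognize
    odd_content: bool = False            # message content looks suspicious


def triage(src: Source) -> str:
    """Return a verdict for the whole source, not for a single message."""
    # Case 1: unsigned mail from an ESP whose client vetting I trust.
    if src.trusted_esp:
        return "accept"
    # Case 2: SPF PERMERROR caused by a duplicated record that still lists the IP.
    if src.spf_result == "permerror" and src.ip_in_all_spf_records:
        return "accept"
    # Case 3: SPF neutral, DKIM never verifies, unknown hosts, odd content.
    if (src.spf_result == "neutral" and not src.dkim_verifies
            and not src.hostnames_recognized and src.odd_content):
        return "block"
    # Anything else still needs a human to look at it.
    return "investigate"
```

The point of the sketch is that the verdict is attached to the source and is
binary once reached; no percentage appears anywhere in it.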

Of course, I should be doing this analysis, and updating my filtering
rules, while Example.com is in p=none mode, so that by the time they
try p=quarantine pct=0, I already know how to filter their current traffic
correctly.
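
As a sketch of what that monitoring-phase analysis could look like, the
Python below tallies authentication results per source. The function names,
the source labels, and the sample traffic are all invented for illustration,
and I assume each observation has already been reduced to a
(source, authenticated) pair.

```python
from collections import defaultdict


def per_source_pass_rates(observations):
    """Map each source to the fraction of its messages that authenticated.

    observations: list of (source_id, authenticated) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # source -> [passed, seen]
    for source, ok in observations:
        totals[source][0] += int(ok)
        totals[source][1] += 1
    return {s: passed / seen for s, (passed, seen) in totals.items()}


def domain_pass_rate(observations):
    """Fraction of all observed messages for the domain that authenticated."""
    passed = sum(int(ok) for _, ok in observations)
    return passed / len(observations)


# Hypothetical traffic seen by one evaluator during the p=none period.
observed = (
    [("dkim-signer", True)] * 5
    + [("spf-only", True)] * 3
    + [("unsigned-esp", False)] * 2
)
rates = per_source_pass_rates(observed)
blended = domain_pass_rate(observed)
```

With this made-up traffic, every source comes out at either 1.0 or 0.0, while
the blended domain figure sits in between and describes none of them.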

Doug Foster


On Sat, Sep 9, 2023 at 12:20 PM Murray S. Kucherawy <[email protected]>
wrote:

> I'm not looking to change the WG's mind on this matter, but:
>
> On Sat, Sep 9, 2023 at 3:54 AM Douglas Foster <
> [email protected]> wrote:
>
>> There are many percentages mixed up together in this issue:
>>
>>    - The percentage of domain message sources which provide proper
>>    authentication at origination.
>>    - The percentage of domain messages which originate with proper
>>    authentication.   This is determined by the volume distribution between
>>    sources, which is likely to be variable.
>>    - The percentage of domain messages which are received with
>>    authentication.   This will be different for each evaluator, depending on
>>    the sources from which those messages originate.  This is also affected by
>>    transit issues.
>>
>> But none of those percentages actually matter.   The one that matters to
>> the evaluator is:
>>
>>    - The conditional probability that an unauthenticated message is
>>    actually from the domain and not from an impersonator.   For this test,
>>    the denominator depends on the volume of impersonation messages, which
>>    is completely independent of the domain's message volumes.
>>
>>
> This seems to over-complicate the point.  RFC 7489 says that "pct" means:
>
>       Percentage of messages from the Domain Owner's
>       mail stream to which the DMARC policy is to be applied.
>
> It goes on to say report messages are excluded from this test.  It says
> nothing about authentication.  Thus, if I get N messages from example.com,
> and the "pct" value is X, then the DMARC test is applied only to X% of that
> N; the simplest way to do this per-message would be to pick a random number
> between 0 and 1 and if it's greater than X%, the message simply bypasses
> DMARC altogether.
>
> That's how we intended it when we wrote that, and that's how early
> implementations did it.
>
> But maybe this is the lesson: People have inferred lots of different
> things from that rather straightforward definition, so maybe it's more
> ambiguous than we realized all those years ago.
>
>> In RFC 7489, we have a domain-provided percentage whose calculation is
>> left undefined.   Whatever the calculation, the result has little relevance
>> to the evaluator's risk assessment.   It is actually harmful to advise
>> evaluators to disposition using the sender's percentage and a random number
>> generator.
>>
>
> How do you figure "harmful"?  The purpose was to enable a graduated
> rollout.  If 1-X% of those messages are outright ignored, what harm is
> being introduced?
>
> -MSK, participating
>
_______________________________________________
dmarc mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dmarc