Anecdotal example: one domain I've worked with in the past does not have very high mail flow, probably ~1-10k legitimate emails a month to recipients whose email systems participate in feedback reporting. Due to this domain's industry, it was highly coveted by spammers spoofing the domain, with reported spoofed volume on the order of 10-100x the domain's monthly legitimate mail volume.
In this case, traditional data metrics (such as the aggregate % of emails that pass/fail DMARC) become diluted by the volume of failure reports from illegitimate sources, rendering this type of information much less helpful to the domain owner.
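To make the dilution concrete, here is a rough sketch of the arithmetic. The specific figures (5,000 legitimate messages, 98% of them passing, spoofed volume at 50x) are hypothetical values picked from within the ranges above, and the function names are mine; it also assumes every spoofed message fails DMARC.

```python
def legit_pass_rate(legit_pass, legit_fail):
    """Pass rate over legitimate mail only (what the owner actually cares about)."""
    total = legit_pass + legit_fail
    return legit_pass / total if total else 0.0


def aggregate_pass_rate(legit_pass, legit_fail, spoofed):
    """Pass rate over everything reported; spoofed mail is assumed to fail DMARC."""
    total = legit_pass + legit_fail + spoofed
    return legit_pass / total if total else 0.0


# Hypothetical month: 5,000 legitimate messages, 98% of which pass DMARC.
legit_pass, legit_fail = 4900, 100
# Hypothetical spoofing volume: 50x the legitimate volume.
spoofed = 50 * (legit_pass + legit_fail)

print(f"legitimate-only pass rate: {legit_pass_rate(legit_pass, legit_fail):.1%}")
print(f"aggregate pass rate:       {aggregate_pass_rate(legit_pass, legit_fail, spoofed):.1%}")
```

With these assumed numbers the legitimate-only rate is 98.0%, while the aggregate rate drops to about 1.9%, which says almost nothing about how well the owner's own infrastructure is configured.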
This raises another question. Hypothetically speaking, let's assume an alternate implementation of the DMARC RFC requires a feedback reporter to send reports to a feedback endpoint only in the event of SPF/DKIM alignment/authentication failure, instead of both success and failure reports. With the lack of evidence (success reports) to gauge DMARC pass rates, would the inherent assumption that a domain starts at "100% DMARC compliance" be more useful for data analysis by aggregators using this data? (i.e., it is assumed a domain is already compliant, and failure reports are measured against this assumed compliance metric.)
Addressing unaligned signatures: a domain owner could use these to infer the context of a particular mail flow's handling or its origin/purpose, but this rests on the assumption that the signature selector name(s) or domain(s) are relevant to the owner's (perhaps previously unknown) uses.
One example is Microsoft 365's default DKIM signing domain (<tenant-name>.onmicrosoft.com) for tenants that have not configured DKIM. Another example might be a domain owner wanting to determine whether their mail flow is being signed correctly by particular mail infrastructure. But the absence of reporting on unaligned signatures could also be misleading; an owner would not know whether the message wasn't signed at all or the signature(s) just didn't align for some reason.
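As a rough illustration of the ambiguity, the sketch below classifies the DKIM outcome for a single message. The `DkimResult` type, the `classify_dkim` function, and the category labels are hypothetical names of mine, not anything defined by DMARC. It assumes strict alignment (an exact, case-insensitive match between the signature's d= domain and the From domain); relaxed alignment would need an organizational-domain lookup, which is left out here.

```python
from dataclasses import dataclass


@dataclass
class DkimResult:
    domain: str   # the d= value of the signature
    passed: bool  # whether the signature verified


def classify_dkim(from_domain, signatures):
    """Classify a message's DKIM outcome under strict alignment (assumption)."""
    if not signatures:
        return "unsigned"
    aligned = [s for s in signatures if s.domain.lower() == from_domain.lower()]
    if any(s.passed for s in aligned):
        return "aligned-pass"
    if aligned:
        return "aligned-fail"
    # Signed, but only by some other domain (e.g. a provider's default domain).
    return "unaligned-only"


# Hypothetical tenant name, for illustration only.
print(classify_dkim("example.com", [DkimResult("exampletenant.onmicrosoft.com", True)]))
print(classify_dkim("example.com", []))
```

If unaligned signatures are dropped from reports, the "unaligned-only" and "unsigned" cases above both look like "no aligned DKIM pass" to the domain owner, even though they point to different fixes.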
On 12/11/2022 2:21 PM, Douglas Foster wrote:
I would not want to use randomization or percentages to discard actionable data.

1) When to send reports.

An actionable result is one which says "this server sent a message without a verifiable and aligned DKIM signature". This applies because:
- Any message can be subject to forwarding, so any attempt to move toward "reject" implies a need to put DKIM signatures on every message.
- SPF results are overridden if DKIM is verified and aligned, because a perfectly formed message can be SPF FAIL if forwarded without MailFrom rewrite, or SPF Not Aligned if forwarded with MailFrom rewrite.

A report which has only DKIM PASS results can be called a "success report". It does not provide actionable data, and is therefore unnecessary. However, a system which never receives a report is at risk of undetected configuration errors, so it becomes necessary to send occasional success reports to protect against this risk. We could accomplish this with a SHOULD rule to send a success report, on a weekly basis, to X% of domains that had only success results. The success reports will also allow the domain owner to identify and correct SPF policy errors, if he has them.

2) What to include in reports

I have one reporting source that always reports a message count of 1, without regard to the number of messages that I sent and he received. This helped me realize that there is no need to report quantity. A correctly configured server will apply a correct signature on every message. Whether the problem is uniform or random, all that the domain owner needs to know is that a particular server is not signing correctly.

And as I have said before, collecting every signature adds unnecessary complication to the reporting process, while adding no value to the domain owner.
All that needs to be reported is one aligned signature, because the domain owner's server only needs to apply one aligned signature.

These changes would reduce the overhead of reporting, especially for smaller organizations where the effort is not noise level. They would also reduce the risk of unwanted data leakage.

But I am willing to be convinced. Can someone explain how success reports, message counts, or unaligned signatures serve a domain owner purpose which is relevant to DMARC?

Doug

On Thu, Dec 8, 2022 at 7:56 AM Mark Alley <mark.alley=40tekmarc....@dmarc.ietf.org> wrote:

Adding clarification since I forgot to specify: this would be per-sender per-source, not a set percentage of all mail received from a source; that obviously would not work as intended.

On 12/8/2022 6:52 AM, Mark Alley wrote:

This may have been thought of before, so forgive the potentially duplicate idea. I was musing earlier about feedback reporting based on a percent of the overall mail per-source. I'm thinking of something similar in concept to the pct= tag for published policy. This would reduce the overhead required to report from particular sources...

But as I'm typing this idea out, this seems less than feasible due to the other considerations that come to mind. If a receiver designed to report on only 10% of mail received from a source was sent 100 emails from said source, and 80 of those emails were forwards, the feedback would be overwhelmingly biased towards forwarding data, and the sender would miss out on reports from direct senders and therefore fully compliant (and arguably more useful) reports.

Evolving on this thought, if a receiver reported subset percentages of all different types of compliant/non-compliant email per-source (SPF fails/DKIM passes, SPF passes/DKIM fails... etc, etc.) this might provide the data needed while still keeping the reporting volume manageable for less internet-scale receivers.
Though, it goes without saying, this type of reporting would be woefully inadequate in terms of data availability, and only gives an idea of traffic types seen, not inclusive of all-encompassing volumetric data that could be derived normally from feedback reporters that process all emails.

On 12/8/2022 12:58 AM, Douglas Foster wrote:

1) DMARC was a successful 2-company experiment, which was turned into a widely implemented informational RFC. We are now writing the standards-track version of that concept. We hope that Standards Track will provide the basis for significantly increased adoption. This seems the appropriate time to ask whether the design can be optimized for efficiency. If you were designing from scratch, would this reporting design be the result? What alternatives have we considered and ruled out?

2) The burden of reporting is not experienced equally by all report senders. If I send a batch of messages from 1 source domain to:
- 10 target domains at Google, I will get 1 report, because Google consolidates across target domains.
- 10 target domains at Yahoo, I will get 10 reports, because Yahoo chooses to disaggregate by target domain.
- 10 target domains to Ironport clients, I will get 20 or 30 reports. These are client-specific appliances, many clients have multiple appliances configured in parallel for load balancing, and each appliance produces its own report.

Google presumably can dedicate servers to the reporting function, while the Ironport servers seem to generate reports in parallel with message processing. Altogether, I conclude that Google can absorb an increase in workload much more easily than an appliance.

3) The burden of reporting is not shared equally at present. Substantially all of my reporting comes from the three sources just stated: Google, Yahoo, and Ironport appliances. Since these organizations have not been actively participating, perhaps you are right and they are happy with the present design.
On the other hand, perhaps someone with connections should ask them whether they want to see optimizations.

4) As DMARC participation grows, the growth curve is not really linear. Currently, 40% of my mailstream is covered by DMARC reporting because more than 30% of my outbound mail goes to Google servers. Altogether, the number of reporting domains, from all sources, is somewhere around 40. To move reporting from 40% of messages to 40% of domains, the volume of reports will grow by orders of magnitude.

5) Which then raises the question of, "Who do we expect to do reporting?" Several participants in this group have expressed the conviction that everyone who benefits from DMARC should also contribute to DMARC by doing reporting. This seems fair, but it is probably not necessary. Reporting from Google alone is probably sufficient for domain owners to know whether or not their servers are properly configured. But as long as we want everyone to participate, we cannot assume that everyone will have Google's resources to contribute to the reporting task.

All of which says to me that we should be looking to optimize the reporting function to minimize the cost of participation.

Doug Foster

On Tue, Dec 6, 2022 at 10:15 PM Seth Blank <s...@sethblank.com> wrote:

I'm super unclear what you're talking about.

https://dmarc.org/2022/03/dmarc-policies-up-84-for-2021/

Aggregate reporting is used by the largest volume senders on earth, and the vast majority of mail received by mailbox providers comes with a dmarc record and reporting address attached. This is umpteen billions of messages a day that get aggregated into reports.

What are you getting at? That seems pretty internet scale to me...

Seth

On Mon, Dec 5, 2022 at 2:01 PM Douglas Foster <dougfoster.emailstanda...@gmail.com> wrote:

I began wondering if Aggregate Reporting works only because DMARC has been embraced by a small portion of domain owners.

1) Is Aggregate Reporting a significant portion of all mail?
In some cases, Yes. My organization's data:
- Inbound volume is 11 times greater than my outbound volume.
- Inbound mail has 1 new domain for every 5 messages.

Net result: If I were to do reporting, and reporting became requested for most or all domains, my outbound mail volume would triple, because my outbound report volume would be twice as large as my outbound business mail volume.

2) Is Aggregate Reporting efficient? Restating previous concerns:

"All Signature" reporting means: We keep evaluating even after successful authentication has been established, so that we can capture and store data of little actual value, even though it causes reduced aggregation and longer reports.

"No Problems found, No changes found" reporting means: We send redundant reports day after day.

"All Requesters" reporting means: We send reports even to domain owners that were blocked because of domain reputation.

A good place to start would be to extend the reporting interval for no-problem-found reports.

Doug Foster

_______________________________________________
dmarc mailing list
dmarc@ietf.org
https://www.ietf.org/mailman/listinfo/dmarc