[Mailman-Developers] Re: Verifying broken DKIM signatures

Alessandro Vesely Mon, 28 Sep 2020 01:24:04 -0700

Hi Steve,

your observations put me on the right track.  Thank you so much!


Long post below:

On Thu 24/Sep/2020 12:31:29 +0200 Stephen J. Turnbull wrote:

Alessandro Vesely writes:

First, what Mailman are you talking about?  Only Mailman 3 is likely to get
these improvements, as Mailman 2 is end-of-life.  However, Mailman 2
installations are likely to be around in large numbers for several years,
and if Mailman 2 is any evidence, likely few Mailman 3 installations would
use these features unless forced to by a disaster like the Yahoo/AOL sudden
switch to DMARC p=reject.

Reversing transformation should work with any Mailman version, and with other mailing list managers as well. Hence, one cannot rely on some precise indications, but rather on common MLM behavior.

Yet, it is possible to undo the transformation that Mailman put in place,
thereby validating the original DKIM signature.

It would always be possible to undo all transformations by supplying the
original email as part of a multipart/alternative, or perhaps a new
multipart subtype, maybe with some kind of device to make reading the
message/rfc822 original difficult in standard MUAs.

Comparing to the original can make it more difficult to check that the transformed version does not deviate from the original in unacceptable ways. While limiting the cases where original signatures can be recovered, a set of accepted transformations also limits the attack surface.

(In the case of Microsoft MUAs, if Mailman is configured to strip HTML, the
result might be less than 10% bigger than the original! ;-)


Even if plain text is safer, stripping HTML is irreversible.

It is sometimes possible to reverse transformations with only the information in the post after Mailman processing. However, some very desirable changes are destructive (eg, anonymized lists, conversion of HTML
to plain text, removal of prohibitive attachments).  Some non-destructive
changes (headers and footers) are highly customizable. So the question is
what are the transformations that users want to reverse, and whether that's
really possible.

It has to be the responsibility of the list owner to configure Mailman so that the transformations can be reversed. Some options, like anonymized lists, are clearly at odds with the need to recover the original author domain's signature.

This kind of transformation reversal probably requires no changes to Mailman, just an addition of a Handler which could be written independently
and "dropped in" (with a configuration change to the default pipeline).

On the other hand, a new header field can be abused. List-*, for example, are often found in a number of messages which don't come from mailing lists, but just aim at not being classified as UCE.

The necessary information about transformations that are configured would be
available from Mailman in the usual way (existing Handlers need that
information).


There could be more info than just the configuration, see the heuristic below.

Mailman carries out some irreversible changes, such as rewriting
To: or Cc: changing the order of the mailboxes,


Does this happen outside of DMARC mitigation?  Can you show examples?

I checked a few messages and couldn't find a switched To:. Switched Cc: seems to happen when one of the recipients is the list itself, which is then moved to the last place. (I try to reproduce that behavior with this message.)

or rewriting Content-Transfer-Encoding: irrespective of quotation marks
and case (for example "7bit" even if the original, signed field was
spelled as "7Bit").

I'm not aware of such behavior *unless other modifications were done*. In
that case, Mailman is specifying the C-T-E it uses, it is not rewriting the
original C-T-E.

I don't know if it's Mailman or a DKIM signing tool running afterwards, but many plain text messages from mailing lists come rendered as base64. Since the footer is part of the only MIME entity, the reversal has to decode base64, remove the footer, and re-encode as base64 if that was the original C-T-E. In the latter case, one needs to know the column width of the original encoding...
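
To make the base64 round trip concrete, here is a minimal Python sketch. It assumes the footer bytes are already known and that the author domain encoded the body with a fixed column width and CRLF line endings; the function names and the default of 76 columns are mine, chosen for illustration only.

```python
import base64


def encode_base64(data: bytes, column_width: int = 76) -> bytes:
    """Base64-encode data, wrapping lines at column_width with CRLF endings."""
    raw = base64.b64encode(data).decode("ascii")
    lines = [raw[i:i + column_width] for i in range(0, len(raw), column_width)]
    return "".join(line + "\r\n" for line in lines).encode("ascii")


def strip_footer_base64(body: bytes, footer: bytes, column_width: int = 76) -> bytes:
    """Decode a base64 body, drop a known trailing footer, and re-encode it.

    column_width must match the width used by the original signer
    (an assumption here), otherwise the body hash cannot match.
    """
    decoded = base64.b64decode(body)  # line breaks are discarded by default
    if not footer or not decoded.endswith(footer):
        raise ValueError("footer not found at the end of the decoded body")
    return encode_base64(decoded[:-len(footer)], column_width)
```

Re-encoding with the wrong width yields different bytes, hence a different body hash, which is why the width has to be known or guessed by trial.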

I guess this behavior is coded deeply in Python libraries,


I don't think so.  As far as I know, the email module in Python 3 provides
some support for parsing header fields but I don't know why this would
change order or spelling of field contents.

Aha, you're right. It is probably Mailman writing its own C-T-E, which happens to be the same as the original, albeit spelt differently.

I would guess that to the extent that it happens it has to do with Mailman-level processing (for example, collecting addresses from the same domain so they can be presented as multiple RCPT TO with a single DATA).

No, the MTA doesn't care about header addresses. Domain optimization has to be done after MX lookup.

I can say for sure that some care was taken to ensure that the order of header fields, including multiple instances of the same field is carefully
preserved.

Good news. In any case, DKIM header canonicalization has to be "relaxed", because fields are re-wrapped.

but would like to know developers' opinions.  Is that something that could
be fixed?

First, the issues with headers could be improved, though not entirely fixed,
in DKIM itself by further canonicalizing structured headers before signing
or verifying.

I'm not saying that this is the right way forward, but it should be considered.

There have been various proposals about a MIME-compatible DKIM. It's not going to happen any time soon. There's not enough traction.

The second question is about producing a hint to the verifier telling
which transformation(s) have been applied to the message.  That would come
as an additional header field, for example:
     DKIM-Transform: footer


This could be done easily, but it would be at best a hint.  Among other
things, it might be desirable to identify the agent that performs the
transformation, as well as the algorithm and perhaps the host and/or the
list.  Mailman adds footers in different ways, specifically appending text
and adding a MIME part.  Third party patches are available that dig into
HTML structure (at least for Mailman 2).

DKIM-Transform would ease the reversing filter's job greatly. I wrote my prototype relying on it, then I wrote a bash script to add the hint. The script was easy only because it /knew/ that a footer was added one way or the other.

There are lists that feed into lists, and apply their own transformations.

IMHO, transformation reversing must not be stackable. That is, no attempt to recover a middle mediator's signature. If multiple footers or multiple subject tags are added, reversibility is lost.

or as an extra tag in a DKIM signature, for example:

     DKIM-Signature: v=1; (...) tf=footer; (...)


Not possible without a lot of effort and specific cooperation from MTAs.
Mailman doesn't DKIM sign messages, really doesn't want to (there are Python
modules for this, but use and configuration would be our responsibility so
we'd like to have specialists do it), and probably shouldn't (we're not
specialists) -- that should be left to the border MTA of the administrative
domain.


Thank you for confirming that.

That hint could spare the verifier one pass over the message.  Is it
something  that could be implemented?  If not, I'd try guessing, according
to this scheme:

You're going to have to guess a lot for a long time anyway, because very few
installations will implement this header.  It's not obvious to me that
guessing won't be nearly as accurate as the header might be.


That's a key observation!

outermost Content-Type: |  first entity Content-Type: |  transformation |
------------------------+-----------------------------+-----------------+
text/plain              |   any                       |  footer         |
------------------------+-----------------------------+-----------------+
multipart/mixed         |   multipart/mixed           |  add-part       |
------------------------+-----------------------------+-----------------+
multipart/mixed         |   any other                 |  mime-wrap      |
------------------------+-----------------------------+-----------------+
any other               |   any                       |  non-reversible |
------------------------+-----------------------------+-----------------+

Does that look correct?


Not 100%.  I'm not sure what you mean by "mime-wrap", but if it's
Mailman's "Wrap Message" DMARC mitigation, as far as I know nobody
uses it.



No, those are in the initial set of transformations in the draft I cited.

*mime-wrap* is when the original message was, say, multipart/alternative, as in an HTML message with plain text equivalent. In that case, Mailman creates a new overall part with two entities. The first is the original body of the message, the second the added footer.

*add-part* is when the original message was already multipart/mixed, as in the case of an attachment. Mailman keeps the existing structure and adds a part at the bottom, with the footer.

The mime-wrap case is so easy to reverse that it took me a good deal of time to realize that I cannot take advantage of it.

So, the kind of transformation says a bit more than the list configuration. How a footer is added depends on the message at hand.
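
For illustration, the guessing table above could be sketched in Python roughly as follows. This only transcribes the table literally, using the standard email package; the function name is mine, and as noted a guessed multipart/mixed may turn out not to carry a footer at all.

```python
from email.message import Message
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def guess_transformation(msg: Message) -> str:
    """Guess the footer transformation from the outermost and first Content-Type."""
    outer = msg.get_content_type()
    if outer == "text/plain":
        return "footer"
    if outer == "multipart/mixed":
        parts = msg.get_payload()
        if isinstance(parts, list) and parts:
            first = parts[0].get_content_type()
            return "add-part" if first == "multipart/mixed" else "mime-wrap"
    return "non-reversible"


# Example: an HTML+text message wrapped together with a footer part.
wrapped = MIMEMultipart("mixed")
wrapped.attach(MIMEMultipart("alternative"))
wrapped.attach(MIMEText("____\nlist footer\n"))
print(guess_transformation(wrapped))
```

The guess is only a starting point: the verifier still has to attempt the reversal and check whether the original signature then verifies.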

I suspect that pretty much any multipart/mixed may have an added part
containing a footer, but it might not.


Here's the *heuristic* I came up with (not yet implemented):

First of all, find out a purported author domain, in any of From:, Original-From:, or Reply-To:. Check if that domain differs from the one in the Sender:. Then check if there is a failed DKIM signature by the purported author domain. In that case, try and reverse the transformation.
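
A rough Python sketch of this first step, under some simplifying assumptions: each field holds a single address, domains are compared by exact match (a real d= may be a parent domain of the address), and the set of domains with failed signatures is supplied by the DKIM verifier. All function names are illustrative.

```python
from email.message import Message
from email.utils import parseaddr


def domain_of(header_value):
    """Return the lowercased domain of the first address in a field, or None."""
    _, addr = parseaddr(header_value or "")
    return addr.rpartition("@")[2].lower() if "@" in addr else None


def candidate_author_domains(msg: Message) -> list:
    """Collect purported author domains that differ from the Sender: domain."""
    sender_domain = domain_of(msg.get("Sender"))
    found = []
    for name in ("From", "Original-From", "Reply-To"):
        domain = domain_of(msg.get(name))
        if domain and domain != sender_domain and domain not in found:
            found.append(domain)
    return found


def should_try_reversal(msg: Message, failed_dkim_domains: set) -> bool:
    """True if a purported author domain has a failed DKIM signature."""
    return any(d in failed_dkim_domains for d in candidate_author_domains(msg))
```

Only when this check succeeds is it worth spending further passes of the signature algorithm on reversal attempts.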

For the header, if the From: was rewritten, put it back. If the Subject: starts with a bracket, remove it. If any Original-* (or whatever) are present, replace them.

For the body, look for a line consisting of underscores or dashes. Check it is in a text/plain MIME entity, either its own entity or the whole body. Check that line is not followed by more than, say, 10 lines of text at the end of the message. If found, remove it as per the table above.

I think that would work for most of the mailing lists I'm subscribed to, for "mild" DKIM signers. "Hard" DKIM signers, e.g. those who sign Sender:, will have to adjust Original-* fields by trial and error.
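
The subject and body steps of the heuristic might look like the Python sketch below. It is not implemented in my filter yet; the minimum separator length of five characters and the function names are assumptions, and the 10-line limit is the "say, 10" mentioned above.

```python
import re

# Assumption: a separator is a line of at least five underscores or dashes.
SEPARATOR = re.compile(r"^(?:_{5,}|-{5,})\s*$")


def strip_subject_tag(subject: str) -> str:
    """Remove one leading "[tag]" from a Subject: value, if present."""
    return re.sub(r"^\s*\[[^\]]*\]\s*", "", subject, count=1)


def remove_trailing_footer(body: str, max_footer_lines: int = 10) -> str:
    """Cut the body at the last separator line if a short footer follows it."""
    lines = body.splitlines(keepends=True)
    for i in range(len(lines) - 1, -1, -1):
        if SEPARATOR.match(lines[i].rstrip("\r\n")):
            trailing = len(lines) - i - 1
            if trailing <= max_footer_lines:
                return "".join(lines[:i])
            return body  # too much text after the separator: leave untouched
    return body
```

Whether the result is right can only be confirmed by verifying the original signature against the reversed message.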

At this point, I'm not so sure that specifying a DKIM-Transform: header field to ease transformation reversal is a good idea. Obviously, it would apply to future transformation only. In the future we'll have more powerful machines, so if the only advantage is to gain some efficiency, it becomes questionable.

Currently, there are mailing lists which don't do any change, not even subject tags, in order to avoid breaking DKIM signatures. A somewhat Procrustean solution.

It's the ONLY guaranteed solution, though, because avoiding rewriting is
only possible if you *know* that you're distributing modified posts only to
sites participating in your reversible modifications protocol (or ignore
DMARC p=reject).


Exactly.

I don't think From: rewriting is going to be disabled any time soon.

You're right.  You need universal deployment of reverse transformation to
make disabling rewriting palatable.

Reply-To: usually comes after From:, thereby requiring to go back to
change already parsed fields.

That's not a problem, since DKIM requires reordering fields anyway. The
expensive part is not fiddling with the header, it's multiple passes of the
signature algorithm.


Good point!

Knowing the kind of transformation beforehand can save one pass through the message.

As an alternative, I'd provide for yet another field to be put near the
top of the header.

It's not an alternative.  The changes to Reply-To or Cc are *necessary* (in
the opinion of the list admin, not Mailman) to preserve the ability of the
recipient MTA to respond to author.

The goal is different. Transformation reversal recognizes the original signature, thereby affecting the aggregate reports that senders receive. According to DMARC marketing, the latter might influence their decision to switch to a strict DMARC policy.

As a side effect, when the MDA sees that an Original-From: was authenticated, it can restore it in place of From:, after any external forwarding but before storing the message. That way, users recover the ability to reply privately to the author.

Original-From:, say. This may seem redundant, however it serves a different goal. In addition, if the Original-From: is put in place by the
original signer, it ratifies its knowledge that From: will be rewritten
and its willingness to recover it afterwards.

Could work, but addition of Original-From should be done by DMARC originators, not by Mailman.


Yes.

The name should probably be DMARC-Original-From, as well.

Yeah, or DKIM-Original-From, or whatever. It's important that it be new. I sometimes see X-Original-From:, which the new field shouldn't be conflated with. (Or should it?)

Note that, as mentioned above, if the author domain encodes a plain text message in base64, it should also add something like:


    Original-Content-Transfer-Encoding: base64; column-width=76

Is this endeavor completely useless, given that the current settings work
well enough?  Or could it help keeping a consistent DMARC semantics among
participants yearning to do so?  I'd be glad to hear your opinions...

I don't think it's useless, but I don't see any reason for Mailman to participate until there's a (1) specification of transformations that people want to be reversible, or (2) specific defects that if fixed, or (3) features that if added, would enable reversibility.

For (1), we would just guarantee a particular recognizable format for transformations that should be reversible, and (2) and (3) would be addressed as usual.

As mentioned, the hinting function can be done well-enough by a user-supplied Handler that looks up the list's configuration, determines the
transformations that are applied, and inserting the hints in the appropriate
place.

The draft I cited provided for an IANA registry containing the set of accepted reversible transformations. I'll try and propose an alternative specification, based on the above heuristic (once I have implemented it). The question of a DKIM-Transform: header field and a registry of its possible content should probably be discussed at that time.

One case I can think of is a list with an HTML footer. It may use an <hr> instead of many underscores. How would that heuristically be discovered? Adding an entry in a registry would be a way to add new possibilities. On the other hand, HTML presents a wider attack surface. Section 8.2 of DKIM still deprecates using l=, as it can be used for "exploiting lax HTML parsing in the MUA" in order for "the appended content to completely replace the original content in the end recipient's eyes".

Finally, contrary to what we all wish were true, this is not really a choice
for mailing lists.  It's a choice for recipient ad-dom border MTAs.  If they
don't buy in in large numbers, I'm not particularly interested in doing the
work.  I don't see why they would.

IME, MTAs are much more interested in DKIM signing than verifying. Anyway, sooner or later I'll release my filter with reverting capabilities. Even if it is not widely used, a site can at least recognize their own signatures when messages come back from a list.

Most lists I participate in do things like strip large attachments and strip
prohibited executables.  I think those are very common in general. Most
lists I participate in also strip text/html alternatives and many convert
text/html to text/plain.  If that's the common case, why would a postmaster
bother (unless they're a DMARC purist, of course, which may be a good thing
but I don't think there are very many of them)?


That is the list's choice.

If the postmasters aren't bothering, why would list admins?  If the list
admins aren't bothering, why should Mailman?

Perhaps it's me, but I feel a bit of reluctance in From: rewriting. It /has/ to be done, so savvy list admins do it. However, nobody seems to like it. If that feeling is correct, a little bit of adoption should show up...

On the other hand, I do think that Mailman can and should enable
better analytics on posts by ensuring that we only change parts of a
message that we intentionally change.  I guess that might include
situations like the case where Mailman changes a MIME part,
reassembles the whole message, and assigns a C-T-E that happens to
have the same semantics as the original C-T-E.



Those changes are benign.

I encourage you (or anyone in the reversible transformation effort) to
report inadvertent changes as bugs, and to suggest candidates for
"standard formats" we could adopt to make ad hoc reversals more
reliable (eg, the list name in a Subject tag should be enclosed in
square brackets and match the last component of the List-ID -- not
clear how that works with internationalized lists though).

We need to experiment. For example, author domains may want to try adding an Original-Subject:.

I can't speak for other developers at this point, so I can't promise
any proposals would be implemented, but I'm certainly interested and
in some cases would definitely be an "advocate on the inside".



Thank you for your commitment.


Best
Ale
--
