Hi all, I noticed some multipart messages were being rendered incorrectly on the git mailing list archives I sometimes(*) run at https://public-inbox.org/git/ using Email::MIME.
I noticed this problem with Email::MIME 1.926 on Debian stable, but also reproduced it with 1.937 from unstable. The problem manifests when a part has no headers.

For example, the first rendered part of this message is missing the first 3 lines of the body:

    https://public-inbox.org/git/20170110004031.57985-1-hans...@google.com/

It shows the following (indented 4 spaces in this email):

    [-- Attachment #1: Type: text/plain, Size: 166 bytes --]

     Documentation/diff-config.txt  | 5 ++++-
     Documentation/diff-options.txt | 3 ++-
     2 files changed, 6 insertions(+), 2 deletions(-)

    -- 
    184.108.40.2060.gc69c2f50cf-goog

However, closer inspection shows 3 lines missing off the top of the body compared to what I see in mutt, or in the "raw" mboxrd message:

    https://public-inbox.org/git/20170110004031.57985-1-hans...@google.com/raw

The problem seems to be that the newline before the "Richard Hansen (2):" line is ignored because no headers follow the "--94eb2c0bc864b76ba30545b2bca9" boundary, and that fools Email::MIME.

Here's the first part with boundary (indented 4 spaces, again):

    --94eb2c0bc864b76ba30545b2bca9

    Richard Hansen (2):
      diff: document behavior of relative diff.orderFile
      diff: document the pattern format for diff.orderFile

     Documentation/diff-config.txt  | 5 ++++-
     Documentation/diff-options.txt | 3 ++-
     2 files changed, 6 insertions(+), 2 deletions(-)

    -- 
    220.127.116.110.gc69c2f50cf-goog

So, there are no mail headers in the first part, but Email::MIME thinks part of the body contains headers!

Attached is a standalone script which shows the problem independently.

Reading the parts_multipart sub in Email::MIME, I can see there are quite a few quirks in there and I'm not comfortable messing with already-working cases (at least not at this hour).

I checked RFC 1521 and it appears parts without headers are allowed. But I'm actually not sure how Richard generates his emails and signatures.
I don't think anybody else cryptographically signs patches on g...@vger.kernel.org, nor was it ever encouraged for Linux kernel development (and S/MIME signatures are huge, even compared to PGP).

(*) In the likely case public-inbox.org is down, there are at least two Tor .onion mirrors in different locations using the same URL patterns:

    http://hjrcffqmbrq6wope.onion/git/<Message-ID>/
    http://czquwvybam4bgbro.onion/git/<Message-ID>/
use strict;
use warnings;
use Email::MIME;

# The first part below has no headers: the boundary line is followed
# directly by an empty line and then the body.
my $msg = Email::MIME->new('From: Richard Hansen <hans...@google.com>
To: g...@vger.kernel.org
Cc: Richard Hansen <hans...@google.com>
Subject: [PATCH 0/2] minor diff orderfile documentation improvements
Date: Mon, 9 Jan 2017 19:40:29 -0500
Message-Id: <20170110004031.57985-1-hans...@google.com>
X-Mailer: git-send-email 18.104.22.1680.gc69c2f50cf-goog
Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha-256; boundary="94eb2c0bc864b76ba30545b2bca9"

--94eb2c0bc864b76ba30545b2bca9

Richard Hansen (2):
  diff: document behavior of relative diff.orderFile
  diff: document the pattern format for diff.orderFile

 Documentation/diff-config.txt  | 5 ++++-
 Documentation/diff-options.txt | 3 ++-
 2 files changed, 6 insertions(+), 2 deletions(-)

-- 
22.214.171.1240.gc69c2f50cf-goog

--94eb2c0bc864b76ba30545b2bca9
Content-Type: application/pkcs7-signature; name="smime.p7s"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="smime.p7s"
Content-Description: (truncated) S/MIME Cryptographic Signature

dkTlB69771K2eXK4LcHSH/2LqX+VYa3K44vrx1ruzjXdNWzIpKBy0weFNiwnJCGofvCysM2RCSI1

--94eb2c0bc864b76ba30545b2bca9--
');

# Print what Email::MIME considers the header and the body of the first part.
my @parts = $msg->parts;
print "<HEADER>\n", $parts[0]->header_obj->as_string, "\n</HEADER>\n";
print "<BODY>\n", $parts[0]->body, "\n</BODY>\n";
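
For comparison, here is a rough sketch of the splitting rule I would expect for a single part, as I read RFC 1521: if the text after the boundary line starts with an empty line, the part has no headers and everything after that empty line is body. The split_part sub is a hypothetical helper written only for this illustration. It is not part of Email::MIME and not a proposed patch to parts_multipart, and it ignores the other quirks that sub handles.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Email::MIME): split the raw text of one
# MIME part, i.e. everything following the boundary line, into header text
# and body text.
sub split_part {
    my ($raw) = @_;

    # A part that begins with an empty line has no headers at all;
    # strip that empty line and treat the rest as body.
    if ($raw =~ s/\A\r?\n//) {
        return ('', $raw);
    }

    # Otherwise the first empty line separates headers from body.
    if ($raw =~ /\A(.*?\r?\n)\r?\n(.*)\z/s) {
        return ($1, $2);
    }

    # No empty line found: assume the whole text is header.
    return ($raw, '');
}

# Shortened copy of the headerless first part from the message above.
my $part = join "\n",
    '',    # empty line right after the boundary
    'Richard Hansen (2):',
    '  diff: document behavior of relative diff.orderFile',
    '  diff: document the pattern format for diff.orderFile',
    '',
    ' 2 files changed, 6 insertions(+), 2 deletions(-)',
    '';

my ($head, $body) = split_part($part);
print "<HEADER>\n$head</HEADER>\n";
print "<BODY>\n$body</BODY>\n";
```

With this rule the "Richard Hansen (2):" and "diff: ..." lines stay in the body, since the leading empty line is consumed before any header matching is attempted.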