Re: Legal values for a message-id, and references header

Cameron Simpson Sat, 19 Nov 2016 23:53:54 -0800

On 19Nov2016 19:58, Kevin J. McCarthy <ke...@8t8.us> wrote:

On Sun, Nov 20, 2016 at 10:08:13AM +1100, Cameron Simpson wrote:

On 19Nov2016 13:13, Kevin J. McCarthy <ke...@8t8.us> wrote:
> Should mutt be rfc-2047 encoding/decoding the references
> header?


No. RFC2047 tokens need to be whitespace delimited from the surrounding
text.  No whitespace is permitted inside the "<" and ">" markers which
enclose a message-id:


Thank you for your detailed analysis, Cameron.  I will take a deeper
look at this soon.  Another piece of information is that they sent a
reply through the Fastmail web interface, which sent this:

 References: =?utf-8?Q?=3C201611170549=2EQ3WT?=
   =?utf-8?Q?foMB=C3=83=C2=BEngguang=2Ewu=40i?=
   =?utf-8?Q?ntel=2Ecom=3E=20=3C1479410?= =?utf-8?Q?777-6702-1-git-sen?=
   =?utf-8?Q?d-email-manuel=2Esch?= =?utf-8?Q?oelling=40gmx=2Ede=3E?=

 <https://gist.github.com/andrey-utkin/c9cf4f2dc282cf257a2552b06ede49d5>

If this is legal, then mutt needs to be decoding the References before
trying to parse out the ids, because I believe it will just choke on
this.


Wow. I would have thought that was illegal.

Regarding the discussion below, the TL;DR is that I think that if it is feasible mutt should decode these, but write _unencoded_ versions of these headers and any headers derived from them. In particular, is it easy to make mutt's header ingestion code go "strict parse, but if that fails decode with RFC2047 and try a second time"? Probably on a specific header basis.
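
A rough Python sketch of that "strict parse, then decode and retry" idea, purely as illustration and not mutt's actual code (mutt is C). The names `MSGID_RE`, `strict_ids`, `rfc2047_decode` and `parse_references` are invented here, the message-id pattern is deliberately loose rather than the RFC5322 grammar, and the only library call is Python's standard `email.header.decode_header`. The sample value is the Fastmail References header quoted above.

```python
import re
from email.header import decode_header

# Deliberately loose message-id pattern (an assumption, not the RFC 5322 grammar):
# "<", then no whitespace, angle brackets or "@", then "@", then the same, then ">".
MSGID_RE = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")

def strict_ids(value):
    """Return the ids if value is only whitespace-separated message-ids, else None."""
    tokens = value.split()
    if tokens and all(MSGID_RE.fullmatch(t) for t in tokens):
        return tokens
    return None

def rfc2047_decode(value):
    """Decode any RFC 2047 encoded-words in value and return a str."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            data = data.decode(charset or "ascii", errors="replace")
        parts.append(data)
    return "".join(parts)

def parse_references(value):
    """First pass: strict parse. Second pass: decode, then pick out the ids."""
    ids = strict_ids(value)
    if ids is None:
        ids = MSGID_RE.findall(rfc2047_decode(value))
    return ids

FASTMAIL_REFERENCES = "\n".join([
    "=?utf-8?Q?=3C201611170549=2EQ3WT?=",
    "  =?utf-8?Q?foMB=C3=83=C2=BEngguang=2Ewu=40i?=",
    "  =?utf-8?Q?ntel=2Ecom=3E=20=3C1479410?= =?utf-8?Q?777-6702-1-git-sen?=",
    "  =?utf-8?Q?d-email-manuel=2Esch?= =?utf-8?Q?oelling=40gmx=2Ede=3E?=",
])

for msgid in parse_references(FASTMAIL_REFERENCES):
    print(msgid)
```

A well-formed header never reaches the decoder, so the fallback only costs anything on input that already failed the strict pass. An unknown charset label would make `bytes.decode` raise `LookupError`; a real implementation would need to decide what to do there.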


Regarding the standards:

RFC2047 doesn't actually enumerate specific headers, but section 5 has a list of permitted and forbidden places for "encoded-words" (which the above are). I'm going to quote the bits I think are pertinent but please read it to see if I'm missing anything:

An 'encoded-word' may appear in a message header or body part header according to the following rules:

(1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822) in any Subject or Comments header field, any extension message header field, or any MIME body part field for which the field body is defined as '*text'. An 'encoded-word' may also appear in any user-defined ("X-") message or body part header field.

Message-IDs are not "text" in RFC822 and its modern form RFC5322. So I'd say (1) does not permit this. A 'text' token is defined as:


  text            =   %d1-9 /            ; Characters excluding CR
                      %d11 /             ;  and LF
                      %d12 /
                      %d14-127

(1) _does_ say "any MIME body part field for which the field body is defined as '*text'". But '*text' means zero or more 'text' tokens, and Message-ID: et al are not MIME fields.

(2) An 'encoded-word' may appear within a 'comment' delimited by "(" and ")", i.e., wherever a 'ctext' is allowed. More precisely, the RFC 822 ABNF definition for 'comment' is amended as follows:


 comment = "(" *(ctext / quoted-pair / comment / encoded-word) ")"

This doesn't cover Message-IDs.

(3) As a replacement for a 'word' entity within a 'phrase', for example, one that precedes an address in a From, To, or Cc header. The ABNF definition for 'phrase' from RFC 822 thus becomes:


 phrase = 1*( encoded-word / word )

But a 'phrase' is just one or more 'word's and Message-IDs are not 'word's. RFC2047 goes on to say that _any_ other use is forbidden, and tries to be really clear about that:

These are the ONLY locations where an 'encoded-word' may appear. In particular:


+ An 'encoded-word' MUST NOT appear in any portion of an 'addr-spec'.

+ An 'encoded-word' MUST NOT appear within a 'quoted-string'.

+ An 'encoded-word' MUST NOT be used in a Received header field.

+ An 'encoded-word' MUST NOT be used in parameter of a MIME Content-Type or Content-Disposition field, or in any structured field body except within a 'comment' or 'phrase'.

So I think Fastmail are playing fast and loose, and while mutt should try to cope, it sure as hell should never _emit_ this nonsense!
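
For the emitting side, a minimal Python sketch of a guard, again only an illustration and not mutt code. `ENCODED_WORD_RE` and `build_references` are hypothetical names, and the pattern is a simplified reading of the RFC2047 'encoded-word' shape rather than its full grammar. It joins plain message-ids into a References value and refuses anything that still looks like an encoded-word.

```python
import re

# Simplified encoded-word shape: =?charset?B-or-Q?encoded-text?=
ENCODED_WORD_RE = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]+\?=")

def build_references(parent_ids, parent_message_id):
    """Join already-decoded ids into an unencoded References value."""
    ids = list(parent_ids) + [parent_message_id]
    for msgid in ids:
        if ENCODED_WORD_RE.search(msgid):
            raise ValueError("refusing to emit encoded-word in References: %r" % msgid)
    return " ".join(ids)

print(build_references(["<a@one.example>"], "<b@two.example>"))
```

The point of checking at build time is that ids decoded on the way in are stored decoded, so nothing derived from them (References, In-Reply-To) can carry the encoding back out.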


Cheers,
Cameron Simpson <c...@zip.com.au>
