In theory, a message-id has the form <user-part@domain-part> [(comment)]

Unfortunately, quite a few ids in the archives don't match this form.

For example:

There are several on the cocoon lists around April 2004 with no trailing >

https://lists.apache.org/api/source.lua?id=c5ytton7pt6dvzbkvk9o9p21o7gnnd72

There are others with non-comment trailing text after the >
https://lists.apache.org/api/source.lua?id=8a82cf50c982673b8e2f5c03374c8d4668a54ccdbf103b469f63ddc9@%3Cbuilds.flink.apache.org%3E

Many of these are recent, for example:
https://lists.apache.org/api/source.lua?id=jovkcx55v2xz31rsvq56mflwdfr0fgdm
This has the header:
Message-ID: <[email protected]>+942C722077850D4E
However, the mbox database index has dropped the text after the >

Whereas mod_mbox keeps it:
http://mail-archives.apache.org/mod_mbox/kafka-users/202111.mbox/%[email protected]%3e+942C722077850D4E

Whilst these ids may be invalid according to the RFCs, they are clearly
still in use, so the code should try to accommodate them. It looks
like the compat32 parser handles them, whereas the default parser
does not.
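
As a rough illustration of the compat32 behaviour, here is a minimal sketch that assumes the code in question uses Python's email package. The header value is hypothetical: it is modelled on the kafka-users example, but the part inside the angle brackets is made up. The helper names raw_message_id and split_message_id are also invented for this sketch and are not taken from the existing code.

```python
import re
from email import message_from_string, policy

# Hypothetical message modelled on the kafka-users example above;
# the id inside the angle brackets is made up for illustration.
RAW = (
    "Message-ID: <abc123@example.invalid>+942C722077850D4E\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def raw_message_id(source):
    """Return the Message-ID value as stored by the compat32 policy.

    compat32 hands back the header text without structured parsing,
    so any text after the closing '>' is still present.
    """
    msg = message_from_string(source, policy=policy.compat32)
    value = msg.get("Message-ID")
    return None if value is None else str(value).strip()


def split_message_id(value):
    """Split a header value into (bracketed id, trailing text).

    Tolerates a missing closing '>' and non-comment text after it.
    Returns (None, value) if the value does not start with '<'.
    """
    match = re.match(r"\s*(<[^>]*>?)(.*)", value, re.DOTALL)
    if match is None:
        return None, value.strip()
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    value = raw_message_id(RAW)
    print(value)
    print(split_message_id(value))
```

Keeping the trailing text separate rather than discarding it would let an index store both forms, so a lookup could match either the bare id or the full header value as mod_mbox shows it.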
