GitHub user Humbedooh opened an issue:
https://github.com/apache/incubator-ponymail/issues/394
Archiving emails in violation of RFC-2821 line-endings may result in
multiple emails on a redundant setup
It seems that some (older) MTAs do not conform to the RFC-2821 rules on
newlines when sending email. The RFC states:
~~~
In addition, the appearance of "bare" "CR" or "LF" characters in text
(i.e., either without the other) has a long history of causing
problems in mail implementations and applications that use the mail
system as a tool. SMTP client implementations MUST NOT transmit
these characters except when they are intended as line terminators
and then MUST, as indicated above, transmit them only as a <CRLF>
sequence.
~~~
Case in point: qmail will sometimes send an email using only LF instead of
CRLF. Postfix then corrects this to CRLF, but in clustered setups one archiver
may receive the original input while the next gets the corrected one. The
difference is only a single added newline character, but that is enough to
cause two distinct IDs to be generated for the same email.
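To illustrate why one extra byte matters, here is a minimal sketch. It assumes
the ID is a digest of the raw message bytes; the name `generate_id` and the
choice of SHA-256 are illustrative assumptions, not necessarily what
archiver.py does.

```python
import hashlib

def generate_id(raw: bytes) -> str:
    """Hypothetical ID generator: hex digest of the raw message bytes."""
    return hashlib.sha256(raw).hexdigest()

# The same message as seen by two archivers in a cluster.
original = b"Subject: test\n\nHello\n"
corrected = original + b"\n"  # one extra trailing newline

id_a = generate_id(original)
id_b = generate_id(corrected)

# Any digest over the raw bytes will differ between the two inputs.
print(id_a == id_b)
```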
Short of fixing all MTAs, the best solution seems to be to detect any STDIN
input that ends in a double newline and, if found, crop the last newline
before archiving.
The fix seems to be as simple as (in archiver.py, line 580-ish):
~~~
if msgstring[-2:] == b'\n\n':
    msgstring = msgstring[:-1]
~~~
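As a rough sketch, the same check can be wrapped in a helper to confirm that
both variants of a message end up identical. The function name
`crop_trailing_newline` and the sample messages are hypothetical; only the
double-LF check comes from the snippet above.

```python
def crop_trailing_newline(msgstring: bytes) -> bytes:
    """Drop one trailing LF when the message ends in a double LF."""
    if msgstring[-2:] == b"\n\n":
        return msgstring[:-1]
    return msgstring

# Hypothetical inputs: what two archivers in a cluster might receive.
original = b"Subject: test\n\nHello\n"
corrected = original + b"\n"  # one extra trailing newline

# After cropping, both archivers would see the same bytes.
normalized_a = crop_trailing_newline(original)
normalized_b = crop_trailing_newline(corrected)
```

Note that this only handles a single extra LF at the very end of the input;
whether that covers every MTA combination is still to be verified.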
I'll investigate further and implement a solution when I am satisfied this
will resolve the issue.