On Thu, May 15, 2025 at 09:18:53PM -0700, Will Yardley wrote:
> It's interesting that in replies, Mutt uses >From escaping, though in
> the actual raw mbox file, it's not present.
No, you're confusing two different uses of '>' at the beginning of a line.
...
Yes, I understand how quote markers look, but you'll notice that any
body (non-header) lines starting with a leading "From " are escaped with
a leading '>' (no space).
That's the second use. It's part of how mbox works.
And I realized that yes, it's present in the raw file -- in the context
it's supposed to be (lines _other_ than the "postmark line" or the From:
header lines).
Yes, if the raw file is an mbox file. Other mail storage mechanisms
don't need that.
I was just confused because the earlier post had seemed
to imply that this was not done / needed for MBOXCL2.
It's not, because that variant has Content-Length:.
Let's say you want to put multiple variable-length chunks of text in a
single file. When you want to read one of those chunks, how do you find
the end? It can only work if whatever wrote the file supplied extra
information. It can either put a marker after each chunk of text, or
put the length of each chunk before it.
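To make those two options concrete, here is a rough sketch in Python. Nothing in it is mbox yet; the function names and the "--END--" marker are made up purely for illustration.

```python
MARKER = b"--END--\n"  # hypothetical end marker, chosen only for this sketch


def write_with_marker(chunks):
    # Option 1: put a marker after each chunk.
    return b"".join(chunk + MARKER for chunk in chunks)


def read_with_marker(data):
    # Everything up to each marker is one chunk; drop the empty tail.
    return data.split(MARKER)[:-1]


def write_with_lengths(chunks):
    # Option 2: put the byte length of each chunk on a line before it.
    return b"".join(b"%d\n" % len(chunk) + chunk for chunk in chunks)


def read_with_lengths(data):
    chunks, pos = [], 0
    while pos < len(data):
        eol = data.index(b"\n", pos)      # end of the length line
        size = int(data[pos:eol])
        start = eol + 1
        chunks.append(data[start:start + size])
        pos = start + size
    return chunks
```

The marker version only works as long as no chunk happens to contain the marker itself. The length version doesn't care what is inside the chunks.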
Original mbox (mboxo) uses a marker. Well, not an end marker: it looks
for the beginning of the next message, which is identified by the line
that mbox puts before each message, which looks like this:
From nobody@nowhere.invalid Thu Jan 1 00:00:00 1970
That's the mbox "From " line, or From_ line, or postmark line. Trouble
is, it's mostly variable. That email address varies, and it might not
even be an email address; it can be pretty much any string. The
date-time syntax varies. The only thing you can count on is that the
line starts with "From " (note the space character). So to find the end
of the message, you keep reading until you get a line that starts with
"From ", and then you back up a little.
But a message line can start with "From ", too. That would look like
the beginning of the next message, but it's not. So when the message
was written into the file, any message line that starts with "From "
got a '>' put in front of it, so it can't be mistaken for the mbox
postmark line before the next message. But now that message line is damaged.
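If it helps, here is a small Python sketch of both halves: the writer that escapes, and the reader that scans for "From ". It is a simplification, not what Mutt or any other real program does. The function names are invented, and it assumes the writer puts one blank line after each message, which the reader strips again (that's the "back up a little" part).

```python
POSTMARK = b"From nobody@nowhere.invalid Thu Jan 1 00:00:00 1970\n"


def escape_from_lines(message: bytes) -> bytes:
    # Put '>' in front of every line that starts with "From ".
    lines = message.splitlines(keepends=True)
    return b"".join(b">" + ln if ln.startswith(b"From ") else ln
                    for ln in lines)


def write_mboxo(messages) -> bytes:
    out = []
    for message in messages:
        out.append(POSTMARK)                 # marks the start of a message
        out.append(escape_from_lines(message))
        out.append(b"\n")                    # blank line after each message
    return b"".join(out)


def split_mboxo(data: bytes) -> list:
    messages, current = [], None

    def finish():
        text = b"".join(current)
        # Back up over the blank line the writer added.
        messages.append(text[:-1] if text.endswith(b"\n") else text)

    for line in data.splitlines(keepends=True):
        if line.startswith(b"From "):
            # Looks like a postmark line: the previous message ends here.
            if current is not None:
                finish()
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        finish()
    return messages
```

Feed split_mboxo() a file where a body line starting with "From " was not escaped, and it splits that message in two. Also note that the reader hands back ">From " lines as they are: in mboxo it has no way to tell whether the '>' was added by the writer or was in the message all along.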
Mboxcl2 does it the other way. Instead of using a marker, it uses a
length, stored in the non-standard header Content-Length:. Software
reads the message header section and in there finds the length of the
message body, and then just reads that many bytes, and then it has the
whole message. It doesn't need to scan for the next mbox From_ line, so
whatever wrote the file didn't have to change any message lines that
start with "From ", to make them start with ">From " instead.
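A reader along those lines might look roughly like this in Python. It is only a sketch: read_mboxcl2 is an invented name, and it assumes that each message is still preceded by a postmark line, that Content-Length: counts exactly the body bytes as stored, that there are no folded header lines, and that messages are separated by blank lines.

```python
def read_mboxcl2(data: bytes) -> list:
    messages, pos = [], 0
    while pos < len(data):
        # Skip the postmark line in front of the message.
        start = data.index(b"\n", pos) + 1
        # The header section ends at the first blank line.
        body_start = data.index(b"\n\n", start) + 2
        length = None
        for line in data[start:body_start].splitlines():
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if length is None:
            raise ValueError("message has no Content-Length: header")
        # Read exactly that many bytes; no scanning for "From ".
        body_end = body_start + length
        messages.append(data[start:body_end])
        pos = body_end
        # Skip the blank line(s) between messages.
        while data[pos:pos + 1] == b"\n":
            pos += 1
    return messages
```

A body line that starts with "From " passes through untouched, because the reader never looks at the body lines at all. The flip side is that a wrong Content-Length: value sends the reader off into the middle of some other message.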
If you think this is a mess, you're right. People have tried to fix it,
but it can't really be fixed. The only good solution is to not use
mbox, and store mail in other ways instead.
One other way is maildir, which puts each message in a separate file.
End of file is end of message, and no other information is needed. No
end marker, no length, no mbox-style From_ line, just the RFC 822
message itself, unmodified. Works fine.
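For comparison, reading a maildir needs almost no code. This Python sketch (read_maildir is an invented name) assumes the usual layout where delivered messages sit in the "new" and "cur" subdirectories, and it ignores the flags that maildir encodes in file names.

```python
from pathlib import Path


def read_maildir(root) -> list:
    messages = []
    for sub in ("new", "cur"):
        for path in sorted((Path(root) / sub).glob("*")):
            if path.is_file():
                # One file is one message: nothing to scan, nothing to unescape.
                messages.append(path.read_bytes())
    return messages
```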