Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-30 Thread Ken Hornstein
>Not qmail?  http://qmail.org/man/man5/mbox.html describes mboxrd by
>default, and then describes how qmail-local locks the file.

Dang.  Well, I see that the default is for qmail to use Maildir (not a
surprise), but I see that it unconditionally uses mboxrd to write a mbox
maildrop.

>> I don't believe that nmh ever munges a From line in an email body; I
>> cannot find any code that does so, but if there is please let me know.
>
>uip/dropsbr.c's mbx_copy() does, used by packf(1), rcvpack(1), and
>slocal(1).  I was in that file the other day, deleting all the
>`map'-index-file code.

I guess I'm 0 for 2 today.

This suggests to me that everything that reads and writes a mbox file
should have the option to read/write mboxrd, in addition to mboxo
format.  I do not believe it is possible to autodetect which format is
in use, but obviously I've been wrong a lot lately.

--Ken

___
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers


Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-30 Thread Ralph Corderoy
Hi Ken,

> But AFAICT all MTAs store their maildrops in mboxo format (at least
> the ones that use mbox format do).  I couldn't find one that used
> mboxrd format

Not qmail?  http://qmail.org/man/man5/mbox.html describes mboxrd by
default, and then describes how qmail-local locks the file.

> > I see nmh 1.6 munges From with `>'.
>
> I don't believe that nmh ever munges a From line in an email body; I
> cannot find any code that does so, but if there is please let me know.

uip/dropsbr.c's mbx_copy() does, used by packf(1), rcvpack(1), and
slocal(1).  I was in that file the other day, deleting all the
`map'-index-file code.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-29 Thread Ken Hornstein
>> I realize this wouldn't happen for likely a while, but would people be
>> happy with just mboxo and mboxrd support in terms of maildrop parsing?
>
>Those two seem sufficient, but the user would have to state which their
>system used?

Well, yeah?  I mean, I don't see how you could configure it automatically.

But, I found out some stuff later on that maybe rendered this moot.

There is a bit of confusion between mailBOXES and mailDROPS.  The former is
storage of existing mail; the latter is used by a MTA to store new email
where a MUA can retrieve it.

It seems like some MUAs use mboxrd for internal email storage.  This
mostly does not concern us (except maybe we'd want to have people run
"inc" on those files).  But AFAICT all MTAs store their maildrops in
mboxo format (at least the ones that use mbox format do).  I couldn't
find one that used mboxrd format; if people know of one, please let me
know.  That suggests to me that implementing mboxrd format mostly isn't
worth it.

>I see nmh 1.6 munges From with `>'.  It could switch to Quoted-Printable
>and `=46rom'?

I don't believe that nmh ever munges a From line in an email body; I cannot
find any code that does so, but if there is please let me know.  I believe
nmh receives an already-munged From line and we're kind of stuck there.

As for automatically switching to q-p, well ... there was what I
can only describe as a ridiculous outcry when that happened in 1.6
(primarily due to the new auto-MIMEification and lines too long), to the
point where some people put stuff in their .mh_profile that they THOUGHT
would prevent that from happening, even though they hadn't upgraded
to 1.6!  Just the thought of having their outgoing email encoded in
q-p freaked them out for reasons I cannot explain.  If we added such a
feature we'd need a way to disable it, and I expect the people who needed
it the most would make sure they never used it.

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-27 Thread Ralph Corderoy
Hi Ken,

> I realize this wouldn't happen for likely a while, but would people be
> happy with just mboxo and mboxrd support in terms of maildrop parsing?

Those two seem sufficient, but the user would have to state which their
system used?

> And does anyone see Content-Length headers in their maildrops anymore?

No, I think that only survived for any length of time in Usenet?  I see
Postfix's cleanup(8) deletes them.

    Postfix now strips out Content-Length: headers in incoming mail to
    avoid confusion in mail user agents.

formail(1) takes note of them.

I see nmh 1.6 munges From with `>'.  It could switch to Quoted-Printable
and `=46rom'?
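
For illustration, a minimal Python sketch of the `=46rom` idea. It assumes the body is already valid quoted-printable (any literal `=` is already `=3D`) and that the message carries a matching Content-Transfer-Encoding header; `qp_protect_from` is a hypothetical helper, not anything in nmh.

```python
import quopri

def qp_protect_from(body: bytes) -> bytes:
    """Rewrite a leading 'From ' as '=46rom ' on every line (hypothetical helper).

    Assumes `body` is otherwise already quoted-printable, so '=46' decodes
    back to 'F' and nothing else in the text is disturbed.
    """
    return b"\n".join(
        b"=46" + line[1:] if line.startswith(b"From ") else line
        for line in body.split(b"\n")
    )

original = b"Hi,\nFrom here on it gets tricky.\n>From is left alone.\n"
protected = qp_protect_from(original)

# Any standard quoted-printable decoder undoes the change exactly.
restored = quopri.decodestring(protected)
```

Unlike prefixing `>`, the change is undone by an ordinary MIME decoder, so no mbox-specific dequoting step is needed on the reading side.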

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-25 Thread Jon Fairbairn
Ken Hornstein writes:

>>No, it always was in band - the 4-SOH sequence was searched for in all
>>lines of the message, and SOH has always been a possible character in
>>e-mail.   Just even more unlikely years ago than it is now.
>
> You know, I _was_ going to disagree here but Robert is, as he almost
> always is, 100% correct.  4-SOH is not valid in an email HEADER
> (mostly), but it is certainly valid in a message BODY, and this goes all
> the way back to RFC 822.  There were some minor changes along the way
> (RFC 822 said NULs were valid, but RFC 2822 said they were not), but SOH
> has always been a valid character in email bodies; MIME didn't change
> this one bit.

Well, gosh. I stand corrected; I should have read RFC 822 before
making that decision (back whenever it was). I can only assume
that I had based it on what I thought was allowed in mail before
RFC822. If I had been designing SMTP I wouldn’t have allowed all
128 ASCII characters. The first 8 would have been forbidden, for
a start. Then we could have used ETX to mark the end of the
body, and not <CRLF>.<CRLF>, which can legitimately appear in a
text message. But I wasn’t, so they weren’t and we couldn’t. Oh
well.

-- 
Jón Fairbairn jon.fairba...@cl.cam.ac.uk




Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Ralph Corderoy
Hi Ken,

> > msh has gone post 1.6.  Can inc's -pack die too
> > before 1.7?  It seems there's quite a bit of lingering msh rotting away.
> > :-)
>
> I say, "Oh, hell YES!"

Done that with 6170b76c.  I think I've jiggled test/inc/test-pop
correctly;  one of its -pack uses wasn't -pack related so needed to
remain, sans -pack.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Ken Hornstein
>inc(1) implies above that `inc -host pop3.example.com -pack spool.mmdf'
>is for msh(1) users.  msh has gone post 1.6.  Can inc's -pack die too
>before 1.7?  It seems there's quite a bit of lingering msh rotting away.
>:-)

I say, "Oh, hell YES!"

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Ralph Corderoy
Hi again,

More woes.

> I take this to be an MH-only file, unused by other parties.  Thus if
> nothing in nmh reads them then it doesn't have to bother to create and
> maintain them any more?  And so all the supporting code can also go
> pre-1.7?

The main user in creating map files is inc(1), but only when POP3 is
being used, and only then if -pack has been given.  Before Larry Hynes
deleted the interesting bit, inc(1) used to say

    If inc uses POP, then the -pack file switch is considered.  If
    given, then inc simply uses the POP to packf the user's maildrop
    from the POP service host to the named file.  This switch is
    provided for those users who prefer to use msh to read their
    maildrops.

The code duplicates the packf code that's elsewhere, always assumes
packf's -mmdf, so no mbox option, and has the ^A^A^A^A munging but
without the buffer-boundary bug because it stdios by line.

Untwining inc's map code whilst leaving -pack seems awkward, e.g. it may
be appending to an existing pack file and so looks up, using the map
file, how many messages are in it first so scan() can give the following
message numbers.

inc(1) implies above that `inc -host pop3.example.com -pack spool.mmdf'
is for msh(1) users.  msh has gone post 1.6.  Can inc's -pack die too
before 1.7?  It seems there's quite a bit of lingering msh rotting away.
:-)

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Ralph Corderoy
Hi,

I asked:
> Does anyone know about the `map' file that is built by packf?
...
> I don't think MH uses them so they're probably leftovers to feed to
> the swine.

I suspect the last user was msh(1), removed by
1.6-branchpoint-16-ge6917522, i.e. after 1.6's release.

A `map' file is an array of `struct drop'.  I found this comment.

 * A file which is formatted like a maildrop may have a
 * corresponding map file which is an index to the bounds of each
 * message.  The first record of such a map is special, it
 * contains:
 *
 *  d_id= number of messages in file
 *  d_size  = version number of map
 *  d_start = last message read
 *  d_stop  = size of file
 *
 *  Each record after that contains:
 *
 *  d_id= BBoard-ID: of message, or similar info
 *  d_size  = size of message in ARPA Internet octets (\n == 2 octets)
 *  d_start = starting position of message in file
 *  d_stop  = stopping position of message in file
 *  
 * Note that d_start/d_stop do NOT include the message delimiters,
 * so programs using the map can simply fseek to d_start and keep
 * reading until the position is at d_stop.

I take this to be an MH-only file, unused by other parties.  Thus if
nothing in nmh reads them then it doesn't have to bother to create and
maintain them any more?  And so all the supporting code can also go
pre-1.7?

With luck, that code won't have coverage testing so it will also boost
our coverage percentage.  :-)

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Ken Hornstein
>Well, using RFC 4155 parse rules for From_ lines can lead to
>different message boundary detection.

Sigh.  I don't view that RFC as particularly relevant now, as it has
some serious problems (like only 7-bit data, for one).  Also, it explicitly
says there is no escaping mechanism; I guess the idea was to define other
formats, but that never happened.

>I removed that from the MUA i maintain, because you never know for
>sure: for that you would need to scan the entire message first,
>check if several From_ lines occur and have been quoted alike,
>before you start to remove what you think is a superficial From_
>quote.

It seems to me that the safest bet would be to default to mboxo, and
if the user knew he was dealing with mboxrd they could add the appropriate
switch to do the dequoting.  That seems like a system parameter; either
your local MTA is doing mboxo or mboxrd, you shouldn't need to perform
any automatic detection.
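
A rough Python sketch of what such a switch could look like. The `fmt` parameter and the function name are hypothetical; this is not nmh's parser, only an illustration of "default to mboxo, dequote only when told the drop is mboxrd".

```python
import re

# mboxrd quoting adds one '>' to lines matching />*From /; dequoting removes one.
MBOXRD_QUOTED = re.compile(rb"^>+From ")

def split_mbox(data: bytes, fmt: str = "mboxo") -> list:
    """Split maildrop bytes into messages (sketch; `fmt` is a hypothetical switch).

    Both formats start a message at a line beginning 'From '.  With
    fmt='mboxrd' one leading '>' is stripped from lines matching />+From /.
    With the default 'mboxo' bodies are returned exactly as stored, because
    an added '>' cannot be told apart from an original one.
    """
    if fmt not in ("mboxo", "mboxrd"):
        raise ValueError(f"unknown mbox format: {fmt}")
    messages, current = [], None
    for line in data.splitlines(keepends=True):
        if line.startswith(b"From "):
            if current is not None:
                messages.append(b"".join(current))
            current = []      # the From_ separator line itself is dropped
            continue
        if current is None:
            continue          # ignore anything before the first separator
        if fmt == "mboxrd" and MBOXRD_QUOTED.match(line):
            line = line[1:]
        current.append(line)
    if current is not None:
        messages.append(b"".join(current))
    return messages

drop = (
    b"From a@example.com Wed May 24 09:50:29 2017\n"
    b"Subject: one\n\n>From the start\n>>From quoted\n\n"
    b"From b@example.com Wed May 24 10:00:00 2017\n"
    b"Subject: two\n\nbody\n"
)
as_mboxo = split_mbox(drop)
as_mboxrd = split_mbox(drop, fmt="mboxrd")
```

The same bytes give different message bodies depending on the switch, which is why the choice has to come from the user or the system configuration rather than from the data.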

>And then ezml i think it was (what unicode.org had before
>the switch to i think mailman) simply placed a space in the first
>column...  Any non truly-reversible change changes the original
>content, applying a MIME content-encoding is standardized, there
>you go.

I don't think that's relevant to nmh?  If mailing list software is changing
things en-route that's nothing we should be concerned with.

>mutt actively manages these header lines, at least.

Also, not relevant to nmh, I think.

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Steffen Nurpmeso
Ken Hornstein wrote:
 |>It seems it has simply been carried along from original Unix
 |>mail storage, and never has been properly adjusted thereafter.
 |>POSIX also standardized this loose format which was in use since
 |>that beginning.  RFC 4155 defined a more proper format.
 |
 |I think there are two things that are being conflated here: the
 |various "standards" of the mbox format, and nmh's use of it.
 |
 |Since nmh doesn't use mbox as a mail store, things like RFC 4155 aren't
 |really relevant; we don't deal with mbox files except for two specific
 |tools.  So our goal here is to deal with existing mail DROPS (I use
 |that term to specify a place where external tools store mail where nmh
 |can read it).

Well, using RFC 4155 parse rules for From_ lines can lead to
different message boundary detection.
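
To make that concrete, a small Python sketch comparing a loose separator rule with a stricter one. The strict pattern is only an approximation of RFC 4155's separator line ("From", a space, an addr-spec, a space, an asctime-style timestamp), and the sample drop is invented and deliberately unquoted.

```python
import re

# Loose rule: any line that begins with "From " starts a new message.
LOOSE = re.compile(rb"^From ")

# Stricter rule, approximating the RFC 4155 separator line.
STRICT = re.compile(
    rb"^From \S+@\S+ "
    rb"(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    rb"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    rb"[ 0-3]\d \d\d:\d\d:\d\d \d{4}$"
)

def count_messages(data: bytes, rule) -> int:
    """Count the lines in `data` that `rule` treats as message separators."""
    return sum(1 for line in data.splitlines() if rule.match(line))

# One real message whose body has an unquoted line starting with "From ".
drop = (
    b"From a@example.com Wed May 24 09:50:29 2017\n"
    b"Subject: one\n\n"
    b"From here on, things differ.\n"
)
loose_count = count_messages(drop, LOOSE)
strict_count = count_messages(drop, STRICT)
```

The loose rule sees two messages and the strict rule sees one, so two readers of the same file can disagree about where messages begin.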

 |Of course, the more I dig into it the more fun I find.  For example:
 |
 | https://en.wikipedia.org/wiki/Mbox
 |
 |Which suggests that SOME mbox formats do perform reversible
 |From-munging.  Urrrk.

I removed that from the MUA i maintain, because you never know for
sure: for that you would need to scan the entire message first,
check if several From_ lines occur and have been quoted alike,
before you start to remove what you think is a superficial From_
quote.  And then ezml i think it was (what unicode.org had before
the switch to i think mailman) simply placed a space in the first
column...  Any non truly-reversible change changes the original
content, applying a MIME content-encoding is standardized, there
you go.

 |This suggests to me that a "next-gen" maildrop parser should be prepared
 |to handle what Wikipedia calls "mboxo" and "mboxrd" format.  A web page
 |linked to on that page suggests that Content-Length variants
 |(mboxcl and mboxcl2) are more common on Linux, but I am skeptical that
 |is true.

mutt actively manages these header lines, at least.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Ken Hornstein
>It seems it has simply been carried along from original Unix
>mail storage, and never has been properly adjusted thereafter.
>POSIX also standardized this loose format which was in use since
>that beginning.  RFC 4155 defined a more proper format.

I think there are two things that are being conflated here: the
various "standards" of the mbox format, and nmh's use of it.

Since nmh doesn't use mbox as a mail store, things like RFC 4155 aren't
really relevant; we don't deal with mbox files except for two specific
tools.  So our goal here is to deal with existing mail DROPS (I use
that term to specify a place where external tools store mail where nmh
can read it).

Of course, the more I dig into it the more fun I find.  For example:

https://en.wikipedia.org/wiki/Mbox

Which suggests that SOME mbox formats do perform reversible
From-munging.  Urrrk.

This suggests to me that a "next-gen" maildrop parser should be prepared
to handle what Wikipedia calls "mboxo" and "mboxrd" format.  A web page
linked to on that page suggests that Content-Length variants
(mboxcl and mboxcl2) are more common on Linux, but I am skeptical that
is true.

I realize this wouldn't happen for likely a while, but would people be
happy with just mboxo and mboxrd support in terms of maildrop parsing?
And does anyone see Content-Length headers in their maildrops anymore?

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Ken Hornstein
>No, it always was in band - the 4-SOH sequence was searched for in all
>lines of the message, and SOH has always been a possible character in
>e-mail.   Just even more unlikely years ago than it is now.

You know, I _was_ going to disagree here but Robert is, as he almost
always is, 100% correct.  4-SOH is not valid in an email HEADER
(mostly), but it is certainly valid in a message BODY, and this goes all
the way back to RFC 822.  There were some minor changes along the way
(RFC 822 said NULs were valid, but RFC 2822 said they were not), but SOH
has always been a valid character in email bodies; MIME didn't change
this one bit.

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Steffen Nurpmeso
Robert Elz wrote:
  ...
 |If the mbox encoding format had been properly designed, rather than just
 |a "we need some way to fix the problem that a line starting 'From' in
 |a message acts like a separator" (which was a real issue/bug in early
 |implementations of it) this issue wouldn't arise.

It seems it has simply been carried along from original Unix
mail storage, and never has been properly adjusted thereafter.
POSIX also standardized this loose format which was in use since
that beginning.  RFC 4155 defined a more proper format.

In the end MBOX is just a database format, the MUA i maintain uses
a MIME content-encoding to ensure the assertion of this format is
not contradicted, but we do not yet re-encode messages like that
when copying over, for that we have to perform proper From_
quoting as necessary, then.  This is a huge pile of cr.p.  Still,
using RFC 4155 rules, for one, lowers the possibility of database
format clashes, though it may give surprises due to different detection
of message boundaries; our upcoming version will warn when opening
mailboxes where this could be a problem.

 |There was one attempt to use (essentially, though as I recall, not quite
 |implemented that way) out of band data for mail collection files - that
 |is, adding a header and length field (much like tar format, or cpio, etc,

Jamie Zawinski of Netscape+ on that[1] is pretty famous:

  [1] https://www.jwz.org/doc/content-length.html

 |but much simpler) but which was a spectacular failure, and hated much more
 |than '>From' (which is mostly an annoyance, though it does screw integrity

My MUA strips those because it doesn't manage them, unless the
variable *keep-content-length* is set.  I think i have to obsolete
this, and at some future time either silently drop or fully
support them.  But they are redundant if messages are
content-encoded, and in times of digital signatures etc. you need
to make sure that user input is identical to output anyway, so
proper content-encoding is the right thing to do.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Ralph Corderoy
Hi,

kre wrote:
> No, it always was in band - the 4-SOH sequence was searched for in all
> lines of the message, and SOH has always been a possible character in
> e-mail.   Just even more unlikely years ago than it is now.

I have, AIX, recollections of one of the four SOH sometimes being munged
to STX.  I thought it was ^A^A^A^B.  I looked at packf(1) to see what it
does.  This being nmh, there's good news and bad news.

*Assuming* this hand-edited MH mail file can be created by MH,

$ cat -A 1
Return-Path: <>$
Subject: packf Test.$
$
foo$
^A^A^A^A$
bar$
$

then packf munges the in-band SOH.

$ packf -mmdf -file packf.mmdf +. 1
Create file ".../packf.mmdf"? y
$ cat -A packf.mmdf
^A^A^A^A$
Return-Path: <>$
Subject: packf Test.$
$
foo$
  → ^B^A^A^A$
bar$
^A^A^A^A$
$

I suspect other programs munge in other ways giving the ^A^A^A^B I
recall.

That's the good news.  The bad news is it doesn't cope with it being
split across a read-buffer boundary.

$ uniq -c 1 | cat -A
  1 Return-Path: <>$
  1 Subject: packf Test.$
  1 $
113 pad pad pad pad pad pad pad pad pad pad pad pad pad pad pad pad pad pad$
  1 pad pad pad$
  1 foo$
  1 ^A^A^A^A$
  1 bar$
$
$ uip/packf -mmdf -file packf.mmdf +. 1
Create file ".../packf.mmdf"? y
$ uniq -c packf.mmdf | cat -A
  1 ^A^A^A^A$
  1 Return-Path: <>$
  1 Subject: packf Test.$
  1 $
113 pad pad pad pad pad pad pad pad pad pad pad pad pad pad pad pad pad pad$
  1 pad pad pad$
  1 foo$
  →   1 ^A^A^A^A$
  1 bar$
  1 ^A^A^A^A$
$
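
The two behaviours can be reproduced in a few lines of Python. This is a sketch, not packf's actual code: the munged form simply mimics the `^B^A^A^A` seen in the first transcript, and the buffer size is picked so that the delimiter straddles two reads.

```python
import io

DELIM = b"\x01\x01\x01\x01\n"     # MMDF delimiter line, ^A^A^A^A
MUNGED = b"\x02\x01\x01\x01\n"    # the ^B^A^A^A form seen in the transcript

def munge_by_chunk(src, bufsize):
    """Naive: search each read buffer on its own, so a delimiter that
    straddles two buffers is never seen."""
    out = []
    while chunk := src.read(bufsize):
        out.append(chunk.replace(DELIM, MUNGED))
    return b"".join(out)

def munge_by_line(src):
    """Line at a time: a delimiter line is always seen whole."""
    return b"".join(MUNGED if line == DELIM else line for line in src)

body = b"foo\n" + DELIM + b"bar\n"

by_line = munge_by_line(io.BytesIO(body))
big_buffer = munge_by_chunk(io.BytesIO(body), 64)   # delimiter inside one read
small_buffer = munge_by_chunk(io.BytesIO(body), 6)  # delimiter split over two reads
```

With the 6-byte buffer the body comes back unchanged, which is the failure shown in the second transcript; reading by line avoids it.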

packf's -mbox munging isn't all sweetness and light either.  If the
first line of the mail starts with `Return-Path:' or `X-Envelope-From:',
with exactly that case, then a From␣ header is built from its parts to
replace it.

Does anyone know about the `map' file that is built by packf?  The man
page briefly mentions it.

FILES
    .msgbox.map    A binary index of the file

And I spy it was created with binary indexes.

$ hd .packf.map
00000000  02 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  11 20 00 00 00 00 00 00  00 00 00 00 74 20 00 00  |. ..........t ..|
00000020  05 00 00 00 00 00 00 00  03 20 00 00 00 00 00 00  |......... ......|
00000030  00 00 00 00 05 00 00 00  08 20 00 00 00 00 00 00  |......... ......|
00000040  0c 20 00 00 00 00 00 00                           |. ......|
00000048
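
The 72 bytes of that dump decode cleanly as three 24-byte records. The Python sketch below assumes a layout inferred from the dump itself rather than from the nmh source: two 32-bit ints followed by two 64-bit offsets, little-endian. The field names follow the `struct drop` members d_id, d_size, d_start and d_stop.

```python
import struct

# Assumed record layout (inferred from the dump, not checked against nmh):
# int d_id; int d_size; 64-bit d_start; 64-bit d_stop; little-endian.
DROP = struct.Struct("<iiqq")

# The bytes of the hex dump above, one record per line.
MAP_BYTES = bytes.fromhex(
    "02000000 03000000 0000000000000000 1120000000000000"
    "00000000 74200000 0500000000000000 0320000000000000"
    "00000000 05000000 0820000000000000 0c20000000000000"
)

def read_map(data: bytes):
    """Return (header, records); each is a (d_id, d_size, d_start, d_stop) tuple."""
    if not data or len(data) % DROP.size:
        raise ValueError("map is empty or not a whole number of records")
    records = [DROP.unpack_from(data, off)
               for off in range(0, len(data), DROP.size)]
    return records[0], records[1:]

header, messages = read_map(MAP_BYTES)
# In the first record the fields are reused: message count, map version,
# last message read, and size of the packed file.
count, version, last_read, file_size = header
```

Under that assumption the first record reads as two messages, map version 3 and a file size of 0x2011 bytes, followed by one record per message.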

I don't think MH uses them so they're probably leftovers to feed to the
swine.

It occurs to me that if readable and reversible archives of MH's 1, 2, 3
files are wanted that GNU shar(1) with -T would be an option.

BTW, in digging for this I found Usenet posts bemoaning the slowdown
between MH3 and MH6;  I suspect that's the code we've been deleting ever
since.  :-)

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Robert Elz
Date: Wed, 24 May 2017 09:50:29 +0100
From: Jon Fairbairn
Message-ID:  


  | Back when I made the decision it was out of band.

No, it always was in band - the 4-SOH sequence was searched for in all
lines of the message, and SOH has always been a possible character in
e-mail.   Just even more unlikely years ago than it is now.

kre




Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-24 Thread Jon Fairbairn
Robert Elz writes:

> Date: Tue, 23 May 2017 09:35:52 +0100
> From: Jon Fairbairn
> Message-ID:  
>
>   | One of the first things I learnt was that
>   | using in-band data as a separator is a bad idea,
>
> Like most generalizations, that is sometimes true, but sometimes not.
>
>   | so mmdf was obviously a more sensible format than mbox.
>
> Which doesn't follow at all, as MMDF is also using in-band data as
> a separator, just a different one.

Back when I made the decision it was out of band.

> I never really used MMDF format, so I am not sure of its details, but
> as I understand it, the only real issue with mbox format is that the
> conversion isn't invertible, that is, in that format, a line that used
> to start "From" is stored as ">From" (which exact lines that happens to
> depends upon implementation to some extent, but that doesn't matter),
> whereas an input like that started ">From" is also stored as ">From".
> When it comes time to extract the message, there's no way to invert that
> addition of the '>' as there's no way to determine whether the '>' was
> one that was added for this purpose or not.

Exactly, but if SOH*4 is now in band, it looks like I now have
to redesign my archive scripts.

-- 
Jón Fairbairn jon.fairba...@cl.cam.ac.uk




Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-23 Thread Robert Elz
Date: Tue, 23 May 2017 20:23:54 -0400
From: Ken Hornstein
Message-ID:  <20170524002355.1f76b6b...@pb-smtp1.pobox.com>

  | I remember that; it was the Content-Length header, right?

That is as I remember it (for what that is worth) yes.

  | It occurs to me that since MH messages are simply files, you can use the
  | standard set of Unix tools such as "tar" to archive them,

That's what I do, tgz or tbz files work just fine...

kre




Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-23 Thread Ken Hornstein
>There was one attempt to use (essentially, though as I recall, not quite
>implemented that way) out of band data for mail collection files - that
>is, adding a header and length field (much like tar format, or cpio, etc,
>but much simpler) but which was a spectacular failure, and hated much more
>than '>From' (which is mostly an annoyance, though it does screw integrity
>checking).

I remember that; it was the Content-Length header, right?  From my dim
memory, it was really only a Solaris thing.  And whoo yeah, it was a
screaming failure.  From my memory it wasn't that people liked to edit
maildrops so much; the real problem was that a lot of software didn't
know about it and treated it like an mbox file (because it looked like
one) but things fell over when it encountered a stray "From".

It occurs to me that since MH messages are simply files, you can use the
standard set of Unix tools such as "tar" to archive them, and that will
work no matter what the message contents are because the metadata really
is stored out of band.
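
A short Python sketch of that round trip using the standard `tarfile` module. The folder contents are invented, and the archive is built in memory only to keep the example self-contained.

```python
import io
import tarfile

def archive_messages(messages: dict) -> bytes:
    """Pack {name: bytes} into an in-memory gzip'd tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, content in messages.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)   # the length lives in the tar header
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()

def extract_messages(blob: bytes) -> dict:
    """Read every member of the archive back into a {name: bytes} dict."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}

# Stand-ins for MH's 1, 2, 3 files, with content that upsets mbox and MMDF.
folder = {
    "inbox/1": b"Subject: a\n\nFrom here on\n>From there\n",
    "inbox/2": b"Subject: b\n\n\x01\x01\x01\x01\nbar\n",
}
round_trip = extract_messages(archive_messages(folder))
```

Because each member's length is recorded in its tar header, neither a line starting with "From " nor a ^A^A^A^A line in a body affects where one message ends and the next begins.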

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-23 Thread Robert Elz
Date: Tue, 23 May 2017 09:35:52 +0100
From: Jon Fairbairn
Message-ID:  

  | One of the first things I learnt was that
  | using in-band data as a separator is a bad idea,

Like most generalizations, that is sometimes true, but sometimes not.

  | so mmdf was obviously a more sensible format than mbox.

Which doesn't follow at all, as MMDF is also using in-band data as
a separator, just a different one.

I never really used MMDF format, so I am not sure of its details, but
as I understand it, the only real issue with mbox format is that the
conversion isn't invertible, that is, in that format, a line that used
to start "From" is stored as ">From" (which exact lines that happens to
depends upon implementation to some extent, but that doesn't matter),
whereas an input like that started ">From" is also stored as ">From".
When it comes time to extract the message, there's no way to invert that
addition of the '>' as there's no way to determine whether the '>' was
one that was added for this purpose or not.
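
The point can be shown in a few lines of Python. The sketch below uses the "mboxo" and "mboxrd" names from the Wikipedia page cited elsewhere in this thread; the function names are illustrative only.

```python
import re

def mboxo_quote(body: str) -> str:
    """mboxo: prefix '>' only to lines that start with 'From '."""
    return re.sub(r"^From ", ">From ", body, flags=re.M)

def mboxo_unquote(body: str) -> str:
    """Best-effort inverse: strip '>' from every '>From ' line (may be wrong)."""
    return re.sub(r"^>From ", "From ", body, flags=re.M)

def mboxrd_quote(body: str) -> str:
    """mboxrd: prefix '>' to lines matching />*From /, so quoting nests."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.M)

def mboxrd_unquote(body: str) -> str:
    """Exact inverse: remove one '>' from lines matching />+From /."""
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.M)

body = "From the top\n>From an earlier quote\n"

stored_o = mboxo_quote(body)      # both lines now start with '>From '
back_o = mboxo_unquote(stored_o)  # the original '>' is lost as well

stored_rd = mboxrd_quote(body)    # '>From ' and '>>From '
back_rd = mboxrd_unquote(stored_rd)
```

After the mboxo round trip the second line has lost a `>` it always had, while the mboxrd round trip returns the body unchanged.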

If the mbox encoding format had been properly designed, rather than just
a "we need some way to fix the problem that a line starting 'From' in
a message acts like a separator" (which was a real issue/bug in early
implementations of it) this issue wouldn't arise.

There was one attempt to use (essentially, though as I recall, not quite
implemented that way) out of band data for mail collection files - that
is, adding a header and length field (much like tar format, or cpio, etc,
but much simpler) but which was a spectacular failure, and hated much more
than '>From' (which is mostly an annoyance, though it does screw integrity
checking).  It failed as the messages were still text files, and people
like to edit text files - an in-band separator mechanism isn't affected
by that, but a length field of some kind, which is what an out of band
mechanism requires usually, is, and mangled e-mail archives were far too
frequent with that mechanism.
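
A Python sketch of that failure mode. The drop layout here is deliberately simplified (a From_ line, a Content-Length header, a blank line, then the body) and is not the exact mboxcl format; `pack` and `split_by_length` are hypothetical helpers.

```python
def pack(body: bytes) -> bytes:
    """Write one message with a Content-Length header (simplified layout)."""
    return (b"From a@example.com Wed May 24 09:50:29 2017\n"
            b"Content-Length: %d\n\n" % len(body)) + body

def split_by_length(data: bytes) -> list:
    """Return message bodies, trusting Content-Length and never scanning
    the body for 'From '."""
    bodies, pos = [], 0
    while pos < len(data):
        head_end = data.index(b"\n\n", pos) + 2
        length = None
        for line in data[pos:head_end].splitlines():
            if line.lower().startswith(b"content-length:"):
                length = int(line.split(b":", 1)[1].strip())
        if length is None:
            raise ValueError("message without Content-Length")
        bodies.append(data[head_end:head_end + length])
        pos = head_end + length
    return bodies

drop = pack(b"From here on\n") + pack(b"second\n")
intact = split_by_length(drop)

# Someone shortens the first body in a text editor and leaves the header alone.
edited = drop.replace(b"From here on\n", b"From here\n")
mangled = split_by_length(edited)
```

Before the edit, the body line starting with "From " is harmless. After it, the stale length makes the first body swallow the start of the next message's From_ line, and nothing in the file signals that anything went wrong.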

kre




Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-23 Thread Ken Hornstein
>> I suppose that's a reason, but it just seems like mbox has been the
>> standard for approximately forever and MMDF is one of those weird relics
>> like UUCP that I only hear about once in a million years.
>
>Which is a shame. One of the first things I learnt was that
>using in-band data as a separator is a bad idea, so mmdf was
>obviously a more sensible format than mbox. The last time I saw
>a “>From” that should have been a “From” in a mail body was much
>more recent than it should have been.

It occurs to me that given the advent of MIME, \n^A^A^A^A\n is, as far as
I can tell, valid content for a 7bit or 8bit message part.  I'll admit that
it is unlikely (certainly far less likely than a \nFrom), but I don't think
it's as out-of-band as you think it is.

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-23 Thread Ken Hornstein
>I used to come across it on sendmail-using AIX quite a lot IIRC.

AIX?  You are SO not selling me on this :-)

>> I had vague plans on writing a mail message parser using lex/yacc, and
>> I was NOT planning on putting MMDF support in it.
>
>Wouldn't it just affect the top level?  What's wrapped around an RFC
>5322 email is either nothing, /^From / and /^$/, or MMDF's "SOH×4\n"
>twice?

The problem is if we're getting rid of m_getfld() (my eventual goal),
the maildrop parsing happens in there.  It's not code I really want to
write.

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-23 Thread Ralph Corderoy
Hi Ken,

> > Lack of From munging?
>
> I suppose that's a reason, but it just seems like mbox has been the
> standard for approximately forever and MMDF is one of those weird
> relics like UUCP that I only hear about once in a million years.

I used to come across it on sendmail-using AIX quite a lot IIRC.

> tin seems like it's a Usenet reader?  Am I wrong?

Correct.  It can save Usenet articles to mail spool files, including in
MMDF format.

> I had vague plans on writing a mail message parser using lex/yacc, and
> I was NOT planning on putting MMDF support in it.

Wouldn't it just affect the top level?  What's wrapped around an RFC
5322 email is either nothing, /^From / and /^$/, or MMDF's "SOH×4\n"
twice?
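
As a rough illustration of how thin that top level is, a Python sketch that classifies the wrapper from the first line and strips the MMDF one. The function names are hypothetical, and nothing here handles an in-band SOH×4 line inside a body.

```python
SOH4 = b"\x01\x01\x01\x01\n"   # MMDF delimiter line

def wrapper_kind(data: bytes) -> str:
    """Guess the wrapper from the first line: 'mmdf', 'mbox' or 'bare'."""
    if data.startswith(SOH4):
        return "mmdf"
    if data.startswith(b"From "):
        return "mbox"
    return "bare"

def split_mmdf(data: bytes) -> list:
    """Return the messages that sit between pairs of SOHx4 lines."""
    messages, current = [], None
    for line in data.splitlines(keepends=True):
        if line == SOH4:
            if current is None:
                current = []                        # opening delimiter
            else:
                messages.append(b"".join(current))  # closing delimiter
                current = None
        elif current is not None:
            current.append(line)
    return messages

drop = (SOH4 + b"Subject: one\n\nfoo\n" + SOH4 +
        SOH4 + b"Subject: two\n\nbar\n" + SOH4)
kind = wrapper_kind(drop)
parts = split_mmdf(drop)
```

Everything below this layer is the same message parser in all three cases.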

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-23 Thread Jon Fairbairn
Ken Hornstein writes:

>>Lack of From munging?

That was indeed the reason that I chose it. The thing about
archives is that they are supposed to hang around for a long
time. I’ve been using some form of mh since the 1980s, and the
script that archives my mailboxes hasn’t changed much since
then.

> I suppose that's a reason, but it just seems like mbox has been the
> standard for approximately forever and MMDF is one of those weird relics
> like UUCP that I only hear about once in a million years.

Which is a shame. One of the first things I learnt was that
using in-band data as a separator is a bad idea, so mmdf was
obviously a more sensible format than mbox. The last time I saw
a “>From” that should have been a “From” in a mail body was much
more recent than it should have been.
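
A small sketch of why the in-band separator loses information.  It assumes
the usual descriptions of the two quoting conventions: mboxo prefixes ">"
only to body lines starting with "From ", while mboxrd also prefixes lines
that already look like ">From ".  The function names are hypothetical and
this is not the code in uip/dropsbr.c.

```python
import re


def mboxo_quote(body: str) -> str:
    """mboxo: prefix '>' to body lines that start with 'From '."""
    return re.sub(r"^From ", ">From ", body, flags=re.M)


def mboxo_unquote(body: str) -> str:
    """Naive inverse: strip one '>' from lines starting with '>From '."""
    return re.sub(r"^>From ", "From ", body, flags=re.M)


def mboxrd_quote(body: str) -> str:
    """mboxrd: prefix '>' to lines matching /^>*From /."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.M)


def mboxrd_unquote(body: str) -> str:
    """Exact inverse: strip one '>' from lines matching /^>+From /."""
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.M)


body = "From the start it was fine.\n>From a quoted reply.\n"

# mboxo cannot tell which '>From ' lines it created itself.
print(mboxo_unquote(mboxo_quote(body)) == body)
# mboxrd round-trips because every candidate line gains exactly one '>'.
print(mboxrd_unquote(mboxrd_quote(body)) == body)
```

With MMDF-style delimiters neither pair of functions is needed, since the
separator cannot be confused with a body line.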

>>> I think at this point MH/nmh is probably the only tool left that can
>>> deal with such things.

Well, it’s good that it still does.  As I said, archives.

-- 
Jón Fairbairn jon.fairba...@cl.cam.ac.uk
http://www.chaos.org.uk/~jf/Stuff-I-dont-want.html  (updated 2014-04-05)




Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-22 Thread Ken Hornstein
>Lack of From munging?

I suppose that's a reason, but it just seems like mbox has been the
standard for approximately forever and MMDF is one of those weird relics
like UUCP that I only hear about once in a million years.

>> I think at this point MH/nmh is probably the only tool left that can
>> deal with such things.
>
>Python's mailbox in the standard library can read them.  Unfortunately,
>a GSoC in 2005 rewrote mailbox.py, breaking its support by sticking a
>From␣ header before the header section.  I've just opened
>http://bugs.python.org/issue30428

So it's been broken for ... 12 years and no one cared until now?  You're
not exactly selling me on this :-)

>Python's documentation refers to
>http://www.tin.org/bin/man.cgi?section=5=mmdf so I guess tin(1)
>can.

tin seems like it's a Usenet reader?  Am I wrong?

The reason I bring this up is I had vague plans on writing a mail
message parser using lex/yacc, and I was NOT planning on putting MMDF
support in it.  Well, that would only be for parsing a maildrop; obviously
it wouldn't matter for parsing individual messages.

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-22 Thread Ralph Corderoy
Hi Ken,

> > I use nmh and keep mmdf archives (produced by pack -mmdf). 
>
> I have to ask ... why?

Lack of From munging?

> I think at this point MH/nmh is probably the only tool left that can
> deal with such things.

Python's mailbox in the standard library can read them.  Unfortunately,
a GSoC in 2005 rewrote mailbox.py, breaking its support by sticking a
From␣ header before the header section.  I've just opened
http://bugs.python.org/issue30428

Python's documentation refers to
http://www.tin.org/bin/man.cgi?section=5=mmdf so I guess tin(1)
can.
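
For anyone who wants to look for themselves, here is a minimal sketch using
the standard library's mailbox.MMDF class.  It writes one message to a
temporary file, prints the first two raw lines so that whatever follows the
opening delimiter can be inspected, and reads the message back.  The file
name and the mmdf_round_trip() helper are made up for illustration, and
what the second raw line contains may depend on the Python version.

```python
import mailbox
import os
import tempfile

RAW_MESSAGE = b"Subject: test\n\nHello from an MMDF archive.\n"


def mmdf_round_trip(raw_message: bytes):
    """Write one message to a fresh MMDF file, then read it back.

    Returns (raw file bytes, list of Subject headers read back).
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "archive.mmdf")  # hypothetical file name
        box = mailbox.MMDF(path)  # creates the file if it is missing
        box.add(raw_message)
        box.close()  # flushes pending writes to disk
        with open(path, "rb") as f:
            raw_file = f.read()
        box = mailbox.MMDF(path)
        subjects = [msg["Subject"] for msg in box]
        box.close()
    return raw_file, subjects


raw_file, subjects = mmdf_round_trip(RAW_MESSAGE)
print(subjects)
# The first line is the delimiter; the second is whatever mailbox.py
# chose to write directly after it.
print(raw_file.splitlines()[:2])
```

A file written and read by the same module round-trips either way, so this
only shows what the module emits, not how it copes with an archive produced
by packf -mmdf.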

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-22 Thread Ken Hornstein
>I use nmh and keep mmdf archives (produced by pack -mmdf). 

I have to ask ... why?  I think at this point MH/nmh is probably the only
tool left that can deal with such things.

--Ken



Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-22 Thread Jon Fairbairn
Ken Hornstein  writes:

>>mts.conf(5) says these may be altered from their default.
>>
>>mmdelim1: \001\001\001\001\n
>>  The beginning-of-message delimiter for mail drops.
>>
>>mmdelim2: \001\001\001\001\n
>>  The end-of-message delimiter for mail drops.
>>
>>This doesn't seem useful, if it ever was.  If someone has an oddly
>>formatted bunch of emails in a file then they can csplit(1) or similar
>>before feeding the results to nmh.
>>
>>May it be removed from git now, or must it wait until 1.7 is released?
>
> I say get rid of it now; that's support for MMDF, right?  Maybe there
> are people out there still using MMDF, but I doubt they are still using
> nmh.

I use nmh and keep mmdf archives (produced by pack -mmdf). 

-- 
Jón Fairbairn jon.fairba...@cl.cam.ac.uk




Re: [Nmh-workers] Request Deprecation of mts.conf's mmdelim1 and mmdelim2.

2017-05-21 Thread Ken Hornstein
>mts.conf(5) says these may be altered from their default.
>
>mmdelim1: \001\001\001\001\n
>   The beginning-of-message delimiter for mail drops.
>
>mmdelim2: \001\001\001\001\n
>   The end-of-message delimiter for mail drops.
>
>This doesn't seem useful, if it ever was.  If someone has an oddly
>formatted bunch of emails in a file then they can csplit(1) or similar
>before feeding the results to nmh.
>
>May it be removed from git now, or must it wait until 1.7 is released?

I say get rid of it now; that's support for MMDF, right?  Maybe there
are people out there still using MMDF, but I doubt they are still using
nmh.

--Ken
