Package: mhonarc
Version: 2.6.19-2
Severity: normal
Tags: patch upstream

Dear Maintainer,

Consider the attached UUCP-style mbox, where for each message the
byte-length of its body is indicated with a Content-Length: header. The
‘-conlen’ [0] flag is meant to make MHonArc read the correct body length
and remove the need for unescaping lines starting with “From ”. This
usually works well [1], but the Content-Length: headers of excluded
messages are ignored, so bogus archives are generated when the body of an
excluded message contains a line matching the mbox separator.
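
For readers without the attachment, here is a minimal Python sketch of how such an mbox entry can be produced. It is not the attached test.mbox: the helper name mbox_message, the sender, the fixed separator date and the header values are all made up for illustration. The only point is that Content-Length: counts the body bytes, so a body line starting with "From " can stay unescaped.

```python
def mbox_message(sender, headers, body):
    """Serialize one mbox entry whose Content-Length: counts the body bytes.

    Hypothetical helper for illustration; not part of MHonArc.
    """
    # Separator line; the date is an arbitrary placeholder.
    lines = [b"From " + sender + b" Thu Jan  1 00:00:00 1970"]
    lines += [name + b": " + value for name, value in headers]
    lines.append(b"Content-Length: " + str(len(body)).encode("ascii"))
    # A blank line ends the header; a trailing blank line separates entries.
    return b"\n".join(lines) + b"\n\n" + body + b"\n"


entry = mbox_message(
    b"alice",
    [(b"Subject", b"bar"), (b"X-No-Archive", b"yes")],
    b"hello\nFrom the body\n",  # unescaped "From " line, 20 bytes in total
)
```

A reader that trusts the Content-Length: value skips exactly 20 bytes after the blank line and never looks at the "From the body" line.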
I'd expect `mhonarc -conlen -expireage 86400 -checknoarchive` to create
an archive containing only the first and last messages: the 2nd has a
Message-Id collision with the 1st; the 3rd has ‘X-No-Archive’ set, and
the 4th is too old. However:
$ mhonarc -conlen -expireage 86400 -checknoarchive -outdir /tmp/out - </tmp/test.mbox
This is MHonArc v2.6.19+, Perl 5.030003 linux
Converting messages to /tmp/out
Reading - ...
Warning: Could not parse date for message
Message-Id:
<[email protected]>
Date:
.....
Writing mail ...
Writing /tmp/out/maillist.html ...
Writing /tmp/out/threads.html ...
Writing database ...
3 new messages
3 total messages
$ grep X-Subject: /tmp/out/msg*.html
/tmp/out/msg00000.html:<!--X-Subject: foo -->
/tmp/out/msg00001.html:<!--X-Subject: -->
/tmp/out/msg00002.html:<!--X-Subject: baz -->
AFAIK this is because read_mail_header() doesn't return the headers for
excluded messages, so read_mail_body() has no Content-Length value telling
it how many bytes to skip, and thus chokes on the “From ” in the message
body. After applying the attached patch the bogus message is no longer
generated and the archive is what one would expect:
$ mhonarc -conlen -expireage 86400 -checknoarchive -outdir /tmp/out - </tmp/test.mbox
This is MHonArc v2.6.19+, Perl 5.030003 linux
Converting messages to /tmp/out
Reading - .....
Writing mail ..
Writing /tmp/out/maillist.html ...
Writing /tmp/out/threads.html ...
Writing database ...
2 new messages
2 total messages
$ grep X-Subject: /tmp/out/msg*.html
/tmp/out/msg00000.html:<!--X-Subject: foo -->
/tmp/out/msg00001.html:<!--X-Subject: baz -->
(‘-reconvert’ is a workaround for the Message-Id collision; however, it's
not always ideal on open lists, as it lets an attacker DoS earlier
messages to a list by reusing their Message-Ids.)
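
To illustrate the failure mode described above, here is a small self-contained Python model of the reader loop. It is only a sketch of my understanding, not MHonArc's Perl code: the function names, the sample mbox and the "fixed" switch are invented for the example, and exclusion is reduced to the X-No-Archive check.

```python
def next_line(data, pos):
    """Return the offset just past the line that starts at pos."""
    eol = data.find(b"\n", pos)
    return len(data) if eol == -1 else eol + 1


def archived_subjects(data, fixed):
    """Return the Subject of every message that would be archived.

    fixed=False models the reported behaviour (header fields of excluded
    messages never reach the body reader); fixed=True models the patch.
    """
    subjects = []
    pos = 0
    while pos < len(data):
        if not data.startswith(b"From ", pos):
            raise ValueError("expected mbox separator at byte %d" % pos)
        pos = next_line(data, pos)
        fields = {}
        while pos < len(data):  # read header lines up to the blank line
            end = next_line(data, pos)
            line = data[pos:end].rstrip(b"\n")
            pos = end
            if not line:
                break
            name, _, value = line.partition(b":")
            fields[name.strip().lower()] = value.strip()
        excluded = fields.get(b"x-no-archive", b"").lower() == b"yes"
        visible = fields if (fixed or not excluded) else {}
        if b"content-length" in visible:
            # Skip the body by its declared byte length.
            pos = min(pos + int(visible[b"content-length"]), len(data))
        else:
            # No length known: scan for the next line starting with "From ".
            while pos < len(data) and not data.startswith(b"From ", pos):
                pos = next_line(data, pos)
        if not excluded:
            subjects.append(fields.get(b"subject", b""))
        while data.startswith(b"\n", pos):  # blank lines between messages
            pos += 1
    return subjects


SAMPLE = (
    b"From alice Thu Jan  1 00:00:00 1970\n"
    b"Subject: bar\n"
    b"X-No-Archive: yes\n"
    b"Content-Length: 26\n"
    b"\n"
    b"hello\nFrom the body\n\nmore\n"
    b"\n"
    b"From bob Thu Jan  1 00:00:00 1970\n"
    b"Subject: baz\n"
    b"Content-Length: 4\n"
    b"\n"
    b"bye\n"
)

print(archived_subjects(SAMPLE, fixed=False))  # [b'', b'baz']
print(archived_subjects(SAMPLE, fixed=True))   # [b'baz']
```

In the unfixed mode the excluded message's body is scanned line by line, so "From the body" is taken for a separator and a bogus message with an empty Subject is archived, much like msg00001.html in the first run above.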
Cheers,
--
Guilhem.
[0] https://www.mhonarc.org/MHonArc/doc/resources/conlen.html
[1] But see also https://bugs.debian.org/970209
test.mbox
Description: application/mbox
--- mhonarc-2.6.19/lib/mhamain.pl
+++ mhonarc-2.6.19/lib/mhamain.pl
@@ -788,14 +788,14 @@
               grep { /no-external-archive/i } @{$fields->{'restrict'}}) ||
              (defined($fields->{'x-no-archive'}) &&
               grep { /yes/i } @{$fields->{'x-no-archive'}})) ) {
-        return undef;
+        return (undef, $fields);
     }
 
     ##----------------------------------##
     ## Check for user-defined exclusion ##
     ##----------------------------------##
     if ($MsgExcFilter) {
-        return undef if mhonarc::message_exclude($header);
+        return (undef, $fields) if mhonarc::message_exclude($header);
     }
 
     ##------------##
@@ -833,7 +833,7 @@
             delmsg($index);
             $index = undef;
         } else {
-            return undef;
+            return (undef, $fields);
         }
     }
 
@@ -879,7 +879,7 @@
 
     ## Return if message too old to add (note, $index just contains time).
     if (&expired_time($index)) {
-        return undef;
+        return (undef, $fields);
     }
 
     ##-------------##
@@ -950,7 +950,7 @@
 
     ## Invoke callback if defined
     if (defined($CBMessageHeadRead) && defined(&$CBMessageHeadRead)) {
-        return undef unless &$CBMessageHeadRead($fields, $header);
+        return (undef, $fields) unless &$CBMessageHeadRead($fields, $header);
     }
 
     $Time{$index} = $t;

