Bug#507465: mb2md often misses message boundaries in mbox files

2008-12-07 Thread Cyril Brulebois
severity 507465 wishlist
retitle 507465 mb2md: Please provide an option to tweak mail separation detection.
thanks

Putting back the submitter in the loop, please Cc submitters, or use the
[EMAIL PROTECTED] alias…

Bruno De Fraine [EMAIL PROTECTED] (03/12/2008):
> > This script intentionally looks for a blank line in between messages
> > in the mbox file.  There is no such requirement that I know of;
>
> There *is* certainly mention of a blank line in the first few
> references that turn up when looking for an mbox file specification:
>
> http://www.qmail.org/man/man5/mbox.html
> http://en.wikipedia.org/wiki/Mbox

Err, quoting qmail as a reference looks bogus to me… Anyway, looking at
e.g. RFC 4155 (which points to qmail's site too, sigh, with typo, sigh
again), we have:

,--[ Appendix A. The default mbox Database Format ]--
|The default mbox database format uses a linear sequence of Internet
|messages, with each message being immediately prefaced by a separator
|line, and being terminated by an empty line.  More specifically:
| 
|…
| 
|   o Each message in the database MUST be terminated by an empty
|     line, containing a single end-of-line marker.
`--

So you're right.

> > As is, I got 30% fewer messages in the new Maildir, and lots of
> > messages were actually two or three messages run together.  The
> > result is so garbled I wonder if anyone else has ever used this
> > script...

Requesting an option to make the blank line between two successive mails
optional makes sense to me anyway; adjusting bug severity and title
accordingly.
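
To illustrate what such an option would change, here is a minimal Python
sketch. It is not mb2md's Perl code; `split_mbox` and its `require_blank`
parameter are hypothetical names chosen for this example, and the sample
mailbox is made up.

```python
from typing import Iterable, List


def split_mbox(lines: Iterable[str], require_blank: bool = True) -> List[List[str]]:
    """Split mbox lines into messages on "From " separator lines.

    With require_blank=True, a "From " line starts a new message only at
    the start of the file or directly after an empty line. With
    require_blank=False, every "From " line is treated as a separator.
    """
    messages: List[List[str]] = []
    previous_line_was_empty = True  # the start of the file counts as a boundary
    for line in lines:
        line = line.replace("\r\n", "\n")  # normalise CRLF line endings
        is_separator = line.startswith("From ") and (
            previous_line_was_empty or not require_blank
        )
        if is_separator or not messages:
            messages.append([])
        messages[-1].append(line)
        previous_line_was_empty = line == "\n"
    return messages


# Hypothetical two-message mailbox with no blank line before the second "From ".
sample = [
    "From alice Mon Dec  1 11:00:00 2008\n",
    "Subject: one\n",
    "\n",
    "first body\n",
    "From bob Mon Dec  1 11:05:00 2008\n",
    "Subject: two\n",
    "\n",
    "second body\n",
]

print(len(split_mbox(sample, require_blank=True)))   # 1: messages run together
print(len(split_mbox(sample, require_blank=False)))  # 2
```

The lenient mode has a cost: a body line that happens to start with
"From " and was not quoted by the writing program would be taken for a
separator, which is presumably why the strict check exists.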

Mraw,
KiBi.




Bug#507465: mb2md often misses message boundaries in mbox files

2008-12-07 Thread Grant Taylor

Bruno De Fraine [EMAIL PROTECTED] writes:

> mb2md has worked perfectly well for me with mbox files coming from
> Procmail and Dovecot. Are you certain your mailboxes are not
> ill-formatted to begin with?


Yes, they do seem to be badly formed.  My dud mail files were from
Thunderbird/Icedove.  Evidently its local mail storage uses ad-hoc,
mbox-like files that are not valid mbox.  Sigh...
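
For anyone wanting to check a mailbox before converting it, here is a rough
Python sketch of such a test. It is not part of mb2md; `count_separators`
is a hypothetical helper and the sample lines are made up. It is only a
heuristic, since an unquoted body line starting with "From " is counted too.

```python
from typing import Iterable, Tuple


def count_separators(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total, unpreceded) for lines starting with "From ".

    total counts every such line; unpreceded counts those that directly
    follow a non-empty line, i.e. lack the blank line a strict splitter
    expects.
    """
    total = 0
    unpreceded = 0
    previous_line_was_empty = True  # the start of the file counts as a boundary
    for line in lines:
        line = line.replace("\r\n", "\n")  # normalise CRLF line endings
        if line.startswith("From "):
            total += 1
            if not previous_line_was_empty:
                unpreceded += 1
        previous_line_was_empty = line == "\n"
    return total, unpreceded


# Hypothetical mailbox: the second message lacks the preceding blank line.
mailbox = [
    "From carol Mon Dec  1 09:00:00 2008\n",
    "Subject: a\n",
    "\n",
    "body a\n",
    "From dave Mon Dec  1 09:30:00 2008\n",
    "Subject: b\n",
    "\n",
    "body b\n",
    "\n",
]

total, unpreceded = count_separators(mailbox)
print(total, unpreceded)  # 2 1
```

A non-zero second number suggests the file would lose message boundaries
under the blank-line check.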


Anyway, thanks, all, for the investigation.

--
Grant Taylor
http://www.picante.com/



--
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#507465: mb2md often misses message boundaries in mbox files

2008-12-03 Thread Bruno De Fraine

Hello,


> This script intentionally looks for a blank line in between messages
> in the mbox file.  There is no such requirement that I know of;


There *is* certainly mention of a blank line in the first few  
references that turn up when looking for an mbox file specification:


http://www.qmail.org/man/man5/mbox.html
http://en.wikipedia.org/wiki/Mbox

> As is, I got 30% fewer messages in the new Maildir, and lots of
> messages were actually two or three messages run together.  The result
> is so garbled I wonder if anyone else has ever used this script...


mb2md has worked perfectly well for me with mbox files coming from
Procmail and Dovecot. Are you certain your mailboxes are not
ill-formatted to begin with?


Bye,
Bruno






Bug#507465: mb2md often misses message boundaries in mbox files

2008-12-01 Thread Grant Taylor
Package: mb2md
Version: 3.20-2
Severity: grave
Justification: renders package unusable

This script intentionally looks for a blank line in between messages
in the mbox file.  There is no such requirement that I know of;
certainly all the mbox data I have has tons of messages with no blank
line before the '^From ' line.

As is, I got 30% fewer messages in the new Maildir, and lots of
messages were actually two or three messages run together.  The result
is so garbled I wonder if anyone else has ever used this script...

This tweak seems to have worked for me:

--- /usr/bin/mb2md  2005-07-04 16:38:47.0 -0400
+++ mb2md   2008-12-01 11:18:42.080177429 -0500
@@ -979,8 +979,6 @@
     # The subject of the message
     my $subject = '';
 
-    my $previous_line_was_empty = 1;
-
     # We record the message start line here, for error
     # reporting.
     my $startline;
@@ -1003,7 +1001,6 @@
         $_ =~ s/\r\n$/\n/;
 
         if ( /^From /
-            && $previous_line_was_empty
             && (!defined $contentlength)
            )
         {
@@ -1419,8 +1416,6 @@
             # End of the if statement dealing with message body.
         }
 
-        $previous_line_was_empty = ( $_ eq "\n" );
-
     # End of while (MBOX) loop.
     }
     # Close the input file.


-- System Information:
Debian Release: 4.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.25.10.habanero2
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages mb2md depends on:
ii  libtimedate-perl         1.1600-5        Time and date functions for Perl
ii  perl [perl5]             5.8.8-7etch3    Larry Wall's Practical Extraction 

mb2md recommends no packages.

-- no debconf information


