Bug#507465: mb2md often misses message boundaries in mbox files
severity 507465 wishlist
retitle 507465 mb2md: Please provide an option to tweak mail separation detection.
thanks

Putting back the submitter in the loop, please Cc submitters, or use the
[EMAIL PROTECTED] alias…

Bruno De Fraine [EMAIL PROTECTED] (03/12/2008):
> > This script intentionally looks for a blank line in between
> > messages in the mbox file. There is no such requirement that I
> > know of;
>
> There *is* certainly mention of a blank line in the first few
> references that turn up when looking for an mbox file specification:
> http://www.qmail.org/man/man5/mbox.html
> http://en.wikipedia.org/wiki/Mbox

Err, quoting qmail as a reference looks bogus to me… Anyway, looking
at e.g. RFC 4155 (which points to qmail's site too, sigh, with typo,
sigh again), we have:
,--[ Appendix A. The default mbox Database Format ]--
|The default mbox database format uses a linear sequence of Internet
|messages, with each message being immediately prefaced by a separator
|line, and being terminated by an empty line. More specifically:
|
|…
|
|   o Each message in the database MUST be terminated by an empty
|     line, containing a single end-of-line marker.
`--

So you're right.

> > As is, I got 30% fewer messages in the new Maildir, and lots of
> > messages were actually two or three messages run together. The
> > result is so garbled I wonder if anyone else has ever used this
> > script...

Requesting an option to make the newline between two successive mails
optional makes sense to me anyway, so I am adjusting the bug severity
and title accordingly.

Mraw,
KiBi.
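To make the requested option concrete, here is a minimal sketch in Python (mb2md itself is Perl, and this is not its code). The function name `split_mbox` and the flag `require_blank` are made up for illustration. The sketch ignores the Content-Length handling that mb2md also does, and only shows how a strict mode (RFC 4155 Appendix A) and a lenient mode would differ.

```python
def split_mbox(text, require_blank=True):
    """Split mbox text into messages on "From " separator lines.

    require_blank=True:  a "From " line starts a new message only at the
                         start of the file or after an empty line.
    require_blank=False: every line starting with "From " is a separator.
    Both the function and the flag are hypothetical, not part of mb2md.
    """
    messages = []
    current = None
    previous_line_was_empty = True  # start of file counts as a boundary
    for line in text.splitlines(keepends=True):
        is_separator = line.startswith("From ") and (
            previous_line_was_empty or not require_blank
        )
        if is_separator:
            if current is not None:
                messages.append("".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
        previous_line_was_empty = line in ("\n", "\r\n")
    if current is not None:
        messages.append("".join(current))
    return messages


# Two messages, with no empty line before the second "From " line.
sample = (
    "From a@example.org Mon Dec  1 11:00:00 2008\n"
    "Subject: one\n"
    "\n"
    "body one\n"
    "From b@example.org Mon Dec  1 11:05:00 2008\n"
    "Subject: two\n"
    "\n"
    "body two\n"
    "\n"
)
strict = split_mbox(sample, require_blank=True)
lenient = split_mbox(sample, require_blank=False)
print(len(strict), len(lenient))
```

The lenient mode has a cost: any body line that starts with "From " and was not escaped by the writer will be taken for a message boundary, which is presumably why the strict check exists in the first place.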
Bug#507465: mb2md often misses message boundaries in mbox files
Bruno De Fraine [EMAIL PROTECTED] writes:
> mb2md has worked perfectly well for me with mbox files coming from
> Procmail and Dovecot. Are you certain your mailboxes are not
> ill-formatted to begin with?

Yes, they do seem to be badly formed. My dud mail files were from
thunderbird/icedove. Evidently its local mail storage is ad-hoc
mbox-like non-mbox files. Sigh...

Anyway, thanks, all, for the investigation.

--
Grant Taylor
http://www.picante.com/
Bug#507465: mb2md often misses message boundaries in mbox files
Hello,

> This script intentionally looks for a blank line in between messages
> in the mbox file. There is no such requirement that I know of;

There *is* certainly mention of a blank line in the first few
references that turn up when looking for an mbox file specification:
http://www.qmail.org/man/man5/mbox.html
http://en.wikipedia.org/wiki/Mbox

> As is, I got 30% fewer messages in the new Maildir, and lots of
> messages were actually two or three messages run together. The
> result is so garbled I wonder if anyone else has ever used this
> script...

mb2md has worked perfectly well for me with mbox files coming from
Procmail and Dovecot. Are you certain your mailboxes are not
ill-formatted to begin with?

Bye,
Bruno
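One way to answer the "ill-formatted to begin with?" question before converting is to count how many "From " lines in a mailbox are not preceded by an empty line. The following is a rough Python sketch under that assumption. The function name `count_separators` is made up, and the check cannot tell a real separator from an unescaped body line that happens to start with "From ", so treat the result as a hint only.

```python
def count_separators(lines):
    """Return (total, missing_blank) for an iterable of text lines.

    total:         number of lines starting with "From "
    missing_blank: how many of those are not preceded by an empty line
                   (the first line of the file is never counted as missing)
    """
    total = 0
    missing_blank = 0
    previous_line_was_empty = True  # start of file counts as a boundary
    for line in lines:
        if line.startswith("From "):
            total += 1
            if not previous_line_was_empty:
                missing_blank += 1
        previous_line_was_empty = line.rstrip("\r\n") == ""
    return total, missing_blank


example = [
    "From a@example.org Mon Dec  1 11:00:00 2008\n",
    "Subject: one\n",
    "\n",
    "body one\n",
    "From b@example.org Mon Dec  1 11:05:00 2008\n",
    "Subject: two\n",
    "\n",
    "body two\n",
    "\n",
]
result = count_separators(example)
print(result)
```

For a real mailbox the same function can be fed an open file, for example `with open(path, encoding="latin-1") as fh: print(count_separators(fh))`, where `path` is whatever mbox file is being checked. A large `missing_blank` count suggests the file will lose messages under the strict blank-line check.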
Bug#507465: mb2md often misses message boundaries in mbox files
Package: mb2md
Version: 3.20-2
Severity: grave
Justification: renders package unusable

This script intentionally looks for a blank line in between messages
in the mbox file. There is no such requirement that I know of;
certainly all the mbox data I have has tons of messages with no blank
line before the '^From ' line.

As is, I got 30% fewer messages in the new Maildir, and lots of
messages were actually two or three messages run together. The result
is so garbled I wonder if anyone else has ever used this script...

This tweak seems to have worked for me:

--- /usr/bin/mb2md 2005-07-04 16:38:47.0 -0400
+++ mb2md 2008-12-01 11:18:42.080177429 -0500
@@ -979,8 +979,6 @@
 # The subject of the message
 my $subject = '';
 
-my $previous_line_was_empty = 1;
-
 # We record the message start line here, for error
 # reporting.
 my $startline;
@@ -1003,7 +1001,6 @@
 $_ =~ s/\r\n$/\n/;
 
 if ( /^From /
-&& $previous_line_was_empty
 && (!defined $contentlength)
 )
 {
@@ -1419,8 +1416,6 @@
 # End of the if statement dealing with message body.
 }
 
-$previous_line_was_empty = ( $_ eq "\n" );
-
 # End of while (MBOX) loop.
 }
 # Close the input file.

-- System Information:
Debian Release: 4.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.25.10.habanero2
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages mb2md depends on:
ii  libtimedate-perl  1.1600-5      Time and date functions for Perl
ii  perl [perl5]      5.8.8-7etch3  Larry Wall's Practical Extraction

mb2md recommends no packages.

-- no debconf information