[My apologies if you get this message twice.  I accidentally sent an
earlier copy from an address different from the one I'm subscribed under]


All-

We've been using MBX as our default mailbox format for a couple of years,
and have been relatively happy with it (though we're excited by MIX and
will likely move to it next summer).  We had been using 2004g for quite
a while, and although we get the occasional corrupt INBOX, we've almost
always been able to pin it down to a problem on the host, with our SAN,
etc.  When we have seen corrupt MBX INBOXes in the past, it's almost
always been the "Unable to find CRLF at" variety, and I wrote some tools
a while back that automate most of the repair.

Our version of 2004g was compiled to use CREATEPROTO=mbxproto, and we're
using dmail (invoked via procmail) for delivery.

On Sunday, December 24th, I upgraded several (but not all) of our IMAP
servers to 2006d, also compiled with CREATEPROTO=mbxproto.  Nothing else
was changed.

Since then, we've seen a substantial increase in the number of corrupt
MBX INBOXes on the hosts that I upgraded to 2006d.  The hosts that are
still running 2004g haven't seen any increase at all.

In addition, the corruption has mainly been a variety that I was aware
of but that we hadn't seen before:

  Last message (at 109914528) runs past end of file (129188466 > 109916160)
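
A minimal sketch of how that message can be read, assuming the three numbers are (in order) the offset where the last message starts, the offset where its header says it should end, and the actual file size. The function name `describe_truncation` is made up for illustration and is not part of the UW IMAP tools.

```python
import re

# Matches the "runs past end of file" message quoted above.
PAST_EOF = re.compile(
    r"Last message \(at (\d+)\) runs past end of file \((\d+) > (\d+)\)"
)


def describe_truncation(line):
    """Return byte counts derived from one check message, or None.

    Assumes the numbers are: start offset of the last message, the
    offset where it is expected to end, and the current file size.
    """
    m = PAST_EOF.search(line)
    if m is None:
        return None
    start, expected_end, file_size = (int(g) for g in m.groups())
    return {
        "start": start,
        "expected_end": expected_end,
        "file_size": file_size,
        # Bytes of the last message (header included) that are on disk.
        "bytes_present": file_size - start,
        # Bytes the header promises but the file does not contain.
        "bytes_missing": expected_end - file_size,
    }


if __name__ == "__main__":
    msg = ("Last message (at 109914528) runs past end of file "
           "(129188466 > 109916160)")
    for key, value in describe_truncation(msg).items():
        print(f"{key}: {value}")
```

Under that reading, only a small part of the last message is present in the file and the rest is missing.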

Since we're on semester break, many of our customers are running into
quota issues, and by this point most of those who are over quota
have gone past the quota grace period.  AFAICT, every single instance of
MBX corruption with the new 2006d dmail/tmail is for a person who was
probably over quota when the corruption happened.

Here's an example.  One of our students, "studentx", had this when I
checked his INBOX:

sh-2.05$ /usr/local/sbin/mailutil check INBOX
Last message (at 124944185) runs past end of file (124948983 > 124944384)

Splitting INBOX at the corruption point, the head is good, and the tail
looks like this:

24-Dec-2006 10:47:47 -0600,4743;000000000000-000058ed
Received: via dmail-2006d.13 for studentx; Sun, 24 Dec 2006 10:47:47 -0600
(CST)
Return-Path: <[EMAIL PROTECTED]>
Received: from vaccine1.NoD


(Note that the lines do end with CRLF.)  The message is cut off at that
point, and that is also the end of the INBOX.
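
The numbers in the mailutil output can be cross-checked against the header line at the start of the tail. The sketch below assumes the header has the shape seen above (date, a comma, the message size in bytes, a semicolon, 12 hex digits, a dash, 8 hex digits, CRLF) and that the size counts the bytes following the header line. The helper names are hypothetical.

```python
import re

# Shape of the internal header line as it appears in the tail above.
# Only the size field is interpreted; the hex fields are skipped.
MBX_HEADER = re.compile(
    rb"[ \d]?\d-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2} [+-]\d{4},"
    rb"(?P<size>\d+);[0-9a-fA-F]{12}-[0-9a-fA-F]{8}\r\n"
)


def parse_header(tail):
    """Return (header_length, declared_size) for a tail starting at a header."""
    m = MBX_HEADER.match(tail)
    if m is None:
        raise ValueError("tail does not start with the expected header shape")
    return m.end(), int(m.group("size"))


def truncation_report(tail, start, file_size):
    """Compare what the header declares with what the file can hold."""
    header_len, size = parse_header(tail)
    body_start = start + header_len
    return {
        "header_len": header_len,
        "declared_size": size,
        "expected_end": body_start + size,
        "body_bytes_present": file_size - body_start,
    }


if __name__ == "__main__":
    # Header line from the tail shown above; the rest is abbreviated.
    tail = (b"24-Dec-2006 10:47:47 -0600,4743;000000000000-000058ed\r\n"
            b"Received: via dmail-2006d.13 for studentx;")
    report = truncation_report(tail, start=124944185, file_size=124944384)
    for key, value in report.items():
        print(f"{key}: {value}")
```

With the offsets from the mailutil output, the computed expected end comes out to 124948983, the same value mailutil reports, and only 144 of the declared 4743 bytes follow the header.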

Now, if I look at the mail delivery logs around that time, I see
many messages before 10:47:47 that could not be delivered to studentx,
all because of quota issues:


Dec 24 10:35:02 imapN dmail[29685]: delivering to studentx+INBOX
Dec 24 10:35:02 imapN dmail[29685]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:35:02 imapN dmail[29685]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
Dec 24 10:35:02 imapN dmail[29685]: Message append failed: Disk quota exceeded
Dec 24 10:35:02 imapN dmail[29685]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:35:02 imapN sendmail[9736]: kBKD4ol3017740: to=<[EMAIL PROTECTED]>, delay=4+03:30:12, xdelay=00:00:02, mailer=local, pri=8493002, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL

Dec 24 10:45:41 imapN dmail[30729]: delivering to studentx+INBOX
Dec 24 10:45:41 imapN dmail[30729]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:45:41 imapN dmail[30729]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
Dec 24 10:45:41 imapN dmail[30729]: Message append failed: Disk quota exceeded
Dec 24 10:45:41 imapN dmail[30729]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:45:41 imapN sendmail[25358]: kBMDvXSM026889: to=<[EMAIL PROTECTED]>, delay=2+02:48:08, xdelay=00:00:02, mailer=local, pri=4551535, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL

Dec 24 10:47:34 imapN dmail[31015]: delivering to studentx+INBOX
Dec 24 10:47:34 imapN dmail[31015]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:47:34 imapN dmail[31015]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
Dec 24 10:47:34 imapN dmail[31015]: Message append failed: Disk quota exceeded
Dec 24 10:47:34 imapN dmail[31015]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:47:34 imapN sendmail[25358]: kBMDI9lG020084: to=<[EMAIL PROTECTED]>, delay=2+03:29:25, xdelay=00:00:02, mailer=local, pri=4712350, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL

Dec 24 10:47:36 imapN dmail[31018]: delivering to studentx+INBOX
Dec 24 10:47:36 imapN dmail[31018]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:47:36 imapN dmail[31018]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
Dec 24 10:47:36 imapN dmail[31018]: Message append failed: Disk quota exceeded
Dec 24 10:47:36 imapN dmail[31018]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:47:36 imapN sendmail[29995]: kBO36tx6022357: to=<[EMAIL PROTECTED]>, delay=13:40:41, xdelay=00:00:02, mailer=local, pri=1352781, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL


(Those are just a sample; there were more message delivery attempts,
all of which failed with "Disk quota exceeded".)

Now here's the delivery for the message that was partially written into
the INBOX, corrupting it:


Dec 24 10:47:47 imapN dmail[31058]: delivering to studentx+INBOX
Dec 24 10:47:47 imapN dmail[31058]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:47:47 imapN dmail[31058]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
Dec 24 10:47:47 imapN dmail[31058]: delivered to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:47:47 imapN dmail[31058]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:47:47 imapN sendmail[25358]: kBMC4iFe002130: to=<[EMAIL PROTECTED]>, delay=2+04:43:02, xdelay=00:00:02, mailer=local, pri=4714522, dsn=2.0.0, stat=Sent


That one says it succeeded.  It's again followed by delivery failures:


Dec 24 10:48:17 imapN dmail[31147]: delivering to studentx+INBOX
Dec 24 10:48:17 imapN dmail[31147]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:48:17 imapN dmail[31147]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
Dec 24 10:48:17 imapN dmail[31147]: Message append failed: Disk quota exceeded
Dec 24 10:48:17 imapN dmail[31147]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:48:17 imapN sendmail[29995]: kBNLdvD2025628: to=<[EMAIL PROTECTED]>, delay=19:08:20, xdelay=00:00:02, mailer=local, pri=1772700, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL
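
The pattern in the logs above (one "delivered" sandwiched between quota failures) is the kind of thing a small script can look for across many users. The following is only a sketch: it assumes one syslog entry per line (as in the raw log file, not as wrapped by a mail client), looks only at dmail lines, keys attempts on the dmail PID, and does not separate mailboxes. The function names are invented for illustration.

```python
import re

# "Dec 24 10:47:47 imapN dmail[31058]: text"
DMAIL = re.compile(r"^(\w{3} +\d+ [\d:]{8}) \S+ dmail\[(\d+)\]: (.*)$")


def classify_deliveries(lines):
    """Group dmail log lines by PID and record each attempt's outcome.

    Relies on dicts preserving insertion order (Python 3.7+), so the
    result is ordered by first appearance in the log.  PID reuse within
    one log is ignored for simplicity.
    """
    attempts = {}
    for line in lines:
        m = DMAIL.match(line.rstrip("\n"))
        if m is None:
            continue
        when, pid, text = m.groups()
        attempt = attempts.setdefault(pid, {"time": when, "outcome": "unknown"})
        if text.startswith("Message append failed:"):
            attempt["outcome"] = "failed: " + text.split(": ", 1)[1]
        elif text.startswith("delivered to"):
            attempt["outcome"] = "delivered"
    return attempts


def suspicious(attempts):
    """Return PIDs that reported success while every neighbour hit quota."""
    ordered = list(attempts.items())
    flagged = []
    for i, (pid, attempt) in enumerate(ordered):
        if attempt["outcome"] != "delivered":
            continue
        neighbours = ordered[max(i - 1, 0):i] + ordered[i + 1:i + 2]
        if neighbours and all("quota" in n["outcome"].lower()
                              for _, n in neighbours):
            flagged.append(pid)
    return flagged


if __name__ == "__main__":
    sample = [
        "Dec 24 10:47:36 imapN dmail[31018]: delivering to studentx+INBOX",
        "Dec 24 10:47:36 imapN dmail[31018]: Message append failed: Disk quota exceeded",
        "Dec 24 10:47:47 imapN dmail[31058]: delivering to studentx+INBOX",
        "Dec 24 10:47:47 imapN dmail[31058]: delivered to /home/student/ndsu/28/05/studentx/INBOX",
        "Dec 24 10:48:17 imapN dmail[31147]: delivering to studentx+INBOX",
        "Dec 24 10:48:17 imapN dmail[31147]: Message append failed: Disk quota exceeded",
    ]
    print(suspicious(classify_deliveries(sample)))
```

A flagged PID is only a candidate for a closer look with mailutil check; a user who freed some space between attempts would produce the same log pattern.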



We don't know the sizes of the individual messages, though we can guess
based on the priority (pri=) in the log messages.  We can also tell from
the delay that this person has been over quota for quite a while, and I
can't find any record of that person logging in (so it's not as if they
logged in, deleted some messages to get under quota, and then had this one
message successfully delivered to their INBOX).
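
One way to turn pri= into a rough size guess is sketched below. It rests on several assumptions that may not hold for this site: that sendmail is running with its documented default factors (30000 added per recipient, 90000 added each time a queued job is processed), that each message has a single recipient, that no precedence adjustment applies, and that the message is smaller than 90000 bytes. The function name is made up.

```python
# Assumed sendmail defaults; a site can override both of these.
RECIPIENT_FACTOR = 30000  # added to the priority once per recipient
RETRY_FACTOR = 90000      # added each time the queued job is processed


def size_guess(pri, recipients=1, precedence_adjust=0):
    """Return (retries, residual_bytes) under the assumed defaults.

    The residual equals the message size only if the message is smaller
    than RETRY_FACTOR; a larger message is indistinguishable from a
    smaller one with more retries, so treat the result as a guess.
    """
    base = pri - recipients * RECIPIENT_FACTOR - precedence_adjust
    retries, residual = divmod(base, RETRY_FACTOR)
    return retries, residual


if __name__ == "__main__":
    # pri= values taken from the sendmail log lines above.
    for pri in (8493002, 4551535, 4712350, 1352781, 4714522, 1772700):
        retries, residual = size_guess(pri)
        print(f"pri={pri}: about {retries} retries, residual {residual} bytes")
```

For the message that was reported as delivered (pri=4714522) the residual is 4522 bytes, which is in the same range as the 4743 bytes recorded in the MBX header. The difference might be the Received line added by dmail and CRLF line endings, but that is a guess.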


I'm not certain the problem is dmail from 2006d, but the best evidence I
have so far points in that direction.

Is anyone else using 2006d and seeing a marked increase in MBX corruption,
especially for people that are over their quota?

Tim
--
Tim Mooney                              [EMAIL PROTECTED]
Information Technology Services         (701) 231-1076 (Voice)
Room 242-J6, IACC Building              (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164
