[My apologies if you get this message twice. I accidentally sent an earlier copy from an address different from the one I'm subscribed under.]
All-

We've been using MBX as our default mailbox format for a couple of years, and have been relatively happy with it (though we're excited by MIX and will likely move to it next summer).

We had been using 2004g for quite a while, and although we get the occasional corrupt INBOX, we've almost always been able to pin it down to a problem on the host, or with our SAN, etc. When we have seen corrupt MBX INBOXes in the past, it's almost always been the "Unable to find CRLF at" variety, and I wrote some tools a while back that automate most of fixing that.

Our version of 2004g was compiled to use CREATEPROTO=mbxproto, and we're using dmail (invoked via procmail) for delivery.

On Sunday, December 24th, I upgraded several (but not all) of our IMAP servers to 2006d, also compiled with CREATEPROTO=mbxproto. Nothing else was changed.

Since then, we've seen a substantial increase in the number of corrupt MBX INBOXes on the hosts that I upgraded to 2006d. The hosts that are still running 2004g haven't seen any increase at all. In addition, the corruption has mainly been a variety that I was aware of but that we haven't seen before:

    Last message (at 109914528) runs past end of file (129188466 > 109916160)

Since we're on semester break, many of our customers are running into quota issues, and by this point most of those who are over quota have gone past the quota grace period. AFAICT, every single instance of MBX corruption with the new 2006d dmail/tmail is for a person who was probably over quota when the corruption happened. Here's an example.
One of our students, "studentx", had this when I checked his INBOX:

    sh-2.05$ /usr/local/sbin/mailutil check INBOX
    Last message (at 124944185) runs past end of file (124948983 > 124944384)

Splitting INBOX at the corruption point, the head is good, and the tail looks like this:

    24-Dec-2006 10:47:47 -0600,4743;000000000000-000058ed
    Received: via dmail-2006d.13 for studentx; Sun, 24 Dec 2006 10:47:47 -0600 (CST)
    Return-Path: <[EMAIL PROTECTED]>
    Received: from vaccine1.NoD

(note that the lines do end with CRLF). That's the end of the message, and the end of the INBOX too.

Now, if I look at the mail delivery logs around that time, I see many messages before 10:47:47 that could not be delivered to studentx, all because of quota issues:

    Dec 24 10:35:02 imapN dmail[29685]: delivering to studentx+INBOX
    Dec 24 10:35:02 imapN dmail[29685]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:35:02 imapN dmail[29685]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
    Dec 24 10:35:02 imapN dmail[29685]: Message append failed: Disk quota exceeded
    Dec 24 10:35:02 imapN dmail[29685]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:35:02 imapN sendmail[9736]: kBKD4ol3017740: to=<[EMAIL PROTECTED]>, delay=4+03:30:12, xdelay=00:00:02, mailer=local, pri=8493002, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL

    Dec 24 10:45:41 imapN dmail[30729]: delivering to studentx+INBOX
    Dec 24 10:45:41 imapN dmail[30729]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:45:41 imapN dmail[30729]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
    Dec 24 10:45:41 imapN dmail[30729]: Message append failed: Disk quota exceeded
    Dec 24 10:45:41 imapN dmail[30729]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:45:41 imapN sendmail[25358]: kBMDvXSM026889: to=<[EMAIL PROTECTED]>, delay=2+02:48:08, xdelay=00:00:02, mailer=local, pri=4551535, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL

    Dec 24 10:47:34 imapN dmail[31015]: delivering to studentx+INBOX
    Dec 24 10:47:34 imapN dmail[31015]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:47:34 imapN dmail[31015]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
    Dec 24 10:47:34 imapN dmail[31015]: Message append failed: Disk quota exceeded
    Dec 24 10:47:34 imapN dmail[31015]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:47:34 imapN sendmail[25358]: kBMDI9lG020084: to=<[EMAIL PROTECTED]>, delay=2+03:29:25, xdelay=00:00:02, mailer=local, pri=4712350, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL

    Dec 24 10:47:36 imapN dmail[31018]: delivering to studentx+INBOX
    Dec 24 10:47:36 imapN dmail[31018]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:47:36 imapN dmail[31018]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
    Dec 24 10:47:36 imapN dmail[31018]: Message append failed: Disk quota exceeded
    Dec 24 10:47:36 imapN dmail[31018]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:47:36 imapN sendmail[29995]: kBO36tx6022357: to=<[EMAIL PROTECTED]>, delay=13:40:41, xdelay=00:00:02, mailer=local, pri=1352781, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL

(those are just a sample, there were actually more message delivery attempts, all of which failed with "Disk quota exceeded").
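For reference, the numbers in the mailutil check output for studentx can be reproduced from the record header at the start of the tail. The sketch below is only an illustration: it assumes the header line has the layout date,size;flags-uid followed by CRLF, which is a reading of the record quoted above rather than something taken from the c-client source, and the function names are made up for this example.

```python
import re

# Values copied verbatim from the mailutil output and the tail of the INBOX.
CHECK_LINE = "Last message (at 124944185) runs past end of file (124948983 > 124944384)"
RECORD_HEADER = "24-Dec-2006 10:47:47 -0600,4743;000000000000-000058ed\r\n"


def parse_check_line(line):
    """Return (start_offset, expected_end, file_size) from the check message."""
    m = re.search(r"\(at (\d+)\) runs past end of file \((\d+) > (\d+)\)", line)
    if m is None:
        raise ValueError("not a 'runs past end of file' message")
    return tuple(int(g) for g in m.groups())


def parse_record_header(header):
    """Return (declared_size, uid) from a per-message header line.

    Assumed layout (hypothetical reading): date,size;flags-uid + CRLF.
    """
    m = re.fullmatch(r"[^,]+,(\d+);[0-9a-fA-F]+-([0-9a-fA-F]+)\r\n", header)
    if m is None:
        raise ValueError("unrecognized record header")
    return int(m.group(1)), int(m.group(2), 16)


def summarize(check_line, header):
    """Relate the offsets in the check message to the record header."""
    start, expected_end, file_size = parse_check_line(check_line)
    declared_size, uid = parse_record_header(header)
    header_len = len(header.encode("ascii"))
    return {
        "header_len": header_len,
        "computed_end": start + header_len + declared_size,
        "expected_end": expected_end,
        "bytes_on_disk": file_size - start,  # header plus partial message
        "message_bytes_on_disk": file_size - start - header_len,
        "bytes_missing": expected_end - file_size,
        "uid": uid,
    }


if __name__ == "__main__":
    for key, value in summarize(CHECK_LINE, RECORD_HEADER).items():
        print(f"{key}: {value}")
```

Under that assumption the start offset plus the header length plus the declared size (4743) lands exactly on the end offset that mailutil expected, which suggests the header was written in full and the message text was cut off after it.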
Now here's the delivery for the message that was partially written into the INBOX, corrupting it:

    Dec 24 10:47:47 imapN dmail[31058]: delivering to studentx+INBOX
    Dec 24 10:47:47 imapN dmail[31058]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:47:47 imapN dmail[31058]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
    Dec 24 10:47:47 imapN dmail[31058]: delivered to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:47:47 imapN dmail[31058]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:47:47 imapN sendmail[25358]: kBMC4iFe002130: to=<[EMAIL PROTECTED]>, delay=2+04:43:02, xdelay=00:00:02, mailer=local, pri=4714522, dsn=2.0.0, stat=Sent

That one says it succeeded. It's again followed by delivery failures:

    Dec 24 10:48:17 imapN dmail[31147]: delivering to studentx+INBOX
    Dec 24 10:48:17 imapN dmail[31147]: Verifying safe delivery to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:48:17 imapN dmail[31147]: mbx appending to #driver.mbx/INBOX (file /home/student/ndsu/28/05/studentx/INBOX)
    Dec 24 10:48:17 imapN dmail[31147]: Message append failed: Disk quota exceeded
    Dec 24 10:48:17 imapN dmail[31147]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
    Dec 24 10:48:17 imapN sendmail[29995]: kBNLdvD2025628: to=<[EMAIL PROTECTED]>, delay=19:08:20, xdelay=00:00:02, mailer=local, pri=1772700, dsn=4.0.0, stat=Deferred: local mailer (/usr/bin/procmail) exited with EX_TEMPFAIL

We don't know the sizes of the individual messages, though we can guess based on the priority (pri=) in the log messages. We also can tell based on the delay that this person has been over quota for quite a while, and I can't find any record of that person logging in (so it's not like they logged in, deleted some messages to get under quota, and then had this one message successfully delivered to their INBOX).
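For anyone who wants to look for the same pattern in their own logs, here is a rough sketch of how the dmail lines could be summarized per delivery attempt. It is not one of the tools mentioned earlier. The sample lines are copied from the excerpts above; the function names, and the choice to group lines by the dmail PID, are assumptions made for this illustration (PIDs can be reused over a long enough log, so a real tool would want something sturdier).

```python
import re

# A few dmail lines copied from the log excerpts above.
SAMPLE_LOG = """\
Dec 24 10:47:36 imapN dmail[31018]: delivering to studentx+INBOX
Dec 24 10:47:36 imapN dmail[31018]: Message append failed: Disk quota exceeded
Dec 24 10:47:36 imapN dmail[31018]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:47:47 imapN dmail[31058]: delivering to studentx+INBOX
Dec 24 10:47:47 imapN dmail[31058]: delivered to /home/student/ndsu/28/05/studentx/INBOX
Dec 24 10:48:17 imapN dmail[31147]: delivering to studentx+INBOX
Dec 24 10:48:17 imapN dmail[31147]: Message append failed: Disk quota exceeded
Dec 24 10:48:17 imapN dmail[31147]: message delivery failed to /home/student/ndsu/28/05/studentx/INBOX
"""

# timestamp, host, dmail[PID], message text
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ dmail\[(\d+)\]: (.*)$")
FAIL_PREFIX = "Message append failed: "


def delivery_outcomes(log_text):
    """Map each dmail PID to (first timestamp, outcome), in order of appearance."""
    outcomes = {}
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a dmail line (for example a sendmail line)
        stamp, pid, msg = m.groups()
        entry = outcomes.setdefault(pid, [stamp, "unknown"])
        if msg.startswith(FAIL_PREFIX):
            entry[1] = "failed: " + msg[len(FAIL_PREFIX):]
        elif msg.startswith("delivered to "):
            entry[1] = "delivered"
    return {pid: tuple(entry) for pid, entry in outcomes.items()}


def suspicious_successes(outcomes):
    """PIDs reported as delivered whose neighbours both failed on quota."""
    pids = list(outcomes)
    hits = []
    for i, pid in enumerate(pids):
        if outcomes[pid][1] != "delivered":
            continue
        before = i > 0 and "quota" in outcomes[pids[i - 1]][1].lower()
        after = i + 1 < len(pids) and "quota" in outcomes[pids[i + 1]][1].lower()
        if before and after:
            hits.append(pid)
    return hits


if __name__ == "__main__":
    results = delivery_outcomes(SAMPLE_LOG)
    for pid, (stamp, outcome) in results.items():
        print(stamp, pid, outcome)
    print("reported success between quota failures:", suspicious_successes(results))
```

Run against the sample, this flags dmail[31058], the delivery that was logged as successful between two "Disk quota exceeded" failures. Such a hit only marks a delivery worth checking with mailutil; it doesn't by itself show that the INBOX is corrupt.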
I'm not certain the problem is dmail from 2006d, but the best evidence I have so far points in that direction.

Is anyone else using 2006d and seeing a marked increase in MBX corruption, especially for people that are over their quota?

Tim
--
Tim Mooney                              [EMAIL PROTECTED]
Information Technology Services         (701) 231-1076 (Voice)
Room 242-J6, IACC Building              (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164
_______________________________________________
Imap-uw mailing list
[email protected]
https://mailman1.u.washington.edu/mailman/listinfo/imap-uw
