My mail server consumed all of its configured file table slots and started returning ENFILE ("Too many open files in system") for many operations. This wreaked havoc with dovecot-1.1.13. Here are some areas where dovecot should detect and more gracefully handle ENFILE error returns:

1. Deliver should report an error more helpful than "Unknown internal error" when fdatasync_path fails: deliver(user): Apr 10 14:53:20 Error: fdatasync_path(/Volumes/MailData/ user/new) failed: Too many open files in system deliver(user): Apr 10 14:53:20 Info: msgid=<...>: save failed to INBOX: BUG: Unknown internal error

2. Index files should not be fscked when they could not be opened due to ENFILE: Error: IMAP(user): open() failed with index file /Volumes/MailData/ user/.Deleted Messages/dovecot.index: Too many open files in system Warning: IMAP(user): fscking index file /Volumes/MailData/ user/.Deleted Messages/dovecot.index

3. Dovecot should not crash when mail transaction logs become corrupt, or appear to have become corrupt due to ENFILE. Here is a sequence of error messages from the log but I can't tell whether the IMAP ones up through the panic in the middle are all from the same process, or from different processes.
deliver(user): Apr 10 14:55:32 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:55:54 Error: open() failed with index file / Volumes/MailData/user/dovecot.index: Too many open files in system deliver(user): Apr 10 14:55:54 Error: open() failed with file /Volumes/ MailData/user/dovecot.index.log.2: Too many open files in system deliver(user): Apr 10 14:55:57 Error: open() failed with index cache file /Volumes/MailData/user/dovecot.index.cache: Too many open files in system dovecot: Apr 10 14:56:08 Warning: IMAP(user): fscking index file / Volumes/MailData/user/dovecot.index dovecot: Apr 10 14:56:16 Error: IMAP(user): Fixed index file /Volumes/ MailData/user/dovecot.index: log_file_seq 15 -> 2
deliver(user): Apr 10 14:56:19 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:56:20 Error: Corrupted transaction log file / Volumes/MailData/user/dovecot.index.log seq 2: record size too small (type=0x0, offset=23088, size=0) (sync_offset=36848) deliver(user): Apr 10 14:56:20 Warning: fscking index file (in-memory index) deliver(user): Apr 10 14:56:21 Info: msgid=<...>: save failed to INBOX: Internal error occurred. Refer to server log for more information. [2009-04-10 14:56:20] dovecot: Apr 10 14:56:40 Panic: IMAP(user): file mail-transaction-log- view.c: line 103 (mail_transaction_log_view_set): assertion failed: (min_file_seq <= max_file_seq) dovecot: Apr 10 14:56:40 Error: IMAP(user): Raw backtrace: [see below for better backtrace] deliver(user): Apr 10 14:56:45 Error: Transaction log file /Volumes/ MailData/user/dovecot.index.log: marked corrupted
deliver(user): Apr 10 14:57:04 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:57:23 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:57:49 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:58:21 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:58:41 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:59:09 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:59:28 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 14:59:43 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 15:00:57 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 15:03:01 Info: msgid=<...>: saved mail to INBOX
deliver(user): Apr 10 15:03:27 Info: msgid=<...>: saved mail to INBOX
dovecot: Apr 10 15:21:41 Error: IMAP(user): Corrupted transaction log file /Volumes/MailData/user/dovecot.index.log seq 2: Invalid transaction log size (36848 vs 23088): /Volumes/MailData/user/ dovecot.index.log (sync_offset=36848) dovecot: Apr 10 15:21:42 Info: imap-login: Login: user=<user>, method=CRAM-MD5, rip=..., lip=..., TLS

The backtrace for the assertion failure was:
0   libSystem.B.dylib                   0x00007fff8007c102 __kill + 10
1   libSystem.B.dylib                   0x00007fff800f7ffc abort + 83
2 imap 0x000000010006a22e default_fatal_finish + 70 3 imap 0x000000010006a27c i_syslog_fatal_handler + 0
4   imap                                0x00000001000699db i_info + 0
5 imap 0x0000000100058542 mail_transaction_log_view_set + 104 6 imap 0x0000000100053b4c view_sync_set_log_view_range + 125 7 imap 0x0000000100053fc5 mail_index_view_sync_begin + 378 8 imap 0x000000010003c26e index_mailbox_sync_init + 129 9 imap 0x0000000100018457 maildir_storage_sync_init + 229 10 imap 0x000000010000c6bb imap_sync_init + 132 11 imap 0x000000010000c90d cmd_sync_delayed + 477 12 imap 0x000000010000716a client_handle_input + 409
13  imap                                0x0000000100007244 client_input + 148
14 imap 0x0000000100071228 io_loop_handler_run + 272
15  imap                                0x00000001000705ed io_loop_run + 55
16  imap                                0x000000010000e219 main + 1635
17  imap                                0x00000001000016c0 start + 52

I have since configured more open files, and enabled core dumps and added the pid to the mail_log_prefix for better tracking, but this is all I have to work with for now. Too bad the values of min_file_seq and max_file_seq are not logged in the panic.

Reply via email to