Kind reminder :-) D.
On Thu 13/02/2014 15:52, Dimos Alevizos wrote: > At last, I was finally able to find time and run a backtrace, I've > attached the file. > If you don't remember (and hey, you can blame you after all this > time) the problem is only with a couple (~10) of users among the > hundreds of thousands currently using the service. > > Please let me know if I should provide any other info, logs or whatever. > > Dimos Alevizos > > -------- Original Message -------- > Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with > 'FFrom' or 'FrFrom' > From: Timo Sirainen <[email protected]> > To: Dimos Alevizos <[email protected]> > CC: Dimitris Paouris <[email protected]>, Dovecot Mailing List > <[email protected]> > Date: 02/11/2013 01:16 μμ > > >Well, if that patch didn’t work, then the problem is elsewhere. There aren’t > >many other good possibilities left though.. How about adding this patch, it > >should be even safer than the previous one: > > > >http://hg.dovecot.org/dovecot-2.2/rev/d3062d066593 > > > >On 30.10.2013, at 12.42, Dimos Alevizos <[email protected]> wrote: > > > >>I'm afraid it doesn't seem to be working. > >>I've compiled a patched 2.2.6 dovecot with the patch you sent and installed > >>it in a production server (had to be 2.2.6 cause we've upgraded all the > >>rest since I begun this thread months ago) and although we still have mbox > >>corruptions (rarely as before) the server isn't crashing : > >> > >>Oct 30 11:15:19 pop04 dovecot: pop3-login: Login: user=<[email protected]>, > >>method=PLAIN, rip=85.72.232.35, lip=83.235.66.43, mpid=24419, secured, > >>session=<+0ywxfHpIQBVSOgj> > >>Oct 30 11:15:20 pop04 dovecot: pop3([email protected]): Disconnected: > >>Logged out top=0/0, retr=0/0, del=0/1336, size=471029518 > >>Oct 30 11:19:12 pop04 dovecot: lmtp(2863, [email protected]): > >>r7U3KnyhcFIvCwAAckDtvw: > >>msgid=<[email protected]>: > >>size=17823 saved mail to INBOX > >>Oct 30 11:33:12 pop04 dovecot: pop3-login: Login: user=<[email protected]>, > >>method=PLAIN, rip=85.72.224.94, lip=83.235.66.43, mpid=600, secured, > >>session=<vT2aBfLpxQBVSOBe> > >>Oct 30 11:33:12 pop04 dovecot: pop3([email protected]): Error: Syncing > >>INBOX failed: Mailbox isn't a valid mbox file > >>Oct 30 11:33:12 pop04 dovecot: pop3([email protected]): Error: Couldn't > >>init INBOX: Mailbox isn't a valid mbox file > >>Oct 30 11:33:12 pop04 dovecot: pop3([email protected]): Mailbox init failed > >>top=0/0, retr=0/0, del=0/0, size=0 > >>Oct 30 11:33:33 pop04 dovecot: lmtp(16314, [email protected]): > >>Au4vIMqucFK6PwAAckDtvw: msgid=<[email protected]>: > >>size=7975817 save failed to INBOX: Mailbox isn't a valid mbox file > >> > >>Perhaps the patch is only valid for 2.1.16 and needs to be modified for > >>2.2.6 ? > >> > >>Thank you for your time > >>Dimos Alevizos > >> > >> > >>-------- Original Message -------- > >>Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with 'FFrom' or > >>'FrFrom' > >>From: Timo Sirainen <[email protected]> > >>To: Dimos Alevizos <[email protected]> > >>CC: [email protected], Dimitris Paouris <[email protected]> > >>Date: 26/06/2013 06:59 μμ > >> > >>>It crashes one specific IMAP/POP3 session, so others are unaffected. The > >>>potential problems: > >>> > >>> * It might cause the user's mbox to become crashing constantly. so first > >>> crash -> client reconnects -> client attempts the same operation -> crash > >>> again. Then again, this might not happen, it depends. > >>> > >>> * The mbox file would probably become slightly more corrupted than > >>> normally, because it doesn't finish moving data around. No data should > >>> get actually lost, but some parts could become duplicated (e.g. some > >>> headers or even mails, possibly causing UID renumbering = redownloading). > >>> > >>>So not ideal in production, but shouldn't be too bad either, especially if > >>>you just wait for the first crash and then immediately switch to the old > >>>unpatched version. > >>> > >>>On 26.6.2013, at 10.21, Dimos Alevizos <[email protected]> wrote: > >>> > >>>>Hi, > >>>> > >>>>I haven't had the time to compile it yet, but a question just occurred. > >>>>Given that it's so rare and we can't reproduce it on a dev server, how > >>>>safe is this to use on a production server ? > >>>>When you say "crash" you mean the whole dovecot server or that specific > >>>>client's child ? > >>>> > >>>>D. > >>>> > >>>>-------- Original Message -------- > >>>>Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with 'FFrom' or > >>>>'FrFrom' > >>>>From: dalevizo <[email protected]> > >>>>To: Timo Sirainen <[email protected]> > >>>>CC: [email protected], Dimitris Paouris <[email protected]> > >>>>Date: 24/06/2013 01:41 μμ > >>>> > >>>>>Thanx I'll try the patch as soon as possible and I'll let you know. > >>>>>It is indeed very rare. We're only seeing 4-5 corruptions in about 13 > >>>>>million logins per day. > >>>>>I've been trying to convince our design team that we should move to > >>>>>maildir, but the truth is that it's quite a change, and we're way too > >>>>>busy to deal with everything else AND a migration from mbox to maildir. > >>>>> > >>>>>D. > >>>>> > >>>>>On Mon 24/06/2013 13:16, Timo Sirainen wrote: > >>>>>>On 19.6.2013, at 16.00, Dimos Alevizos <[email protected]> wrote: > >>>>>> > >>>>>>>we're having some problems with our dovecot setup. > >>>>>>>I've seen similar problems in the mailing list some years ago but alas > >>>>>>>wasn't able to find a solution. > >>>>>>> > >>>>>>>Our setup is as follows : > >>>>>>>An MX farm (postfix) sends mails via LMTP to a director farm (dovecot > >>>>>>>2.1.12) which proxies pop3/imap/lmtp traffic to a dovecot farm > >>>>>>>(dovecot 2.1.16). > >>>>>>>All mailboxes and indexes are on NFS and all servers are Centos. > >>>>>>> > >>>>>>>The problem is that at times we see mailboxes (all of them are in mbox > >>>>>>>format) beginning with FFrom or FrFrom and of course dovecot says it's > >>>>>>>not a valid mbox file. > >>>>>> > >>>>>>This is quite an old bug, but it happens rarely enough that I haven't > >>>>>>been able to reproduce and fix it. Actually people hadn't complained > >>>>>>about it for a long time now, so I had assumed it had somehow gotten > >>>>>>fixed already. > >>>>>> > >>>>>>With the attached debug patch it should crash instead of (completely) > >>>>>>corrupting the mbox file. Debugging the resulting core file with gdb > >>>>>>could be useful in figuring this out. > >>>>>> > >>>>>>Although I wouldn't recommend mbox format for any big installation > >>>>>>anyway.. > >>>>>> > >>>>> > >>>>> > >>>>>> > >>>>>> > >>>>> > >>>> > >>> > >> > > > Feb 13 15:41:08 pop06 dovecot: imap-login: Login: user=<someuser@somedomain>, > method=PLAIN, rip=195.170.0.170, lip=83.235.66.45, mpid=8157, > session=<E/1110nyQADDqgCq> > Feb 13 15:41:09 pop06 dovecot: imap(someuser@somedomain): Panic: file > mbox-sync.c: line 655 (mbox_sync_handle_header): assertion failed: > (mail_ctx->mail.from_offset + move_diff != 1 && mail_ctx->mail.from_offset + > move_diff != 2) > Feb 13 15:41:09 pop06 dovecot: imap(someuser@somedomain): Error: Raw > backtrace: /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf551923fa] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55192466] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55191583] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0 [0x2adf54e7e9a7] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mbox_sync+0x620) > [0x2adf54e7f160] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mbox_storage_sync_init+0x83) > [0x2adf54e80093] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mailbox_sync_init+0x31) > [0x2adf54e99da1] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mailbox_sync+0x27) > [0x2adf54e9abc7] -> dovecot/imap [someuser@somedomain 195.170.0.170 > SELECT](cmd_select_full+0x219) [0x4111c9] -> dovecot/imap > [someuser@somedomain 195.170.0.170 SELECT](cmd_select+0xb) [0x411afb] -> > dovecot/imap [someuser@somedomain 195.170.0.170 SELECT](command_exec+0x37) > [0x4165e7] -> dovecot/imap [d > Feb 13 15:41:09 pop06 dovecot: imap(someuser@somedomain): Fatal: master: > service(imap): child 8157 killed with signal 6 (core dumped) > > root@pop06:someuser $ gdb /opt/dovecot-2.2.9/libexec/imap core.8157 > GNU gdb (GDB) CentOS (7.0.1-45.el5.centos) > Copyright (C) 2009 Free Software Foundation, Inc. > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> > This is free software: you are free to change and redistribute it. > There is NO WARRANTY, to the extent permitted by law. Type "show copying" > and "show warranty" for details. > This GDB was configured as "x86_64-redhat-linux-gnu". > For bug reporting instructions, please see: > <http://www.gnu.org/software/gdb/bugs/>... > Reading symbols from /opt/dovecot-2.2.9/libexec/imap...done. > [New Thread 8157] > Reading symbols from > /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0...done. > Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0 > Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0...done. > Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 > Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done. > Loaded symbols for /lib64/libc.so.6 > Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done. > Loaded symbols for /lib64/librt.so.1 > Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done. > Loaded symbols for /lib64/libdl.so.2 > Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols > found)...done. > Loaded symbols for /lib64/ld-linux-x86-64.so.2 > Reading symbols from /lib64/libpthread.so.0...(no debugging symbols > found)...done. > [Thread debugging using libthread_db enabled] > Loaded symbols for /lib64/libpthread.so.0 > Reading symbols from > /opt/dovecot-2.2.9/lib/dovecot/lib10_quota_plugin.so...done. > Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib10_quota_plugin.so > Reading symbols from > /opt/dovecot-2.2.9/lib/dovecot/lib11_imap_quota_plugin.so...done. > Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib11_imap_quota_plugin.so > Reading symbols from > /opt/dovecot-2.2.9/lib/dovecot/lib15_notify_plugin.so...done. > Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib15_notify_plugin.so > Reading symbols from > /opt/dovecot-2.2.9/lib/dovecot/lib20_fts_plugin.so...done. > Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib20_fts_plugin.so > Reading symbols from > /opt/dovecot-2.2.9/lib/dovecot/lib21_fts_solr_plugin.so...done. > Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib21_fts_solr_plugin.so > Reading symbols from /lib64/libexpat.so.0...(no debugging symbols > found)...done. > Loaded symbols for /lib64/libexpat.so.0 > Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols > found)...done. > Loaded symbols for /lib64/libgcc_s.so.1 > > warning: no loadable sections found in added symbol-file system-supplied DSO > at 0x7fff49dfd000 > Core was generated by `dovecot/imap [someuser@somed'. > Program terminated with signal 6, Aborted. > #0 0x0000003efe030265 in raise () from /lib64/libc.so.6 > (gdb) bt full > #0 0x0000003efe030265 in raise () from /lib64/libc.so.6 > No symbol table info available. > #1 0x0000003efe031d10 in abort () from /lib64/libc.so.6 > No symbol table info available. > #2 0x00002adf5519241e in default_fatal_finish (type=<value optimized out>, > status=0) at failures.c:192 > backtrace = 0x14d90370 > "/opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf551923fa] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55192466] -> > /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55191583] ->"... > #3 0x00002adf55192466 in i_internal_fatal_handler (ctx=0x7fff49d91340, > format=<value optimized out>, args=<value optimized out>) at failures.c:653 > status = 0 > #4 0x00002adf55191583 in i_panic (format=0x1fdd <Address 0x1fdd out of > bounds>) at failures.c:264 > ctx = {type = LOG_TYPE_PANIC, exit_status = 0, timestamp = 0x0} > args = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = > 0x7fff49d91410, reg_save_area = 0x7fff49d91350}} > #5 0x00002adf54e7e9a7 in mbox_sync_handle_header (sync_ctx=0x7fff49d935e0, > mail_ctx=0x7fff49d937d0, partial=true) at mbox-sync.c:654 > orig_from_offset = 14200194 > move_diff = -14200194 > ret = <value optimized out> > __FUNCTION__ = "mbox_sync_handle_header" > #6 mbox_sync_loop (sync_ctx=0x7fff49d935e0, mail_ctx=0x7fff49d937d0, > partial=true) at mbox-sync.c:1138 > rec = 0x14dc0d20 > uid = <value optimized out> > messages_count = 302 > offset = 14200194 > ret = <value optimized out> > expunged = false > skipped_mails = false > uids_broken = false > #7 0x00002adf54e7f160 in mbox_sync_int (mbox=0x14db5b30, > flags=MBOX_SYNC_UNDIRTY) at mbox-sync.c:1619 > index_sync_ctx = 0x14dd3710 > sync_view = 0x14dd3780 > ret = <value optimized out> > delay_writes = true > readonly = false > trans = 0x14dbf420 > sync_ctx = {mbox = 0x14db5b30, flags = MBOX_SYNC_UNDIRTY, input = > 0x14dd4fe8, file_input = 0x14dd4e18, write_fd = 15, orig_mtime = 1392290551, > orig_atime = 1391079897, > orig_size = 29081201, last_stat = {st_dev = 25, st_ino = 30254297, > st_nlink = 1, st_mode = 33152, st_uid = 1904649, st_gid = 550, pad0 = 0, > st_rdev = 0, st_size = 29081201, > st_blksize = 32768, st_blocks = 56920, st_atim = {tv_sec = > 1391079897, tv_nsec = 0}, st_mtim = {tv_sec = 1392290551, tv_nsec = 0}, > st_ctim = {tv_sec = 1392290551, > tv_nsec = 961280000}, __unused = {0, 0, 0}}, index_sync_ctx = > 0x14dd3710, sync_view = 0x14dd3780, t = 0x14dbf420, reset_hdr = > {major_version = 0 '\000', minor_version = 0 '\000', > base_header_size = 0, header_size = 0, record_size = 0, > compat_flags = 0 '\000', unused = "\000\000", indexid = 0, flags = 0, > uid_validity = 0, next_uid = 0, messages_count = 0, > unused_old_recent_messages_count = 0, seen_messages_count = 0, > deleted_messages_count = 0, first_recent_uid = 0, first_unseen_uid_lowwater = > 0, first_deleted_uid_lowwater = 0, > log_file_seq = 0, log_file_tail_offset = 0, log_file_head_offset > = 0, unused_old_sync_size = 0, unused_old_sync_stamp = 0, day_stamp = 0, > day_first_uid = {0, 0, 0, 0, 0, 0, 0, 0}}, > hdr = 0x14dbf090, header = 0x14dcb070, from_line = 0x14dcd100, > base_uid_validity = 1387910574, base_uid_last = 439, base_uid_last_offset = > 0, mails = {arr = {buffer = 0x14dcb0b0, > element_size = 64}, v = 0x14dcb0b0, v_modifiable = 0x14dcb0b0}, > sync_changes = 0x14dcb0f0, mail_keyword_pool = 0x14dbf780, > saved_keywords_pool = 0x14de7480, prev_msg_uid = 242, > next_uid = 440, idx_next_uid = 440, seq = 105, idx_seq = 105, > need_space_seq = 0, last_nonrecent_uid = 0, expunged_space = 14200194, > space_diff = 0, dest_first_mail = 1, > first_mail_crlf_expunged = 0, keep_recent = 0, readonly = 0, > delay_writes = 1, renumber_uids = 0, moved_offsets = 0, ext_modified = 0, > index_reset = 0, errors = 0} > sync_flags = <value optimized out> > changed = <value optimized out> > __FUNCTION__ = "mbox_sync_int" > #8 mbox_sync (mbox=0x14db5b30, flags=MBOX_SYNC_UNDIRTY) at mbox-sync.c:2000 > lock_id = 3 > ret = <value optimized out> > __FUNCTION__ = "mbox_sync" > #9 0x00002adf54e80093 in mbox_storage_sync_init (box=0x14db5b30, flags=65) > at mbox-sync.c:2049 > mbox = 0x1fdd > mbox_sync_flags = 8157 > ret = 0 > #10 0x00002adf54e99da1 in mailbox_sync_init (box=0x14db5b30, flags=65) at > mail-storage.c:1668 > _data_stack_cur_id = 4 > ctx = <value optimized out> > #11 0x00002adf54e9abc7 in mailbox_sync (box=0x1fdd, flags=8157) at > mail-storage.c:1716 > ctx = 0x14d99110 > status = {sync_delayed_expunges = 0} > #12 0x00000000004111c9 in select_open (cmd=0x14db3310, readonly=false) at > cmd-select.c:313 > No locals. > #13 cmd_select_full (cmd=0x14db3310, readonly=false) at cmd-select.c:439 > client = 0x14db2b00 > ctx = 0x14db3408 > args = 0x14d9ba98 > list_args = 0x101010101010101 > mailbox = 0x14d902b8 "INBOX" > error = <value optimized out> > ret = <value optimized out> > __FUNCTION__ = "cmd_select_full" > ---Type <return> to continue, or q <return> to quit--- > #14 0x0000000000411afb in cmd_select (cmd=0x1fdd) at cmd-select.c:448 > No locals. > #15 0x00000000004165e7 in command_exec (cmd=0x14db3310) at imap-commands.c:158 > hook = 0x14d99110 > ret = <value optimized out> > #16 0x00000000004153db in client_command_input (cmd=0x1fdd) at > imap-client.c:780 > client = 0x14db2b00 > command = <value optimized out> > __FUNCTION__ = "client_command_input" > #17 0x0000000000415495 in client_command_input (cmd=0x14db3310) at > imap-client.c:841 > client = 0x14db2b00 > command = 0x6 > __FUNCTION__ = "client_command_input" > #18 0x000000000041564f in client_handle_next_command (client=0x14db2b00) at > imap-client.c:879 > No locals. > #19 client_handle_input (client=0x14db2b00) at imap-client.c:891 > _data_stack_cur_id = 3 > remove_io = <value optimized out> > handled_commands = false > __FUNCTION__ = "client_handle_input" > #20 0x0000000000415a52 in client_input (client=0x14db2b00) at > imap-client.c:933 > cmd = <value optimized out> > output = 0x14db0c00 > bytes = 19 > __FUNCTION__ = "client_input" > #21 0x00002adf551a1676 in io_loop_call_io (io=0x14db0cd0) at ioloop.c:388 > ioloop = 0x14d98740 > t_id = 2 > #22 0x00002adf551a2c7d in io_loop_handler_run (ioloop=<value optimized out>) > at ioloop-epoll.c:220 > ctx = <value optimized out> > event = 0x14d9a250 > list = 0x14db0d20 > io = 0x1fdd > tv = {tv_sec = 3, tv_usec = 790044} > events_count = <value optimized out> > msecs = <value optimized out> > ret = 1 > i = 0 > call = false > __FUNCTION__ = "io_loop_handler_run" > #23 0x00002adf551a15f8 in io_loop_run (ioloop=0x14d98740) at ioloop.c:412 > __FUNCTION__ = "io_loop_run" > #24 0x00002adf551528d3 in master_service_run (service=0x14d985d0, > callback=0x1fdd) at master-service.c:566 > No locals. > #25 0x000000000041ef07 in main (argc=2, argv=0x14d98390) at main.c:400 > login_set = {auth_socket_path = 0x14d90060 "\210", > postlogin_socket_path = 0x14d900b8 "", postlogin_timeout_secs = 60, callback > = 0x41f100 <login_client_connected>, > failure_callback = 0x41f0a0 <login_client_failed>, > request_auth_token = 1} > service_flags = <value optimized out> > storage_service_flags = MAIL_STORAGE_SERVICE_FLAG_DISALLOW_ROOT > username = 0x0 > c = <value optimized out> > set_roots = {0x426b20, 0x0} > (gdb)
