Kind reminder :-)

D.

On Thu 13/02/2014 15:52, Dimos Alevizos wrote:
> At last, I was finally able to find time and run a backtrace, I've
> attached the file.
> If you don't remember (and hey, you can blame you after all this
> time) the problem is only with a couple (~10) of users among the
> hundreds of thousands currently using the service.
> 
> Please let me know if I should provide any other info, logs or whatever.
> 
> Dimos Alevizos
> 
> -------- Original Message --------
> Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with
> 'FFrom' or 'FrFrom'
> From: Timo Sirainen <[email protected]>
> To: Dimos Alevizos <[email protected]>
> CC: Dimitris Paouris <[email protected]>, Dovecot Mailing List
> <[email protected]>
> Date: 02/11/2013 01:16 μμ
> 
> >Well, if that patch didn’t work, then the problem is elsewhere. There aren’t 
> >many other good possibilities left though.. How about adding this patch, it 
> >should be even safer than the previous one:
> >
> >http://hg.dovecot.org/dovecot-2.2/rev/d3062d066593
> >
> >On 30.10.2013, at 12.42, Dimos Alevizos <[email protected]> wrote:
> >
> >>I'm afraid it doesn't seem to be working.
> >>I've compiled a patched 2.2.6 dovecot with the patch you sent and installed 
> >>it in a production server (had to be 2.2.6 cause we've upgraded all the 
> >>rest since I begun this thread months ago) and although we still have mbox 
> >>corruptions (rarely as before) the server isn't crashing :
> >>
> >>Oct 30 11:15:19 pop04 dovecot: pop3-login: Login: user=<[email protected]>, 
> >>method=PLAIN, rip=85.72.232.35, lip=83.235.66.43, mpid=24419, secured, 
> >>session=<+0ywxfHpIQBVSOgj>
> >>Oct 30 11:15:20 pop04 dovecot: pop3([email protected]): Disconnected: 
> >>Logged out top=0/0, retr=0/0, del=0/1336, size=471029518
> >>Oct 30 11:19:12 pop04 dovecot: lmtp(2863, [email protected]): 
> >>r7U3KnyhcFIvCwAAckDtvw: 
> >>msgid=<[email protected]>: 
> >>size=17823 saved mail to INBOX
> >>Oct 30 11:33:12 pop04 dovecot: pop3-login: Login: user=<[email protected]>, 
> >>method=PLAIN, rip=85.72.224.94, lip=83.235.66.43, mpid=600, secured, 
> >>session=<vT2aBfLpxQBVSOBe>
> >>Oct 30 11:33:12 pop04 dovecot: pop3([email protected]): Error: Syncing 
> >>INBOX failed: Mailbox isn't a valid mbox file
> >>Oct 30 11:33:12 pop04 dovecot: pop3([email protected]): Error: Couldn't 
> >>init INBOX: Mailbox isn't a valid mbox file
> >>Oct 30 11:33:12 pop04 dovecot: pop3([email protected]): Mailbox init failed 
> >>top=0/0, retr=0/0, del=0/0, size=0
> >>Oct 30 11:33:33 pop04 dovecot: lmtp(16314, [email protected]): 
> >>Au4vIMqucFK6PwAAckDtvw: msgid=<[email protected]>: 
> >>size=7975817 save failed to INBOX: Mailbox isn't a valid mbox file
> >>
> >>Perhaps the patch is only valid for 2.1.16 and needs to be modified for 
> >>2.2.6 ?
> >>
> >>Thank you for your time
> >>Dimos Alevizos
> >>
> >>
> >>-------- Original Message --------
> >>Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with 'FFrom' or 
> >>'FrFrom'
> >>From: Timo Sirainen <[email protected]>
> >>To: Dimos Alevizos <[email protected]>
> >>CC: [email protected], Dimitris Paouris <[email protected]>
> >>Date: 26/06/2013 06:59 μμ
> >>
> >>>It crashes one specific IMAP/POP3 session, so others are unaffected. The 
> >>>potential problems:
> >>>
> >>>  * It might cause the user's mbox to become crashing constantly. so first 
> >>> crash -> client reconnects -> client attempts the same operation -> crash 
> >>> again. Then again, this might not happen, it depends.
> >>>
> >>>  * The mbox file would probably become slightly more corrupted than 
> >>> normally, because it doesn't finish moving data around. No data should 
> >>> get actually lost, but some parts could become duplicated (e.g. some 
> >>> headers or even mails, possibly causing UID renumbering = redownloading).
> >>>
> >>>So not ideal in production, but shouldn't be too bad either, especially if 
> >>>you just wait for the first crash and then immediately switch to the old 
> >>>unpatched version.
> >>>
> >>>On 26.6.2013, at 10.21, Dimos Alevizos <[email protected]> wrote:
> >>>
> >>>>Hi,
> >>>>
> >>>>I haven't had the time to compile it yet, but a question just occurred.
> >>>>Given that it's so rare and we can't reproduce it on a dev server, how 
> >>>>safe is this to use on a production server ?
> >>>>When you say "crash" you mean the whole dovecot server or that specific 
> >>>>client's child ?
> >>>>
> >>>>D.
> >>>>
> >>>>-------- Original Message --------
> >>>>Subject: Re: [Dovecot] Mbox corruption - Inbox beginning with 'FFrom' or 
> >>>>'FrFrom'
> >>>>From: dalevizo <[email protected]>
> >>>>To: Timo Sirainen <[email protected]>
> >>>>CC: [email protected], Dimitris Paouris <[email protected]>
> >>>>Date: 24/06/2013 01:41 μμ
> >>>>
> >>>>>Thanx I'll try the patch as soon as possible and I'll let you know.
> >>>>>It is indeed very rare. We're only seeing 4-5 corruptions in about 13
> >>>>>million logins per day.
> >>>>>I've been trying to convince our design team that we should move to
> >>>>>maildir, but the truth is that it's quite a change, and we're way too
> >>>>>busy to deal with everything else AND a migration from mbox to maildir.
> >>>>>
> >>>>>D.
> >>>>>
> >>>>>On Mon 24/06/2013 13:16, Timo Sirainen wrote:
> >>>>>>On 19.6.2013, at 16.00, Dimos Alevizos <[email protected]> wrote:
> >>>>>>
> >>>>>>>we're having some problems with our dovecot setup.
> >>>>>>>I've seen similar problems in the mailing list some years ago but alas 
> >>>>>>>wasn't able to find a solution.
> >>>>>>>
> >>>>>>>Our setup is as follows :
> >>>>>>>An MX farm (postfix) sends mails via LMTP to a director farm (dovecot 
> >>>>>>>2.1.12) which proxies pop3/imap/lmtp traffic to a dovecot farm 
> >>>>>>>(dovecot 2.1.16).
> >>>>>>>All mailboxes and indexes are on NFS and all servers are Centos.
> >>>>>>>
> >>>>>>>The problem is that at times we see mailboxes (all of them are in mbox 
> >>>>>>>format) beginning with FFrom or FrFrom and of course dovecot says it's 
> >>>>>>>not a valid mbox file.
> >>>>>>
> >>>>>>This is quite an old bug, but it happens rarely enough that I haven't 
> >>>>>>been able to reproduce and fix it. Actually people hadn't complained 
> >>>>>>about it for a long time now, so I had assumed it had somehow gotten 
> >>>>>>fixed already.
> >>>>>>
> >>>>>>With the attached debug patch it should crash instead of (completely) 
> >>>>>>corrupting the mbox file. Debugging the resulting core file with gdb 
> >>>>>>could be useful in figuring this out.
> >>>>>>
> >>>>>>Although I wouldn't recommend mbox format for any big installation 
> >>>>>>anyway..
> >>>>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >

> Feb 13 15:41:08 pop06 dovecot: imap-login: Login: user=<someuser@somedomain>, 
> method=PLAIN, rip=195.170.0.170, lip=83.235.66.45, mpid=8157, 
> session=<E/1110nyQADDqgCq>
> Feb 13 15:41:09 pop06 dovecot: imap(someuser@somedomain): Panic: file 
> mbox-sync.c: line 655 (mbox_sync_handle_header): assertion failed: 
> (mail_ctx->mail.from_offset + move_diff != 1 && mail_ctx->mail.from_offset + 
> move_diff != 2)
> Feb 13 15:41:09 pop06 dovecot: imap(someuser@somedomain): Error: Raw 
> backtrace: /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf551923fa] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55192466] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55191583] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0 [0x2adf54e7e9a7] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mbox_sync+0x620) 
> [0x2adf54e7f160] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mbox_storage_sync_init+0x83)
>  [0x2adf54e80093] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mailbox_sync_init+0x31)
>  [0x2adf54e99da1] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0(mailbox_sync+0x27) 
> [0x2adf54e9abc7] -> dovecot/imap [someuser@somedomain 195.170.0.170 
> SELECT](cmd_select_full+0x219) [0x4111c9] -> dovecot/imap 
> [someuser@somedomain 195.170.0.170 SELECT](cmd_select+0xb) [0x411afb] -> 
> dovecot/imap [someuser@somedomain 195.170.0.170 SELECT](command_exec+0x37) 
> [0x4165e7] -> dovecot/imap [d
> Feb 13 15:41:09 pop06 dovecot: imap(someuser@somedomain): Fatal: master: 
> service(imap): child 8157 killed with signal 6 (core dumped)
> 
> root@pop06:someuser $ gdb /opt/dovecot-2.2.9/libexec/imap core.8157 
> GNU gdb (GDB) CentOS (7.0.1-45.el5.centos)
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /opt/dovecot-2.2.9/libexec/imap...done.
> [New Thread 8157]
> Reading symbols from 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0...done.
> Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/libdovecot-storage.so.0
> Reading symbols from /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0...done.
> Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0
> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
> Loaded symbols for /lib64/librt.so.1
> Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libdl.so.2
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols 
> found)...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols 
> found)...done.
> [Thread debugging using libthread_db enabled]
> Loaded symbols for /lib64/libpthread.so.0
> Reading symbols from 
> /opt/dovecot-2.2.9/lib/dovecot/lib10_quota_plugin.so...done.
> Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib10_quota_plugin.so
> Reading symbols from 
> /opt/dovecot-2.2.9/lib/dovecot/lib11_imap_quota_plugin.so...done.
> Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib11_imap_quota_plugin.so
> Reading symbols from 
> /opt/dovecot-2.2.9/lib/dovecot/lib15_notify_plugin.so...done.
> Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib15_notify_plugin.so
> Reading symbols from 
> /opt/dovecot-2.2.9/lib/dovecot/lib20_fts_plugin.so...done.
> Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib20_fts_plugin.so
> Reading symbols from 
> /opt/dovecot-2.2.9/lib/dovecot/lib21_fts_solr_plugin.so...done.
> Loaded symbols for /opt/dovecot-2.2.9/lib/dovecot/lib21_fts_solr_plugin.so
> Reading symbols from /lib64/libexpat.so.0...(no debugging symbols 
> found)...done.
> Loaded symbols for /lib64/libexpat.so.0
> Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols 
> found)...done.
> Loaded symbols for /lib64/libgcc_s.so.1
> 
> warning: no loadable sections found in added symbol-file system-supplied DSO 
> at 0x7fff49dfd000
> Core was generated by `dovecot/imap [someuser@somed'.
> Program terminated with signal 6, Aborted.
> #0  0x0000003efe030265 in raise () from /lib64/libc.so.6
> (gdb) bt full
> #0  0x0000003efe030265 in raise () from /lib64/libc.so.6
> No symbol table info available.
> #1  0x0000003efe031d10 in abort () from /lib64/libc.so.6
> No symbol table info available.
> #2  0x00002adf5519241e in default_fatal_finish (type=<value optimized out>, 
> status=0) at failures.c:192
>         backtrace = 0x14d90370 
> "/opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf551923fa] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55192466] -> 
> /opt/dovecot-2.2.9/lib/dovecot/libdovecot.so.0 [0x2adf55191583] ->"...
> #3  0x00002adf55192466 in i_internal_fatal_handler (ctx=0x7fff49d91340, 
> format=<value optimized out>, args=<value optimized out>) at failures.c:653
>         status = 0
> #4  0x00002adf55191583 in i_panic (format=0x1fdd <Address 0x1fdd out of 
> bounds>) at failures.c:264
>         ctx = {type = LOG_TYPE_PANIC, exit_status = 0, timestamp = 0x0}
>         args = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = 
> 0x7fff49d91410, reg_save_area = 0x7fff49d91350}}
> #5  0x00002adf54e7e9a7 in mbox_sync_handle_header (sync_ctx=0x7fff49d935e0, 
> mail_ctx=0x7fff49d937d0, partial=true) at mbox-sync.c:654
>         orig_from_offset = 14200194
>         move_diff = -14200194
>         ret = <value optimized out>
>         __FUNCTION__ = "mbox_sync_handle_header"
> #6  mbox_sync_loop (sync_ctx=0x7fff49d935e0, mail_ctx=0x7fff49d937d0, 
> partial=true) at mbox-sync.c:1138
>         rec = 0x14dc0d20
>         uid = <value optimized out>
>         messages_count = 302
>         offset = 14200194
>         ret = <value optimized out>
>         expunged = false
>         skipped_mails = false
>         uids_broken = false
> #7  0x00002adf54e7f160 in mbox_sync_int (mbox=0x14db5b30, 
> flags=MBOX_SYNC_UNDIRTY) at mbox-sync.c:1619
>         index_sync_ctx = 0x14dd3710
>         sync_view = 0x14dd3780
>         ret = <value optimized out>
>         delay_writes = true
>         readonly = false
>         trans = 0x14dbf420
>         sync_ctx = {mbox = 0x14db5b30, flags = MBOX_SYNC_UNDIRTY, input = 
> 0x14dd4fe8, file_input = 0x14dd4e18, write_fd = 15, orig_mtime = 1392290551, 
> orig_atime = 1391079897, 
>           orig_size = 29081201, last_stat = {st_dev = 25, st_ino = 30254297, 
> st_nlink = 1, st_mode = 33152, st_uid = 1904649, st_gid = 550, pad0 = 0, 
> st_rdev = 0, st_size = 29081201, 
>             st_blksize = 32768, st_blocks = 56920, st_atim = {tv_sec = 
> 1391079897, tv_nsec = 0}, st_mtim = {tv_sec = 1392290551, tv_nsec = 0}, 
> st_ctim = {tv_sec = 1392290551, 
>               tv_nsec = 961280000}, __unused = {0, 0, 0}}, index_sync_ctx = 
> 0x14dd3710, sync_view = 0x14dd3780, t = 0x14dbf420, reset_hdr = 
> {major_version = 0 '\000', minor_version = 0 '\000', 
>             base_header_size = 0, header_size = 0, record_size = 0, 
> compat_flags = 0 '\000', unused = "\000\000", indexid = 0, flags = 0, 
> uid_validity = 0, next_uid = 0, messages_count = 0, 
>             unused_old_recent_messages_count = 0, seen_messages_count = 0, 
> deleted_messages_count = 0, first_recent_uid = 0, first_unseen_uid_lowwater = 
> 0, first_deleted_uid_lowwater = 0, 
>             log_file_seq = 0, log_file_tail_offset = 0, log_file_head_offset 
> = 0, unused_old_sync_size = 0, unused_old_sync_stamp = 0, day_stamp = 0, 
> day_first_uid = {0, 0, 0, 0, 0, 0, 0, 0}}, 
>           hdr = 0x14dbf090, header = 0x14dcb070, from_line = 0x14dcd100, 
> base_uid_validity = 1387910574, base_uid_last = 439, base_uid_last_offset = 
> 0, mails = {arr = {buffer = 0x14dcb0b0, 
>               element_size = 64}, v = 0x14dcb0b0, v_modifiable = 0x14dcb0b0}, 
> sync_changes = 0x14dcb0f0, mail_keyword_pool = 0x14dbf780, 
> saved_keywords_pool = 0x14de7480, prev_msg_uid = 242, 
>           next_uid = 440, idx_next_uid = 440, seq = 105, idx_seq = 105, 
> need_space_seq = 0, last_nonrecent_uid = 0, expunged_space = 14200194, 
> space_diff = 0, dest_first_mail = 1, 
>           first_mail_crlf_expunged = 0, keep_recent = 0, readonly = 0, 
> delay_writes = 1, renumber_uids = 0, moved_offsets = 0, ext_modified = 0, 
> index_reset = 0, errors = 0}
>         sync_flags = <value optimized out>
>         changed = <value optimized out>
>         __FUNCTION__ = "mbox_sync_int"
> #8  mbox_sync (mbox=0x14db5b30, flags=MBOX_SYNC_UNDIRTY) at mbox-sync.c:2000
>         lock_id = 3
>         ret = <value optimized out>
>         __FUNCTION__ = "mbox_sync"
> #9  0x00002adf54e80093 in mbox_storage_sync_init (box=0x14db5b30, flags=65) 
> at mbox-sync.c:2049
>         mbox = 0x1fdd
>         mbox_sync_flags = 8157
>         ret = 0
> #10 0x00002adf54e99da1 in mailbox_sync_init (box=0x14db5b30, flags=65) at 
> mail-storage.c:1668
>         _data_stack_cur_id = 4
>         ctx = <value optimized out>
> #11 0x00002adf54e9abc7 in mailbox_sync (box=0x1fdd, flags=8157) at 
> mail-storage.c:1716
>         ctx = 0x14d99110
>         status = {sync_delayed_expunges = 0}
> #12 0x00000000004111c9 in select_open (cmd=0x14db3310, readonly=false) at 
> cmd-select.c:313
> No locals.
> #13 cmd_select_full (cmd=0x14db3310, readonly=false) at cmd-select.c:439
>         client = 0x14db2b00
>         ctx = 0x14db3408
>         args = 0x14d9ba98
>         list_args = 0x101010101010101
>         mailbox = 0x14d902b8 "INBOX"
>         error = <value optimized out>
>         ret = <value optimized out>
>         __FUNCTION__ = "cmd_select_full"
> ---Type <return> to continue, or q <return> to quit---
> #14 0x0000000000411afb in cmd_select (cmd=0x1fdd) at cmd-select.c:448
> No locals.
> #15 0x00000000004165e7 in command_exec (cmd=0x14db3310) at imap-commands.c:158
>         hook = 0x14d99110
>         ret = <value optimized out>
> #16 0x00000000004153db in client_command_input (cmd=0x1fdd) at 
> imap-client.c:780
>         client = 0x14db2b00
>         command = <value optimized out>
>         __FUNCTION__ = "client_command_input"
> #17 0x0000000000415495 in client_command_input (cmd=0x14db3310) at 
> imap-client.c:841
>         client = 0x14db2b00
>         command = 0x6
>         __FUNCTION__ = "client_command_input"
> #18 0x000000000041564f in client_handle_next_command (client=0x14db2b00) at 
> imap-client.c:879
> No locals.
> #19 client_handle_input (client=0x14db2b00) at imap-client.c:891
>         _data_stack_cur_id = 3
>         remove_io = <value optimized out>
>         handled_commands = false
>         __FUNCTION__ = "client_handle_input"
> #20 0x0000000000415a52 in client_input (client=0x14db2b00) at 
> imap-client.c:933
>         cmd = <value optimized out>
>         output = 0x14db0c00
>         bytes = 19
>         __FUNCTION__ = "client_input"
> #21 0x00002adf551a1676 in io_loop_call_io (io=0x14db0cd0) at ioloop.c:388
>         ioloop = 0x14d98740
>         t_id = 2
> #22 0x00002adf551a2c7d in io_loop_handler_run (ioloop=<value optimized out>) 
> at ioloop-epoll.c:220
>         ctx = <value optimized out>
>         event = 0x14d9a250
>         list = 0x14db0d20
>         io = 0x1fdd
>         tv = {tv_sec = 3, tv_usec = 790044}
>         events_count = <value optimized out>
>         msecs = <value optimized out>
>         ret = 1
>         i = 0
>         call = false
>         __FUNCTION__ = "io_loop_handler_run"
> #23 0x00002adf551a15f8 in io_loop_run (ioloop=0x14d98740) at ioloop.c:412
>         __FUNCTION__ = "io_loop_run"
> #24 0x00002adf551528d3 in master_service_run (service=0x14d985d0, 
> callback=0x1fdd) at master-service.c:566
> No locals.
> #25 0x000000000041ef07 in main (argc=2, argv=0x14d98390) at main.c:400
>         login_set = {auth_socket_path = 0x14d90060 "\210", 
> postlogin_socket_path = 0x14d900b8 "", postlogin_timeout_secs = 60, callback 
> = 0x41f100 <login_client_connected>, 
>           failure_callback = 0x41f0a0 <login_client_failed>, 
> request_auth_token = 1}
>         service_flags = <value optimized out>
>         storage_service_flags = MAIL_STORAGE_SERVICE_FLAG_DISALLOW_ROOT
>         username = 0x0
>         c = <value optimized out>
>         set_roots = {0x426b20, 0x0}
> (gdb) 

Reply via email to