On 21.12.2018 15.01, Tim Mohlmann via dovecot wrote: > Hi, I now this is kinda impolite to ask, but I would like some feedback > on this issue. I'm representing the Mailu user group here, plus the fact > that my own production server is affected and starts to crash due to > file corruption as a result of the indexer-worker segfaults. Last night > my server became unavailable with logs like: > > Error: imap([email protected])<131><2RGyoYd9Jq/AqMsO>: Corrupted > transaction log file /mail/[email protected]/dovecot.index.log seq 6: > record size too small (type=0x2272560, offset=26340, size=0) > (sync_offset=26592) > > Deleting the corrupted files brought it back, but I don't really like > the prospect of cleaning corrupted files during christmas. ;) > > I would like to know: do you consider this a real dovecot bug? Or is > this something as a result of the way Alpine configures the package? > Should I cross post the report to Alpine? Or is it worth changing the > image to Debian instead? > > Or maybe you don't consider the trace good enough to work on? Also that > I would like to know. > > Many thanks in advance! > > Tim > > On 12/20/18 9:49 AM, Tim Mohlmann via dovecot wrote: >> I'm really sorry, but it was a long day yesterday. I forgot to mention >> that I used alpine:edge (based on the alpine master branch) image to set >> up the build environment. So there were some version changes. >> >> dovecot --version: 2.3.3 (dcead646b) >> >> apk info musl: >> >> musl-1.1.20-r2 description: >> the musl c library (libc) implementation >> >> Configuration was unchanged wrt to my first mail. Maybe it's helpful to >> share the build flags? >> >> export CFLAGS="-O0" >> export CXXFLAGS="$CFLAGS" >> >> export CPPFLAGS="$CFLAGS" >> >> export LDFLAGS="-Wl,--as-needed" >> >> In the APKBUILD file can be found here: >> https://git.alpinelinux.org/cgit/aports/tree/main/dovecot/APKBUILD and >> I've modified the following: >> >> options="libtool !strip" >> >> subpackages=" >> >> .... >> >> $pkgname-dbg >> >> " >> >> And if it is of any interest, I've done a `docker commit` on the >> container and pushed the resulting image as muhlemmer/dovecot:debug. >> https://cloud.docker.com/repository/docker/muhlemmer/dovecot/tags. Don't >> know if its of much use by itself, without the Mailu ecosystem around. >> Just in case. >> >> Cheers! >> >> On 12/20/18 2:04 AM, Tim Mohlmann via dovecot wrote: >>> I've managed to re-compile the alpine packages inside the container, >>> creating the debug symbols and disabling compile time optimizations, as >>> they also disturbed the backtrace. The following output looks much more >>> meaningful. But unfortunately there are still some symbols missing for >>> lucene-core. Let me know if this is enough to work with. (It doesn't >>> seem to be musl after all?) >>> >>> Reading symbols from /usr/libexec/dovecot/indexer-worker...Reading >>> symbols from >>> /usr/lib/debug//usr/libexec/dovecot/indexer-worker.debug...done. >>> done. >>> [New LWP 30417] >>> >>> warning: Can't read pathname for load map: No error information. >>> Core was generated by `dovecot/indexer-worker'. >>> Program terminated with signal SIGSEGV, Segmentation fault. >>> #0 0x00007f1653f12c31 in ?? () from /usr/lib/libclucene-core.so.1 >>> (gdb) bt full >>> #0 0x00007f1653f12c31 in ?? () from /usr/lib/libclucene-core.so.1 >>> No symbol table info available. >>> #1 0x00007f1653edeba2 in >>> lucene::index::KeepOnlyLastCommitDeletionPolicy::onCommit(std::vector<lucene::index::IndexCommitPoint*, >>> std::allocator<lucene::index::IndexCommitPoint*> >&) () >>> from /usr/lib/libclucene-core.so.1 >>> No symbol table info available. >>> #2 0x00007f1653f15312 in ?? () from /usr/lib/libclucene-core.so.1 >>> No symbol table info available. >>> #3 0x00007f1653f325e3 in lucene::index::IndexWriter::doFlush(bool) () >>> from /usr/lib/libclucene-core.so.1 >>> No symbol table info available. >>> #4 0x00007f1653f32f9c in lucene::index::IndexWriter::flush(bool, bool) >>> () from /usr/lib/libclucene-core.so.1 >>> No symbol table info available. >>> #5 0x00007f1653f33052 in >>> lucene::index::IndexWriter::closeInternal(bool) () from >>> /usr/lib/libclucene-core.so.1 >>> No symbol table info available. >>> #6 0x00007f1653f337c4 in lucene::index::IndexWriter::close(bool) () >>> from /usr/lib/libclucene-core.so.1 >>> No symbol table info available. >>> #7 0x00007f16541fc8c6 in lucene_index_build_deinit >>> (index=0x55945fe7a860) at lucene-wrapper.cc:646 >>> ret = 0 >>> #8 0x00007f16541f9ba4 in fts_backend_lucene_update_deinit >>> (_ctx=0x55945fe910a0) at fts-backend-lucene.c:259 >>> ctx = 0x55945fe910a0 >>> backend = 0x55945fe7a760 >>> ret = 0 >>> __func__ = "fts_backend_lucene_update_deinit" >>> #9 0x00007f1654649baa in fts_backend_update_deinit >>> (_ctx=0x55945fe7a480) at fts-api.c:144 >>> ctx = 0x55945fe910a0 >>> backend = 0x55945fe7a760 >>> ret = 0 >>> #10 0x00007f1654655d83 in fts_transaction_end (t=0x55945fe88820, >>> error_r=0x7fff1b28d1b8) at fts-storage.c:613 >>> ft = 0x55945fe84400 >>> flist = 0x55945fe7a3a0 >>> ret = 0 >>> __func__ = "fts_transaction_end" >>> #11 0x00007f16546562e8 in fts_transaction_commit (t=0x55945fe88820, >>> changes_r=0x7fff1b28d2c0) at fts-storage.c:696 >>> ft = 0x55945fe84400 >>> --Type <RET> for more, q to quit, c to continue without paging--c >>> fbox = 0x55945fe7f798 >>> box = 0x55945fe7efa8 >>> autoindex = false >>> ret = 0 >>> error = 0x7f1654813665 <p_strdup_vprintf+90> >>> "H\213E\350\353\067\213E\344\211\302H\213E\330H\211\326H\211\307\350\301\367\377\377H\211E\360\213E\344\203\350\001\211\302H\213M\350H\213E\360H\211\316H\211\307\350\302\357\354\377H\213E\360H\213M\370dH3\f%(" >>> #12 0x00007f165469b83e in quota_mailbox_transaction_commit >>> (ctx=0x55945fe88820, changes_r=0x7fff1b28d2c0) at quota-storage.c:174 >>> qbox = 0x55945fe7f600 >>> qt = 0x55945fe8f160 >>> __func__ = "quota_mailbox_transaction_commit" >>> #13 0x00007f16548fefad in mailbox_transaction_commit_get_changes >>> (_t=0x7fff1b28d348, changes_r=0x7fff1b28d2c0) at mail-storage.c:2215 >>> _data_stack_cur_id = 4 >>> t = 0x55945fe88820 >>> box = 0x55945fe7efa8 >>> save_count = 0 >>> ret = 21908 >>> __func__ = "mailbox_transaction_commit_get_changes" >>> #14 0x00007f16548feefc in mailbox_transaction_commit (t=0x7fff1b28d348) >>> at mail-storage.c:2197 >>> changes = {pool = 0x0, uid_validity = 1418718114, saved_uids = >>> {arr = {buffer = 0x7f16548e5d9e <__x86_return_thunk+5>, element_size = >>> 140733649048408}, v = 0x7f16548e5d9e <__x86_return_thunk+5>, >>> v_modifiable = 0x7f16548e5d9e <__x86_return_thunk+5>}, >>> ignored_modseq_changes = 455660384, changes_mask = (unknown: 0), >>> no_read_perm = false} >>> ret = 21908 >>> #15 0x000055945e4abdcb in index_mailbox_precache (conn=0x55945fe57440, >>> box=0x55945fe7efa8) at master-connection.c:126 >>> storage = 0x55945fe77f28 >>> username = 0x55945fe73ee0 "[email protected]" >>> box_vname = 0x55945fe7f310 "Sent" >>> errstr = 0x7f165469cb53 <quota_mailbox_sync_deinit+164> >>> "\213E\344H\203\304([]\351\320f\377\377UH\211\345H\203\354\060H\211}\330dH\213\004%(" >>> error = MAIL_ERROR_TEMP >>> status = {messages = 10, recent = 0, unseen = 0, uidvalidity = >>> 1545140227, uidnext = 11, first_unseen_seq = 0, first_recent_uid = 3, >>> last_cached_seq = 1, highest_modseq = 0, highest_pvt_modseq = 0, >>> keywords = 0x0, permanent_flags = 0, flags = 0, permanent_keywords = >>> false, allow_new_keywords = false, nonpermanent_modseqs = false, >>> no_modseq_tracking = false, have_guids = true, have_save_guids = true, >>> have_only_guid128 = false} >>> trans = 0x0 >>> search_args = 0x0 >>> ctx = 0x0 >>> mail = 0x0 >>> metadata = {guid = '\000' <repeats 15 times>, virtual_size = 0, >>> physical_size = 0, first_save_date = 0, cache_fields = 0x0, >>> precache_fields = (MAIL_FETCH_STREAM_HEADER | MAIL_FETCH_STREAM_BODY | >>> MAIL_FETCH_RECEIVED_DATE | MAIL_FETCH_SAVE_DATE | >>> MAIL_FETCH_PHYSICAL_SIZE | MAIL_FETCH_VIRTUAL_SIZE | >>> MAIL_FETCH_UIDL_BACKEND | MAIL_FETCH_GUID | MAIL_FETCH_POP3_ORDER), >>> backend_ns_prefix = 0x0, backend_ns_type = (unknown: 0)} >>> seq = 2 >>> first_uid = 2 >>> last_uid = 10 >>> percentage_str = "\026\177\000" >>> counter = 9 >>> max = 9 >>> percentage = 32534 >>> percentage_sent = 0 >>> ret = 0 >>> uids = 0x55945fe48938 " (UIDs 2..10)" >>> #16 0x000055945e4ac1c1 in index_mailbox (conn=0x55945fe57440, >>> user=0x55945fe73dc8, mailbox=0x55945fe46058 "Sent", max_recent_msgs=0, >>> what=0x55945fe46070 "i") at master-connection.c:205 >>> ns = 0x55945fe77da0 >>> box = 0x55945fe7efa8 >>> status = {messages = 1608873868, recent = 0, unseen = >>> 1581972288, uidvalidity = 21908, uidnext = 455660880, first_unseen_seq = >>> 32767, first_recent_uid = 1608810272, last_cached_seq = 21908, >>> highest_modseq = 139733882519610, highest_pvt_modseq = 94095725476155, >>> keywords = 0x7fff1b28d540, permanent_flags = 0, flags = 0, >>> permanent_keywords = false, allow_new_keywords = false, >>> nonpermanent_modseqs = false, no_modseq_tracking = true, have_guids = >>> true, have_save_guids = false, have_only_guid128 = true} >>> path = 0x55945fe47f38 "/mail/[email protected]/.Sent" >>> errstr = 0x7f16546e403a <__x86_return_thunk+5> >>> "\363\220\017\256\350\353\371H\215d$\b\303\350\a" >>> error = 32534 >>> sync_flags = MAILBOX_SYNC_FLAG_FULL_READ >>> ret = 0 >>> #17 0x000055945e4ac3d8 in master_connection_input_line >>> (conn=0x55945fe57440, line=0x55945fe5778c >>> "[email protected]\tSent\t2HvXT2h9epPAqMsG\t0\ti") at master-connection.c:247 >>> args = 0x55945fe46078 >>> input = {parent_event = 0x0, module = 0x55945e4ad2ec "mail", >>> service = 0x55945e4ad2f1 "indexer-worker", username = 0x55945fe46048 >>> "[email protected]", session_id = 0x0, session_id_prefix = 0x55945fe4605d >>> "2HvXT2h9epPAqMsG", session_create_time = 0, local_ip = {family = 0, u = >>> {ip6 = {__in6_union = {__s6_addr = '\000' <repeats 15 times>, >>> __s6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __s6_addr32 = {0, 0, 0, 0}}}, >>> ip4 = {s_addr = 0}}}, remote_ip = {family = 0, u = {ip6 = {__in6_union = >>> {__s6_addr = '\000' <repeats 15 times>, __s6_addr16 = {0, 0, 0, 0, 0, 0, >>> 0, 0}, __s6_addr32 = {0, 0, 0, 0}}}, ip4 = {s_addr = 0}}}, local_port = >>> 0, remote_port = 0, userdb_fields = 0x0, flags_override_add = (unknown: >>> 0), flags_override_remove = (unknown: 0), no_userdb_lookup = false, >>> debug = false, conn_secured = false, conn_ssl_secured = false} >>> service_user = 0x55945fe59958 >>> user = 0x55945fe73dc8 >>> str = 0x7f16547f6231 <pool_system_free+28> >>> "\220\311\351\375\335\356\377UH\211\345H\203\354 >>> H\211}\370H\211u\360H\211U\350H\211M\340H\203", <incomplete sequence \340> >>> error = 0x0 >>> max_recent_msgs = 0 >>> ret = 32767 >>> #18 0x000055945e4ac58f in master_connection_input (conn=0x55945fe57440) >>> at master-connection.c:287 >>> _data_stack_cur_id = 3 >>> line = 0x55945fe5778c >>> "[email protected]\tSent\t2HvXT2h9epPAqMsG\t0\ti" >>> ret = 32534 >>> #19 0x00007f16547e73c8 in io_loop_call_io (io=0x55945fe57480) at >>> ioloop.c:698 >>> ioloop = 0x55945e4afca0 >>> t_id = 2 >>> __func__ = "io_loop_call_io" >>> #20 0x00007f16547ea409 in io_loop_handler_run_internal >>> (ioloop=0x55945e4afca0) at ioloop-epoll.c:221 >>> ctx = 0x7f1654b0dda0 >>> events = 0x55945fe548c0 >>> event = 0x55945fe548c0 >>> list = 0x55945fe57500 >>> io = 0x55945fe57480 >>> tv = {tv_sec = 2147483, tv_usec = 0} >>> events_count = 5 >>> msecs = -1 >>> ret = 1 >>> i = 0 >>> j = 0 >>> call = true >>> __func__ = "io_loop_handler_run_internal" >>> #21 0x00007f16547e7657 in io_loop_handler_run (ioloop=0x55945e4afca0) at >>> ioloop.c:750 >>> __func__ = "io_loop_handler_run" >>> #22 0x00007f16547e752e in io_loop_run (ioloop=0x55945e4afca0) at >>> ioloop.c:723 >>> __func__ = "io_loop_run" >>> #23 0x00007f165471b3d5 in master_service_run (service=0x55945e4afb40, >>> callback=0x55945e4ab5ed <client_connected>) at master-service.c:775 >>> No locals. >>> #24 0x000055945e4ab7ce in main (argc=1, argv=0x55945e4af8e0) at >>> indexer-worker.c:78 >>> service_flags = (MASTER_SERVICE_FLAG_KEEP_CONFIG_OPEN | >>> MASTER_SERVICE_FLAG_SEND_STATS) >>> storage_service_flags = (MAIL_STORAGE_SERVICE_FLAG_USERDB_LOOKUP >>> | MAIL_STORAGE_SERVICE_FLAG_TEMP_PRIV_DROP | >>> MAIL_STORAGE_SERVICE_FLAG_NO_IDLE_TIMEOUT) >>> c = -1 >>> >>> >>> On 12/19/18 8:15 PM, Tim Mohlmann via dovecot wrote: >>> >>> <Deleted due to message size limit> >>> >>> This unfortunately looks a lot like lucene problem.
Aki
