[Issue 10114] New: Crash in mdb_copy with stale transactions(?)

openldap-its Mon, 16 Oct 2023 10:55:41 -0700

https://bugs.openldap.org/show_bug.cgi?id=10114


          Issue ID: 10114
           Summary: Crash in mdb_copy with stale transactions(?)
           Product: LMDB
           Version: 0.9.30
          Hardware: x86_64
                OS: Linux
            Status: UNCONFIRMED
          Keywords: needs_review
          Severity: normal
          Priority: ---
         Component: liblmdb
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

I have an LMDB database that is damaged in some way; I'm not sure exactly how.
The application that created it (KDE baloo_file) crashes on startup while
trying to read it, with a backtrace pointing inside liblmdb:

#0  __memcpy_avx_unaligned_erms () at
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:837
#1  0x00007fbf1fa110b6 in mdb_page_touch (mc=mc@entry=0x7ffe8dc1adf0) at
mdb.c:2502
#2  0x00007fbf1fa12c9c in mdb_cursor_touch (mc=mc@entry=0x7ffe8dc1adf0) at
mdb.c:6563
#3  0x00007fbf1fa16228 in mdb_cursor_put (mc=mc@entry=0x7ffe8dc1adf0,
key=key@entry=0x7ffe8dc1b1e0, data=data@entry=0x7ffe8dc1b1f0, flags=<optimized
out>, flags@entry=0) at mdb.c:6697
#4  0x00007fbf1fa18d51 in mdb_put (txn=0x55986d167a70, dbi=<optimized out>,
key=0x7ffe8dc1b1e0, data=0x7ffe8dc1b1f0, flags=0) at mdb.c:9076
#5  0x00007fbf1fcec44b in Baloo::PostingDB::put (this=this@entry=0x7ffe8dc1b2d0,
term=..., list=...) at
/usr/src/debug/kde-frameworks/baloo-5.110.0/baloo-5.110.0/src/engine/postingdb.cpp:66

If I try to mdb_dump the database (with nothing else trying to access it) I get

mdb_dump: index: MDB_BAD_TXN: Transaction must abort, has a child, or is
invalid

That sounds like the sort of thing that ought to be cleared by mdb_copy -c, but
instead that command also crashes inside __memcpy_avx_unaligned_erms. 
Backtrace:

#0  __memcpy_avx_unaligned_erms () at
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:839
#1  0x0000555555557e67 in mdb_env_cwalk (my=my@entry=0x7fffffffdbc0,
pg=pg@entry=0x7fffffffd988, flags=0) at mdb.c:9264
#2  0x0000555555557fdf in mdb_env_cwalk (my=my@entry=0x7fffffffdbc0,
pg=pg@entry=0x7fffffffdb90, flags=flags@entry=0) at mdb.c:9306
#3  0x0000555555558523 in mdb_env_copyfd1 (env=0x55555556a2a0, fd=<optimized
out>) at mdb.c:9469
#4  0x00005555555588c9 in mdb_env_copy2 (env=0x55555556a2a0, path=<optimized
out>, flags=flags@entry=1) at mdb.c:9623
#5  0x0000555555558ea6 in main (argc=3, argv=0x7fffffffe008) at mdb_copy.c:74

I tried to poke at the offending data structure a little but I didn't
immediately see what was wrong...

(gdb) frame 1
#1  0x0000555555557e67 in mdb_env_cwalk (my=my@entry=0x7fffffffdbc0,
pg=pg@entry=0x7fffffffd988, flags=0) at mdb.c:9264
9264            mdb_page_copy(leaf, mp, my->mc_env->me_psize);

(gdb) p mp
$1 = (MDB_page *) 0x7fc008d32000
(gdb) p *mp
$2 = {mp_p = {p_pgno = 0x0606060606060606, p_next = 0x0606060606060606}, mp_pad
= 1542, mp_flags = 1542, mp_pb = {pb = {pb_lower = 1542, pb_upper = 18832},
pb_pages = 1234175494}, mp_ptrs = 0x7fc008d32010}

... except that those values for p_pgno and p_next don't look terribly
plausible to me.
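
One thing the printed values do show is a byte pattern. The following is a
small sketch, not taken from the LMDB source, that rebuilds the 16 header bytes
from the gdb output above. It assumes a little-endian x86_64 layout of 8 bytes
for p_pgno/p_next followed by four 16-bit fields; put_le, header_from_gdb,
count_bytes and dump_header are hypothetical helper names used only here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Store the low n bytes of v at out in little-endian order. */
static void put_le(unsigned char *out, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (unsigned char)(v >> (8 * i));
}

/* Rebuild the header bytes from the values gdb printed for *mp.
 * The field offsets are an assumption about the struct layout. */
static void header_from_gdb(unsigned char hdr[16])
{
    put_le(hdr + 0,  0x0606060606060606ULL, 8); /* p_pgno / p_next */
    put_le(hdr + 8,  1542, 2);                  /* mp_pad   */
    put_le(hdr + 10, 1542, 2);                  /* mp_flags */
    put_le(hdr + 12, 1542, 2);                  /* pb_lower */
    put_le(hdr + 14, 18832, 2);                 /* pb_upper */
}

/* Count how many of the first n bytes of buf equal value. */
static size_t count_bytes(const unsigned char *buf, size_t n,
                          unsigned char value)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (buf[i] == value)
            hits++;
    return hits;
}

/* Print the reconstructed header as space-separated hex bytes. */
static void dump_header(void)
{
    unsigned char hdr[16];
    header_from_gdb(hdr);
    for (size_t i = 0; i < 16; i++)
        printf("%02x ", hdr[i]);
    printf("\n");
}
```

Since 1542 is 0x0606 and 18832 is 0x4990, fourteen of the sixteen bytes come
out as 0x06 and the last two as 0x90 0x49. That reads more like the start of
the page having been overwritten with unrelated data than like a page header,
though it says nothing about how it got that way.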

The database file is, unfortunately, much too large to attach here (2.3G
uncompressed, 383M compressed with xz -17) and also it's, well, a full-text
index of everything I have on my computer, so I'd be hesitant to attach it even
if it fit.  I can make it available for private download if that would be
helpful.  I'm also happy to do other experiments.

I realize that crashes caused by database corruption can be very difficult to
avoid, but I hope there might be some kind of easy defensive measure to take in
this particular case that would at least allow the application to fail cleanly
rather than crash.
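
To make the kind of defensive measure meant here concrete, the following is a
minimal sketch of a header plausibility check that could run before a page is
copied. It is not LMDB code: struct hyp_page_hdr, hyp_page_plausible and
HYP_PAGEHDRSZ are hypothetical names, the 16-byte header size and the rule
that a page stores its own page number are assumptions, and the 4096-byte page
size used below is a guess, since the report does not state me_psize.

```c
#include <stdbool.h>
#include <stdint.h>

#define HYP_PAGEHDRSZ 16u /* assumed size of the page header in bytes */

/* Hypothetical, simplified view of the header fields printed by gdb. */
struct hyp_page_hdr {
    uint64_t pgno;   /* page number stored in the page */
    uint16_t pad;
    uint16_t flags;
    uint16_t lower;  /* lower bound of free space */
    uint16_t upper;  /* upper bound of free space */
};

/* Return true only if the header could describe a page of psize bytes
 * that was read from page number expected_pgno. */
static bool hyp_page_plausible(const struct hyp_page_hdr *h,
                               uint64_t expected_pgno, uint32_t psize)
{
    if (h->pgno != expected_pgno)
        return false;               /* page does not identify as itself */
    if (h->lower < HYP_PAGEHDRSZ)
        return false;               /* free space would overlap the header */
    if (h->lower > h->upper)
        return false;               /* bounds are inverted */
    if (h->upper > psize)
        return false;               /* free space ends beyond the page */
    return true;
}
```

With the values shown above, pb_upper = 18832 is larger than an assumed
4096-byte page, so this check would reject the page. If the copy length is
derived from the page size minus pb_upper in unsigned arithmetic (an
assumption about the copy routine, not something verified here), the result
would wrap to a very large number, which would be consistent with the fault
inside memcpy. A caller that saw the check fail could return an error such as
MDB_CORRUPTED instead of copying.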

