https://bugs.openldap.org/show_bug.cgi?id=10114
Issue ID: 10114
Summary: Crash in mdb_copy with stale transactions(?)
Product: LMDB
Version: 0.9.30
Hardware: x86_64
OS: Linux
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
I have an LMDB database that is damaged in some way. I'm not sure exactly how,
but the application that created it (KDE baloo_file) crashes on startup while
trying to read it, with a backtrace pointing inside liblmdb:
#0 __memcpy_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:837
#1 0x00007fbf1fa110b6 in mdb_page_touch (mc=mc@entry=0x7ffe8dc1adf0) at mdb.c:2502
#2 0x00007fbf1fa12c9c in mdb_cursor_touch (mc=mc@entry=0x7ffe8dc1adf0) at mdb.c:6563
#3 0x00007fbf1fa16228 in mdb_cursor_put (mc=mc@entry=0x7ffe8dc1adf0, key=key@entry=0x7ffe8dc1b1e0, data=data@entry=0x7ffe8dc1b1f0, flags=<optimized out>, flags@entry=0) at mdb.c:6697
#4 0x00007fbf1fa18d51 in mdb_put (txn=0x55986d167a70, dbi=<optimized out>, key=0x7ffe8dc1b1e0, data=0x7ffe8dc1b1f0, flags=0) at mdb.c:9076
#5 0x00007fbf1fcec44b in Baloo::PostingDB::put (this=this@entry=0x7ffe8dc1b2d0, term=..., list=...) at /usr/src/debug/kde-frameworks/baloo-5.110.0/baloo-5.110.0/src/engine/postingdb.cpp:66
If I try to mdb_dump the database (with nothing else trying to access it), I get:
mdb_dump: index: MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
That sounds like the sort of thing that ought to be cleared by mdb_copy -c, but
instead that command also crashes inside __memcpy_avx_unaligned_erms.
Backtrace:
#0 __memcpy_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:839
#1 0x0000555555557e67 in mdb_env_cwalk (my=my@entry=0x7fffffffdbc0, pg=pg@entry=0x7fffffffd988, flags=0) at mdb.c:9264
#2 0x0000555555557fdf in mdb_env_cwalk (my=my@entry=0x7fffffffdbc0, pg=pg@entry=0x7fffffffdb90, flags=flags@entry=0) at mdb.c:9306
#3 0x0000555555558523 in mdb_env_copyfd1 (env=0x55555556a2a0, fd=<optimized out>) at mdb.c:9469
#4 0x00005555555588c9 in mdb_env_copy2 (env=0x55555556a2a0, path=<optimized out>, flags=flags@entry=1) at mdb.c:9623
#5 0x0000555555558ea6 in main (argc=3, argv=0x7fffffffe008) at mdb_copy.c:74
I tried to poke at the offending data structure a little but I didn't
immediately see what was wrong...
(gdb) frame 1
#1 0x0000555555557e67 in mdb_env_cwalk (my=my@entry=0x7fffffffdbc0, pg=pg@entry=0x7fffffffd988, flags=0) at mdb.c:9264
9264 mdb_page_copy(leaf, mp, my->mc_env->me_psize);
(gdb) p mp
$1 = (MDB_page *) 0x7fc008d32000
(gdb) p *mp
$2 = {mp_p = {p_pgno = 0x0606060606060606, p_next = 0x0606060606060606}, mp_pad = 1542, mp_flags = 1542, mp_pb = {pb = {pb_lower = 1542, pb_upper = 18832}, pb_pages = 1234175494}, mp_ptrs = 0x7fc008d32010}
... except that those values for p_pgno and p_next don't look terribly
plausible to me.
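For what it's worth, the decimal fields in that dump decode to the same
pattern: 1542 is 0x0606, and pb_pages (1234175494) is 0x49900606, which is
just pb_lower and pb_upper overlaid by the union on little-endian x86_64. So
the first 14 bytes of the page header appear to be 0x06, and only pb_upper
(0x4990) differs. The arithmetic can be checked with a small sketch; the helper
names below are made up for illustration and are not part of liblmdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the low `width` bytes of v are all the
 * same byte, and stores that byte in *fill; returns 0 otherwise. */
int repeated_byte(uint64_t v, size_t width, uint8_t *fill)
{
    assert(width >= 1 && width <= 8);
    uint8_t first = (uint8_t)(v & 0xff);
    for (size_t i = 1; i < width; i++) {
        if ((uint8_t)((v >> (8 * i)) & 0xff) != first)
            return 0;
    }
    *fill = first;
    return 1;
}

/* Hypothetical helper: the 32-bit value seen when two adjacent 16-bit fields
 * (lower first in memory) are read as one little-endian 32-bit field. */
uint32_t overlay_u32(uint16_t lower, uint16_t upper)
{
    return ((uint32_t)upper << 16) | lower;
}
```

Calling repeated_byte on p_pgno (width 8) and on 1542 (width 2) reports the
fill byte 0x06 in both cases, and overlay_u32(1542, 18832) gives back the
pb_pages value from the dump. This only describes what the bytes look like; it
says nothing about how they got there.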
The database file is, unfortunately, much too large to attach here (2.3G
uncompressed, 383M compressed with xz -17) and also it's, well, a full-text
index of everything I have on my computer, so I'd be hesitant to attach it even
if it fit. I can make it available for private download if that would be
helpful. I'm also happy to do other experiments.
I realize that crashes caused by database corruption can be very difficult to
avoid but I hope there might be some kind of easy defensive measure to take in
this particular case which could at least allow the application to fail cleanly
rather than crashing.
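To make the kind of defensive measure I have in mind concrete, here is a rough
sketch of a header sanity check that could run before a page is copied. It is
not LMDB code: page_hdr is a simplified stand-in for MDB_page, the function and
return codes are hypothetical, and it assumes that the caller knows which page
number it asked for, the highest page number in use, and the page size (4096
in my examples). It also ignores overflow pages, which use pb_pages rather
than pb_lower/pb_upper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical page header; NOT LMDB's actual MDB_page. */
typedef struct {
    uint64_t pgno;   /* page number stored in the page itself */
    uint16_t pad;
    uint16_t flags;
    uint16_t lower;  /* assumed: offset of the start of free space */
    uint16_t upper;  /* assumed: offset of the end of free space */
} page_hdr;

/* Hypothetical result codes for this sketch. */
enum { HDR_OK = 0, HDR_BAD_PGNO = 1, HDR_BAD_BOUNDS = 2 };

/* Reject a header whose page number or free-space bounds cannot be valid,
 * so the caller can return an error instead of copying from it. */
int page_hdr_check(const page_hdr *h, uint64_t expected_pgno,
                   uint64_t last_pgno, size_t page_size)
{
    assert(h != NULL);
    /* The stored page number should match the page that was requested
     * and must not lie beyond the last page in use. */
    if (h->pgno != expected_pgno || h->pgno > last_pgno)
        return HDR_BAD_PGNO;
    /* Free space must start after the header, must not be inverted,
     * and must end inside the page. */
    if (h->lower < sizeof(page_hdr) || h->lower > h->upper ||
        h->upper > page_size)
        return HDR_BAD_BOUNDS;
    return HDR_OK;
}
```

With the header values from my gdb session, this check fails on the page
number alone, and pb_upper = 18832 would also be rejected for a 4096-byte
page. Whether a check like this is cheap enough, or belongs in mdb_env_cwalk
at all, is of course for the maintainers to judge.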