On Mon, 2017-10-16 at 13:58 +0200, Hallvard Breien Furuseth wrote:
> On 16. okt. 2017 12:51, Howard Chu wrote:
> > timur.kris...@gmail.com wrote:
> > > I have an app that uses LMDB, and I've experienced an interesting
> > > issue: when trying to delete a certain item with mdb_cursor_del, it
> > > crashed with the following backtrace: https://pastebin.com/7p9wtkj9
>
> Weird backtrace. It says mdb_page_dirty(), which is small, stretches
> over 300+ lines (frames #3-#4). And mdb_page_alloc() alone has no
> hex address for prefix. Maybe miscompilation, two liblmdb libraries
> linked into the same executable, or something like that? Or some
> wild pointer write or whatever messed things up.

Not sure what was going on there, maybe -O3 messed it up. Still, the
issue does appear with -O0 too and here is a backtrace with -O0:
https://pastebin.com/SfeMMEPH

> > Most likely the dirty list is too big, which means you're trying
> > to do too much in a single transaction.
>
> Shouldn't happen though. The txn should have failed earlier with
> MDB_TXN_FULL.
>
> Which also shouldn't happen since LMDB should have spilled enough
> pages to make room - unless you have hundreds of cursors at modified
> pages so LMDB can't spill enough.
>
> But we should probably test LMDB with impractically tight dirty-list
> arrays (i.e. a very small MDB_IDL_UM_MAX), so LMDB keeps running into
> such cases.

I've taken a look at the value of rc (see my reply to Howard), and it
seems to me that Леонид Юрьев's assessment may be correct here. rc is -1
which indicates that the page (even though newly allocated, maybe a
reused page?) is already on the txn's dirty pages list.

- Timur
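
P.S. To make the rc == -1 case concrete, here is a minimal sketch of a sorted dirty-page list whose insert refuses duplicates. It is a simplified model for illustration only, not LMDB's actual code: the names dirty_list, dirty_search, dirty_insert and DIRTY_MAX are hypothetical, and the -2 "list full" return value is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a txn's dirty list: page numbers
 * kept in ascending order. Names and capacity are illustrative only. */
#define DIRTY_MAX 4

typedef struct {
    size_t count;            /* number of entries in use */
    size_t pgno[DIRTY_MAX];  /* sorted page numbers */
} dirty_list;

/* Binary search: index of the first entry >= pgno (count if none). */
static size_t dirty_search(const dirty_list *dl, size_t pgno)
{
    size_t lo = 0, hi = dl->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (dl->pgno[mid] < pgno)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Insert pgno, keeping the list sorted.
 * Returns 0 on success, -1 if pgno is already present,
 * -2 if the list is full (assumed convention for this sketch). */
static int dirty_insert(dirty_list *dl, size_t pgno)
{
    size_t i, x;

    assert(dl != NULL);
    x = dirty_search(dl, pgno);

    if (x < dl->count && dl->pgno[x] == pgno)
        return -1;  /* duplicate: page is already marked dirty */
    if (dl->count >= DIRTY_MAX)
        return -2;  /* no room left */

    /* Shift the tail up by one slot and place the new entry. */
    for (i = dl->count; i > x; i--)
        dl->pgno[i] = dl->pgno[i - 1];
    dl->pgno[x] = pgno;
    dl->count++;
    return 0;
}
```

In this model, a freshly allocated page whose number is already in the list (for example a reused page) makes the insert return -1, which is the situation I believe I'm hitting.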