Ulrich Windl wrote:
> >>> Martin Raiber <martin(a)urbackup.org> wrote on 19.01.2021 at 19:26 in
> >>> message
> >>> <010201771be5823e-1014e8bb-0f79-4506-b27e-d08d25400b1e-000000(a)eu-west-1.amazonse.com>:
> > On 27.10.2020 18:52 Howard Chu wrote:
> > >>>> 3. LMDB causes crashes if database is corrupted
> > >>> You can enable per-page checksums in LMDB 1.0, in which case you'll
> > >>> just get an error code if a page is corrupted (and the checksum fails
> > >>> to match). The DB will still be unusable if anything is corrupted.
> > >> That would fix the problem properly. Does it check that it is the
> > >> correct transaction as well (e.g. by putting a transid into the page
> > >> like btrfs)? Returning wrong results or MDB_CORRUPTED is something my
> > >> application can handle (but not crashes obviously).
> > >>
> > >> The txnid is part of the page header, which is one of the incompatible
> > >> format changes from LMDB 0.9.
> > >> This is what allows us to eliminate the dirty bit.
> > >>
> > >> Not sure what you mean about the txnid being correct or not, but
> > >> certainly it is included in the checksum.
> >
> > A common problem that e.g. btrfs users encounter is that a disk drops
> If "a disk drops some writes" it's definitely not a problem of BtrFS. Do
> you mean "BtrFS drops some writes"?
> I don't get it.
>
I meant that the disk drops some writes, and btrfs users notice because
btrfs does this checksumming + transid check (and then ask online for help
with their now-broken btrfs, because it doesn't have good repair tools).
See here for a btrfs user testing disks for this problem:
https://lore.kernel.org/linux-btrfs/[email protected]/T/

W.r.t. LMDB in e.g. LDAP you could say "don't use broken disks then". But
for a general-purpose database (with checksumming) it would be nice to
have.

> > some writes. If there was a page at the same location previously the
> > checksum check succeeds. But btrfs stores the transid of the page in the
> > page's parent, so it compares that as well (The error message is "btrfs
> > parent transid verify failed on OFFSET wanted TRANSID found TRANSID"). I
> > think ZFS stores the checksum of the page in the page's parent as well
> > (idk if this would work with LMDB's B-tree).
> >
> > I guess a simple check (that might already exist) is checking if page
> > transid<=root/meta page transid. But that doesn't catch the cases where
> > the root page was updated, but updates to other pages were dropped (the
> > disk might also drop a complete "transaction" but not report an error,
> > in which case the next transaction then writes the root page pointing to
> > an incompletely written tree).
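
To make the dropped-write case concrete, here is a rough Python sketch of
the three checks discussed in this thread. It is only a toy model: the
Page class, the dict standing in for the disk, and all function names are
made up for illustration and are not LMDB or btrfs APIs. It assumes the
page checksum covers the txnid and that the parent records the txnid it
expects its child to carry.

```python
import zlib
from dataclasses import dataclass


def _crc(txnid: int, payload: bytes) -> int:
    # Hypothetical page checksum: CRC32 over the txnid followed by the payload.
    return zlib.crc32(txnid.to_bytes(8, "little") + payload)


@dataclass
class Page:
    txnid: int      # transaction that wrote this page
    payload: bytes
    checksum: int   # stored alongside the page

    @staticmethod
    def make(txnid: int, payload: bytes) -> "Page":
        return Page(txnid, payload, _crc(txnid, payload))


def checksum_ok(page: Page) -> bool:
    # Per-page checksum: only proves the page is internally consistent.
    return page.checksum == _crc(page.txnid, page.payload)


def bounded_by_meta(page: Page, meta_txnid: int) -> bool:
    # The "simple check": page txnid must not exceed the meta page txnid.
    return page.txnid <= meta_txnid


def parent_txnid_ok(page: Page, wanted_txnid: int) -> bool:
    # btrfs-style check: the parent says which txnid the child must have.
    return page.txnid == wanted_txnid


# Toy "disk": page number 7 was last written by transaction 41.
disk = {7: Page.make(txnid=41, payload=b"old contents")}

# Transaction 42 rewrites page 7 and updates the parent and the meta page,
# but the disk silently drops the write to page 7, so the old page remains.
wanted_txnid = 42   # what the parent recorded for its child
meta_txnid = 42     # what the meta page says

stale = disk[7]
print(checksum_ok(stale))                    # True: old page is self-consistent
print(bounded_by_meta(stale, meta_txnid))    # True: 41 <= 42
print(parent_txnid_ok(stale, wanted_txnid))  # False: wanted 42, found 41
```

Only the last check notices the stale page, which is the point of storing
the expected transid (or checksum) in the parent rather than only in the
page itself.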
