Package: release.debian.org
Severity: normal
Tags: bullseye
User: release.debian....@packages.debian.org
Usertags: pu
X-Debbugs-Cc: xapian-c...@packages.debian.org
Control: affects -1 + src:xapian-core

[ Reason ]
This is a targeted fix for a potential database corruption if switching
the new revision live fails with ENOSPC but the recovery process does NOT
get ENOSPC:

https://bugs.debian.org/1032398

It looks like all previous 1.4.x releases are affected. This is a
regression compared to Xapian 1.2.x (which was in jessie).

The fix here is taken from upstream's 1.4.22 release and is the simplest
way to address the problem: simply reread the current version file from
disk, which means the in-memory state will match the previously committed
state.

[ Impact ]
This can result in corrupted databases for users if their partition fills
up while indexing. It doesn't happen every time, but it's definitely been
hit by one notmuch user on Debian, and likely explains a smattering of
reports of database corruption over the years.

[ Tests ]
There's a pretty thorough upstream testsuite for xapian-core. The
specific scenario of ENOSPC isn't currently tested by that, but we've
exercised it by hand by injecting an ENOSPC failure using strace, which
reliably triggers corruption without the patch and no corruption with it.

The fix was included in the upstream 1.4.22 release, which was released
2023-02-02 (6 weeks ago) and uploaded to unstable the same day - there
haven't been any reports of issues with it.

[ Risks ]
This is a really low risk change. It only touches the code path taken
when a commit operation fails to write to disk, and does a more complete
reset of state in that case.

The rollback being done here is actually more complicated than necessary
(the multiple tables are now committed together atomically, but used to
be committed one-by-one which required extra care to roll-back). That's
been cleaned up on upstream git master, but I've gone for the simpler and
less invasive fix from upstream 1.4.x.

[ Checklist ]
  [*] *all* changes are documented in the d/changelog
  [*] I reviewed all changes and I approve them
  [*] attach debdiff against the package in stable
  [*] the issue is verified as fixed in unstable

[ Changes ]
The change switches from calling a function which attempts to roll-back
the in-memory state directly (but gets it wrong in this situation) to one
which resets the in-memory state to what it would be if the database was
opened afresh.

Cheers,
    Olly

diff -Nru xapian-core-1.4.18/debian/changelog xapian-core-1.4.18/debian/changelog
--- xapian-core-1.4.18/debian/changelog 2021-02-24 07:33:41.000000000 +1300
+++ xapian-core-1.4.18/debian/changelog 2023-03-17 11:20:07.000000000 +1300
@@ -1,3 +1,15 @@
+xapian-core (1.4.18-3+deb11u1) bullseye; urgency=medium
+
+  * debian/patches/fix-db-corruption-on-ENOSPC.patch: New patch to
+    fix potential database corruption if switching the new revision
+    live fails with ENOSPC but the recovery process does NOT get ENOSPC.
+    The fix here is taken from upstream's 1.4.22 release and is the simplest
+    way to address the problem: simply reread the current version file
+    from disk which means the in memory state will match the previously
+    committed state. Closes: #1032398
+
+ -- Olly Betts <o...@survex.com>  Fri, 17 Mar 2023 11:20:07 +1300
+
 xapian-core (1.4.18-3) unstable; urgency=medium
 
   * debian/rules: Workaround testcase sensitivity to excess precision by
diff -Nru xapian-core-1.4.18/debian/patches/fix-db-corruption-on-ENOSPC.patch xapian-core-1.4.18/debian/patches/fix-db-corruption-on-ENOSPC.patch
--- xapian-core-1.4.18/debian/patches/fix-db-corruption-on-ENOSPC.patch 1970-01-01 12:00:00.000000000 +1200
+++ xapian-core-1.4.18/debian/patches/fix-db-corruption-on-ENOSPC.patch 2023-03-17 11:20:07.000000000 +1300
@@ -0,0 +1,40 @@
+commit 90f7a35483b4cf7dd848c34634803bf28f95081d
+Author: Olly Betts <o...@survex.com>
+Date:   Wed Jan 25 11:40:44 2023 +1300
+
+    Fix recovery from failed commit
+
+    If renaming to switch the new version file live fails (e.g. due to
+    ENOSPC) we discard the changes, try to write and switch to a different
+    new version file with an increased revision (on failure of this too we
+    close the database), and throw DatabaseError.
+
+    Unfortunately the roll-back of state is not complete, and if switching
+    to the different new version file succeeds that bad state persists on
+    disk.
+
+    Thanks to Uwe Kleine-König for reporting and coming up with the idea
+    to reproduce using strace to inject a rename() failure - this is a
+    simple reproducer:
+
+    rm -rf enospc.db
+    strace -e inject=rename:error=ENOSPC:when=2 examples/simpleindex enospc.db < INSTALL
+    xapian-check enospc.db
+
+    No automated regression test for this yet as this doesn't trivially
+    fit into the existing testsuite framework, but we ought to have
+    tests using fault injection.
+
+    (cherry picked from commit 9f9aad17893bde4acb3a98e60dde397c346fcd9a)
+
+--- a/backends/glass/glass_database.cc
++++ b/backends/glass/glass_database.cc
+@@ -619,7 +619,7 @@
+             cancel();
+ 
+             // Reopen tables with old revision number.
+-            version_file.cancel();
++            version_file.read();
+             docdata_table.open(flags, version_file.get_root(Glass::DOCDATA), old_revision);
+             spelling_table.open(flags, version_file.get_root(Glass::SPELLING), old_revision);
+             synonym_table.open(flags, version_file.get_root(Glass::SYNONYM), old_revision);
diff -Nru xapian-core-1.4.18/debian/patches/series xapian-core-1.4.18/debian/patches/series
--- xapian-core-1.4.18/debian/patches/series 1970-01-01 12:00:00.000000000 +1200
+++ xapian-core-1.4.18/debian/patches/series 2023-03-17 11:20:07.000000000 +1300
@@ -0,0 +1 @@
+fix-db-corruption-on-ENOSPC.patch
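
For readers who want the idea behind the one-line change without reading the glass backend, here is a toy C++ sketch of the two roll-back strategies. It is only an illustration under stated assumptions: ToyVersionFile, stage(), cancel_incomplete() and consistent_after_failed_commit() are hypothetical names invented for this sketch, and the way the "incomplete" roll-back goes wrong here is made up for the example. It is not a description of what Xapian's real version_file.cancel() does internally.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a version file: maps table name -> root block number.
// Every name in this sketch is hypothetical; none of it is Xapian's real code.
using Roots = std::map<std::string, int>;

struct ToyVersionFile {
    Roots on_disk;               // last state that was successfully committed
    Roots in_memory;             // working copy, modified while preparing a commit
    int on_disk_revision = 0;
    int in_memory_revision = 0;

    // Stage a new root for a table as part of a pending commit.
    void stage(const std::string& table, int root) {
        in_memory[table] = root;
        in_memory_revision = on_disk_revision + 1;
    }

    // Hypothetical incomplete roll-back: resets the revision number but
    // forgets to restore the staged roots (an assumption made for this demo).
    void cancel_incomplete() {
        in_memory_revision = on_disk_revision;
    }

    // Roll-back by rereading: throw away the working copy and reload the
    // committed state, as if the database had just been opened afresh.
    void read() {
        in_memory = on_disk;
        in_memory_revision = on_disk_revision;
        assert(matches_disk());  // holds by construction
    }

    bool matches_disk() const {
        return in_memory == on_disk && in_memory_revision == on_disk_revision;
    }
};

// Simulates a commit that fails part-way (e.g. with ENOSPC) and reports
// whether the in-memory state matches the committed state afterwards.
bool consistent_after_failed_commit(bool reread) {
    ToyVersionFile v;
    v.on_disk = {{"docdata", 10}, {"spelling", 20}};
    v.read();
    v.stage("docdata", 11);      // commit in progress
    // ... switching the new version live fails at this point ...
    if (reread) {
        v.read();
    } else {
        v.cancel_incomplete();
    }
    return v.matches_disk();
}
```

In this toy model consistent_after_failed_commit(true) returns true and consistent_after_failed_commit(false) returns false. The point is that rereading cannot miss a field, because it does not try to undo individual in-memory changes one by one.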