On 23.05.2015 13:07, Daniel Shahaf wrote: > Evgeny Kotkov wrote on Thu, May 21, 2015 at 18:23:22 +0300: >> SVN_ERR(svn_repos_verify_fs2(repos, 1, 5, NULL, NULL, NULL, NULL, pool)); >> >> ...would return two different errors depending on the binaries being used, >> assuming that one of the revisions in the [r1:r5] range is corrupted: >> >> (With 1.8.13) E160004: Checksum mismatch in item at offset 0 of >> length 59 bytes in file path/asf/db/revs/0/2 >> >> (With 1.9.0-rc1) E165011: Repository 'path/asf' failed to verify > ... >> Thoughts? > I think the incumbent code doesn't use SVN_ERR_REPOS_CORRUPT (E165011) > correctly. > > The keep-going implementation simply uses SVN_ERR_REPOS_CORRUPT to wrap > whatever error the FS layer reported. That's wrong, for two reasons: > > - The SVN_ERR_REPOS_CORRUPT code should be used to indicate corruption > of repos-layer on-disk data (such as $REPOS_DIR/db not existing or > $REPOS_DIR/format being a directory). The code does not look for such > conditions. In fact, since the repos layer got around to calling into > the FS layer, the repos layer itself is almost certainly integral. > > - The repos layer has no business second-guessing the FS layer's choice > of error code. For example, a permission error on a rev file could > happen and does not indicate corruption, or a user might have pressed > ^C during svn_fs_verify()'s execution. The incumbent code assumes > that any non-SVN_ERR_CANCELLED code indicates a corruption; that > assumption is wrong. > > So, the keep-going code should stop using SVN_ERR_REPOS_CORRUPT to > indiscriminately wrap svn_fs_verify()'s error code. That should make > the E160004 (SVN_ERR_FS_CORRUPT) error visible, addressing Evgeny's > issue. (I am almost sure that's exactly what Evgeny's patch does.) > > Furthermore, I think the "if (found_corruption)" block at the end of > svn_repos_verify_fs3() should also avoid using SVN_ERR_REPOS_CORRUPT, > since that would be the wrong code to use in some circumstances (for > example, if all revision files had permission errors). The code should > either say what it _does_ know for a fact — "N out of M revisions failed > to verify" — or start inspecting the FS errors to determine whether they > indicate corruption or transient errors (which isn't always possible, > since the repos layer does not know the context the error was raised in). > > In fact, since the aforementioned two uses of SVN_ERR_REPOS_CORRUPT are > the only two uses of that code, I conclude that I am of the opinion that > that error code should be deleted entirely from the 1.9 codebase. We > can always revive it in 1.10 if needed.
I completely agree with your analysis and proposal. -- Brane