danie...@apache.org wrote on Tue, Feb 28, 2012 at 06:23:29 -0000:
> Author: danielsh
> Date: Tue Feb 28 06:23:28 2012
> New Revision: 1294479
> 
> URL: http://svn.apache.org/viewvc?rev=1294479&view=rev
> Log:
> Check for issue #4129 (bogus predecessor fields in the root node-revision) in
> 'svnadmin verify'.
> 
> Notes:
> 
> - This check precedes all the other non-backend-specific check.  A subsequent
>   revision will address that.
> 
> - Interaction with FSFS caches: see comment within.
> 
...
> @@ -7919,6 +7923,48 @@ svn_fs_fs__verify(svn_fs_t *fs,
>                                            cancel_func, cancel_baton,
>                                            pool));
>  
> +  /* Issue #4129: bogus pred-counts on the root node-rev. */
> +  {
> +    svn_revnum_t i;
> +    int predecessor_predecessor_count = -1;
> +
> +    for (i = 0; i <= youngest; i++)
> +      {
> +        /* ### Caching.
> +
> +           svn_fs_fs__rev_get_root() consults caches, which in verify we
> +           don't want.  But we can't easily bypass that, as
> +           svn_fs_revision_root()+svn_fs_node_id()'s implementation uses
> +           svn_fs_fs__rev_get_root() too.
> +
> +           ### A future revision will make fs_verify() disable caches when it
> +           ### opens ffd.

Having had another look at this, I'm not sure to what extent that is
necessary.

---

All caches live in fs_fs_data_t.  The cache prefix embeds the UUID and
the OS path to the filesystem root (/srv/svn/repos/calc/db, or so) (see
svn_fs_fs__initialize_caches()).

The cache types are three: 'inprocess', 'membuffer', or 'memcache'.  The
first two are per-process, but the third need not be.

The net result of all this is that verify() cannot be sure that, if it
got a cached data, it is a cache of information computed from the
repository it is interrogating: in memcache situations the information
may have been installed in the cache by another memcache client (running
on a different box), and even in in-process situations the following
scenario would invalidate the cache:

    % svnadmin dump repos > backup.dump
    % vi repos
    % svnadmin is-repos-corrupt repos
    yes
    % svnadmin create repos2
    % svnadmin load -q repos2 < backup.dump
    % mv repos rm && mv repos2 repos

(And no, I don't think it's contrived.  I imagine that's a fairly common
thing to do --- to restore a backup repository to its original name.)

---

So, I'm debating between:

- Disable all caches in svn_fs_verify().

- Disable memcached caches in svn_fs_verify().

- None of the above, but document that users of svn_fs_verify() should
  disable caches.  (And do that in libsvn_repos and/or 'svnadmin verify',
  too.)

Thoughts?

> +         */
> +        svn_fs_id_t *root_id;
> +        node_revision_t *root_noderev;
> +
> +        if ((i % 100) == 0) /* uneducated guess */
> +          svn_pool_clear(iterpool);
> +
> +        /* Fetch ROOT_NODEREV. */
> +        SVN_ERR(svn_fs_fs__rev_get_root(&root_id, fs, i, iterpool));
> +        SVN_ERR(svn_fs_fs__get_node_revision(&root_noderev, fs, root_id,
> +                                             iterpool));
> +
> +        /* Check correctness. (Compare validate_root_noderev().) */
> +        if (1+predecessor_predecessor_count != 
> root_noderev->predecessor_count)
> +          return svn_error_createf(SVN_ERR_FS_CORRUPT, NULL,
> +                                   _("predecessor count for "
> +                                     "the root node-revision is wrong: "
> +                                     "r%ld has %d, but r%ld has %d"),
> +                                   i, root_noderev->predecessor_count,
> +                                   i-1, predecessor_predecessor_count);
> +
> +        predecessor_predecessor_count = root_noderev->predecessor_count;
> +      }
> +  }
> +
> +  svn_pool_destroy(iterpool);
>    return SVN_NO_ERROR;
>  }
>  
> 
> 

Reply via email to