Re: Queries about rep cache: get_shared_rep()

Stefan Fuhrmann Wed, 27 May 2015 03:18:35 -0700

On Tue, May 26, 2015 at 11:15 PM, Julian Foad <[email protected]> wrote:


> Index: subversion/libsvn_fs_fs/transaction.c
> ===================================================================
> --- subversion/libsvn_fs_fs/transaction.c    (revision 1681856)
> +++ subversion/libsvn_fs_fs/transaction.c    (working copy)
> @@ -2243,12 +2243,14 @@ (representation_t **old_re
>        const char *file_name
>          = path_txn_sha1(fs, &rep->txn_id, rep->sha1_digest, scratch_pool);
>
>        /* in our txn, is there a rep file named with the wanted SHA1?
>           If so, read it and use that rep.
>         */
> +      /* ### Would it be faster (and so better) to just try reading it,
> +             and handle ENOENT, instead of first checking for presence? */
>        SVN_ERR(svn_io_check_path(file_name, &kind, scratch_pool));
>        if (kind == svn_node_file)
>          {
>            svn_stringbuf_t *rep_string;
>            SVN_ERR(svn_stringbuf_from_file2(&rep_string, file_name,
>                                             scratch_pool));
>

This checks for duplicate representations within the same txn.
It mainly finds them in a-typical or infrequent scenarios like:

* setting the same property list on multiple nodes
* committing the same contents to multiple branches in the
  same txn
* loading a non-incremental dump file

I personally prefer checking for "preconditions" over just trying
and then dissecting various error cases. Apart from that style
issue, frequently constructing error codes can be expensive
(when messages are localized).

@@ -2262,14 +2264,20 @@ get_shared_rep(representation_t **old_re
>
>    /* A simple guard against general rep-cache induced corruption. */
>    if ((*old_rep)->expanded_size != rep->expanded_size)
>      {
>        /* Make the problem show up in the server log.
>
> -         Because not sharing reps is always a save option,
> +         Because not sharing reps is always a safe option,
>           terminating the request would be inappropriate.
>

Oops. Fixed in r1681949.

+
> +         ### [JAF] Why should we assume the cache is corrupt rather than
> the
> +                   new rep is corrupt? Is this really the logic we want?
>

Hm. We don't say "the cache is corrupt" but log a warning with
a "FS corruption" error code and tell the user that there is an
inconsistency / mismatch.

It is simply our experience so far that it is more likely that
the rep-cache.db contents is at fault rather than the rep
we just calculated the sha1 for.


> +
> +                   Should we do something more forceful -- flag the cache
> as
> +                   unusable, perhaps -- rather than just logging a
> warning?
>         */
>

I don't have a particularly strong opinion on that one. I guess
it is very unlikely that the mismatch has "legitimately" been
caused by e.g. a SHA1 collision. Maybe, we should reset /
clear the rep-cache.db at that point and say so in the warning.

-- Stefan^2.

Re: Queries about rep cache: get_shared_rep()

Reply via email to