On Tue, May 26, 2015 at 11:15 PM, Julian Foad <julianf...@gmail.com> wrote:
> Index: subversion/libsvn_fs_fs/transaction.c > =================================================================== > --- subversion/libsvn_fs_fs/transaction.c (revision 1681856) > +++ subversion/libsvn_fs_fs/transaction.c (working copy) > @@ -2243,12 +2243,14 @@ (representation_t **old_re > const char *file_name > = path_txn_sha1(fs, &rep->txn_id, rep->sha1_digest, scratch_pool); > > /* in our txn, is there a rep file named with the wanted SHA1? > If so, read it and use that rep. > */ > + /* ### Would it be faster (and so better) to just try reading it, > + and handle ENOENT, instead of first checking for presence? */ > SVN_ERR(svn_io_check_path(file_name, &kind, scratch_pool)); > if (kind == svn_node_file) > { > svn_stringbuf_t *rep_string; > SVN_ERR(svn_stringbuf_from_file2(&rep_string, file_name, > scratch_pool)); > This checks for duplicate representations within the same txn. It mainly finds them in a-typical or infrequent scenarios like: * setting the same property list on multiple nodes * committing the same contents to multiple branches in the same txn * loading a non-incremental dump file I personally prefer checking for "preconditions" over just trying and then dissecting various error cases. Apart from that style issue, frequently constructing error codes can be expensive (when messages are localized). @@ -2262,14 +2264,20 @@ get_shared_rep(representation_t **old_re > > /* A simple guard against general rep-cache induced corruption. */ > if ((*old_rep)->expanded_size != rep->expanded_size) > { > /* Make the problem show up in the server log. > > - Because not sharing reps is always a save option, > + Because not sharing reps is always a safe option, > terminating the request would be inappropriate. > Oops. Fixed in r1681949. + > + ### [JAF] Why should we assume the cache is corrupt rather than > the > + new rep is corrupt? Is this really the logic we want? > Hm. We don't say "the cache is corrupt" but log a warning with a "FS corruption" error code and tell the user that there is an inconsistency / mismatch. It is simply our experience so far that it is more likely that the rep-cache.db contents is at fault rather than the rep we just calculated the sha1 for. > + > + Should we do something more forceful -- flag the cache > as > + unusable, perhaps -- rather than just logging a > warning? > */ > I don't have a particularly strong opinion on that one. I guess it is very unlikely that the mismatch has "legitimately" been caused by e.g. a SHA1 collision. Maybe, we should reset / clear the rep-cache.db at that point and say so in the warning. -- Stefan^2.