Re: Shelve & checkpoint - next steps

Julian Foad Thu, 10 May 2018 04:33:08 -0700

Julian Foad wrote on 2018-04-16:

Next steps for shelve & checkpoint.
* change the storage so we can shelve (large) binary files
* API abstraction to access shelf data
* store and retrieve base revisions
* more complete testing -- we should have a way of testing all possible kinds of change

=== Storage for binary files ===
I made the shelve function walk the WC itself (r1829291), so we can intercept binary files at that point and do something other than diffing them.

From r1829295, for binary files it uses git diff binary literal format. This works as a stop-gap, but is inefficient for large files.

Soon I will change what happens at that interception point. It will store a 'binary' file by copying the working version into a directory structure that parallels the WC directory structure, inside '.svn/shelves/<shelf-name-encoded>-<version>.d/', instead of storing a (git binary literal) diff in the patch file.

The file's properties can continue to go into the patch file as a property-diff, for the time being.


That should be fast enough for use with very large files.
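
To make the planned layout concrete, here is a minimal sketch in Python (not the actual C implementation). The function names are hypothetical, and the shelf name is assumed to arrive already encoded, since the encoding is not described here.

```python
import shutil
from pathlib import Path


def shelf_storage_dir(wc_root: Path, encoded_shelf_name: str, version: int) -> Path:
    """Return '.svn/shelves/<shelf-name-encoded>-<version>.d' under the WC root.

    Hypothetical helper: the caller supplies the already-encoded shelf name.
    """
    return wc_root / ".svn" / "shelves" / f"{encoded_shelf_name}-{version}.d"


def store_binary_file(wc_root: Path, relpath: str,
                      encoded_shelf_name: str, version: int) -> Path:
    """Copy the working version of a file into the shelf's parallel tree."""
    source = wc_root / relpath
    # The target keeps the same relative path as in the WC.
    target = shelf_storage_dir(wc_root, encoded_shelf_name, version) / relpath
    target.parent.mkdir(parents=True, exist_ok=True)
    # Whole-file copy: no diff is computed, so cost scales with file size only.
    shutil.copyfile(source, target)
    return target
```

Called as store_binary_file(wc_root, "images/logo.png", "foo", 1), this would place the copy at '.svn/shelves/foo-1.d/images/logo.png'. Properties are not handled in this sketch; they would still go into the patch file.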

(And what does being classed as a 'binary' file mean, semantically? It means when a modified binary file is shelved and later re-applied to the WC, the modifications will not be merged, not even in the 'patch' sense, but instead the file will be copied as a whole, in the same way that 'svn update' and 'svn merge' handle a binary file.)

I have implemented the basics of this. I haven't finished the part that detects a conflict when unshelving. When that's done I'll commit.

Ideally shelves will be able to share the WC pristine store for storing whole file contents. [...]

=== API abstraction ===
We need libsvn_client APIs to be able to access shelves in the same way as "regular" WC data: export|diff|cat|propget|... for data stored in any shelf. The result of any such API operating on a shelf should be analogous to how the same function would operate on the WC if we first unshelved the change.

Why do we need generic APIs to support these kinds of functions, we might ask? It's not because the user necessarily needs all these operations, but to make programming sane. It should be possible to write a conceptually simple high-level operation such as "copy all the changes found in this WC subtree to this shelf" by setting up the source and destination objects and then invoking a common "copy a tree of changes" routine, not by writing a new deep implementation of all the guts specifically for this source-and-destination pair.
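
As a rough illustration of that idea, here is a toy sketch in Python rather than C. All the names (ChangeSource, ChangeSink, InMemoryTree, copy_changes) are hypothetical and are not taken from the 'tree-api' branch; the point is only that the copy routine knows nothing about what kind of object is at either end.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Optional, Tuple


class ChangeSource(ABC):
    """Anything changes can be read from (hypothetical interface)."""

    @abstractmethod
    def changes(self) -> Iterator[Tuple[str, bytes]]:
        """Yield (relative path, new content) for each changed file."""


class ChangeSink(ABC):
    """Anything changes can be written to (hypothetical interface)."""

    @abstractmethod
    def put(self, relpath: str, content: bytes) -> None:
        """Record the new content for one path."""


class InMemoryTree(ChangeSource, ChangeSink):
    """Toy stand-in for either a WC subtree or a shelf."""

    def __init__(self, files: Optional[Dict[str, bytes]] = None) -> None:
        self.files: Dict[str, bytes] = dict(files or {})

    def changes(self) -> Iterator[Tuple[str, bytes]]:
        yield from sorted(self.files.items())

    def put(self, relpath: str, content: bytes) -> None:
        self.files[relpath] = content


def copy_changes(source: ChangeSource, dest: ChangeSink) -> int:
    """The common "copy a tree of changes" routine; returns the count copied."""
    count = 0
    for relpath, content in source.changes():
        dest.put(relpath, content)
        count += 1
    return count
```

With this shape, "WC subtree to shelf" and "shelf to WC" would be the same call with the arguments swapped. Deletions, properties and directories are left out of the sketch.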

I have started working on such APIs in the 'tree-api' branch (recently resurrected from my years-old 'tree-read-api' branch).

A possible starting point, currently implemented on the 'shelve-checkpoint' branch, is to modify svn_opt_revision_t and the revision-number parsing to accept a shelf name as another kind of revision specifier. This (and the other revprop functions) works so far:
   $ svn propget -r foo --revprop svn:log
   This is the log message of shelf 'foo'.


=== Store and retrieve base revisions ===
Storing the revision number metadata is easy. Svn diff format has always written the base revision of each file in the diff header. The recent 'svn info --viewspec' prototype now provides a way to write a complete description of the revisions and 'shape' (depth and switch settings) of a WC.

Reading it out is a SMOP. Doing something with it -- that is, doing a 3-way-merge instead of a 'patch' operation -- is conceptually a SMOP but probably more involved.

Snapshotting the actual content of the base is much more involved if we intend to keep this snapshot attached to each shelf even though the user runs 'update'. In order to decide whether it is important to do so, I suggest we implement making use of just the revision number metadata and test its performance -- accepting that either repository access or fallback to plain patching would be needed in cases where 'update' has been done.
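
For the "reading it out" part, a minimal Python sketch might look like the following. It assumes the usual 'svn diff' header layout, where the old-file line reads "--- <path><TAB>(revision N)"; the function name is hypothetical, and a real parser would track hunk boundaries so that a removed content line could not be mistaken for a header.

```python
import re
from typing import Dict

# Assumed header layout: "--- <path><TAB>(revision <N>)".
_OLD_FILE_LINE = re.compile(r"^--- (?P<path>.+?)\t\(revision (?P<rev>\d+)\)$")


def base_revisions(patch_text: str) -> Dict[str, int]:
    """Map each path in a patch to the base revision named in its header."""
    revs: Dict[str, int] = {}
    for line in patch_text.splitlines():
        match = _OLD_FILE_LINE.match(line)
        if match:
            revs[match.group("path")] = int(match.group("rev"))
    return revs
```

The resulting path-to-revision map is what a 3-way merge on unshelve would need in order to fetch the right base text for each file.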
Glad to hear any thoughts.

- Julian
