On Mon, Dec 04, 2006 at 04:34:48PM +0000, Simon Marlow wrote:
> David Roundy wrote:
>
> >I've been working hard on getting support for the new hashed inventory
> >format into good shape.  If you aren't familiar with the benefits of the
> >new format (which I've talked about with at least some of you in person),
> >suffice to say that I see it as a precursor to working out the new way of
> >dealing with conflicts.
>
> As an interested bystander, I'd really like to hear a brief description of
> what a "hashed inventory" is, and what benefits it brings.  Not a 12-page
> paper, just a quick outline will do fine, I don't want to distract you from
> the hacking frenzy :)

A hashed inventory is a modification of the darcs repository format, which
essentially replaces the _darcs/inventory file (which is human-readable, if
not human-modifiable, so if you're not familiar with it, you could take a
look) with a _darcs/hashed_inventory file.  The difference is that a hash of
the contents of each patch is stored, along with the identifier of the
patch, as is currently stored.  This hash is then used as the filename in
_darcs/patches/.

This has several benefits.  At the most obvious level, we've now got some
extra information for checking the consistency of a repository (helpful if,
e.g., an HTTP proxy modifies files in transit).

The next advantage is that by cryptographically signing the hashed
inventory, you cryptographically sign the entire contents of the repository
(unless someone cracks SHA-1).  This is potentially valuable to high-profile
projects, or projects that use untrusted mirrors.

Next, because the filename for patches now depends on patch contents, all
darcs commands will be atomic (except with respect to the pristine
cache--but atomic with respect to remote access), including those that
currently aren't, such as amend-record and obliterate.

With hashed inventories it will be possible to implement "lazy" partial
repositories, in which darcs downloads patch files as needed to do the
commands you ask, since we'll have the hash with which to verify that the
patch files haven't been commuted (and therefore are still in the proper
context for our use).

Finally, as I mentioned above, the refactoring for this change should help
with our plans for new conflict handling, which will probably require that
we break the current picture of one patch file per named patch (which
wouldn't work in the current scheme where the patch filename is determined
by the name of the patch).
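
To make the scheme above concrete, here is a rough sketch in Python.  It is
not darcs code and does not reproduce the real on-disk format: the function
names, the (identifier, hash) tuple used as an inventory entry, and the
sample patch text are all made up for illustration.  It only shows the three
ideas described above: the inventory records a SHA-1 of each patch's
contents next to its identifier, the hash doubles as the filename, and a
reader can re-hash a file to detect modification.

```python
import hashlib
import os
import tempfile


def patch_hash(contents: bytes) -> str:
    """Return the SHA-1 hex digest of a patch's contents."""
    return hashlib.sha1(contents).hexdigest()


def store_patch(patches_dir: str, patch_id: str, contents: bytes):
    """Write a patch under its content hash; return an inventory entry.

    The entry is a hypothetical (identifier, hash) pair, standing in for
    one record of a hashed inventory.
    """
    digest = patch_hash(contents)
    os.makedirs(patches_dir, exist_ok=True)
    # Write to a temporary file first, then rename into place, so a
    # concurrent reader never sees a half-written patch file.
    fd, tmp_path = tempfile.mkstemp(dir=patches_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(contents)
    os.replace(tmp_path, os.path.join(patches_dir, digest))
    return (patch_id, digest)


def verify_patch(patches_dir: str, entry) -> bool:
    """Re-hash the stored file and compare with the inventory entry."""
    _patch_id, digest = entry
    try:
        with open(os.path.join(patches_dir, digest), "rb") as handle:
            return patch_hash(handle.read()) == digest
    except FileNotFoundError:
        return False
```

Note that storing an amended patch under the same identifier produces a
different hash and therefore a different file, so the old file is never
overwritten.  In this sketch, that is the property that lets a reader holding
the old inventory keep working while an amend is in progress.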

-- 
David Roundy
Department of Physics
Oregon State University
