On Tue, Nov 08, 2005 at 04:18:40PM +0100, Juliusz Chroboczek wrote:
> The trouble with that is that Darcs plays the commutation game, which
> invalidates any form of end-to-end hash.  The obvious solution would
> be to compute hashes in a minimal context, but I'm not sure whether
> that would be computationally feasible.

I think I see.  You mean to store the patches as always, but to compute
(and check) their hashes in a minimal context? That's a clever idea.
Somehow I've been caught up on the "signed patch bundle" idea, which
requires a new repository format, and didn't think of the idea of storing
patches in their current format, but commuting them into another context
for hashing.

To create a unique hash for a patch, ideally we'd also want to be sure to
hash some sort of canonical patch representation.  I'm thinking in terms of
the repo-format ideas, that we'd need to ensure that for each patch type
there's a canonical hash.

It should cost at worst O(N^2) to verify a single patch (assuming no
exponentially expensive commutes) where N is the number of patches in the
repository.  That makes it O(N^3) to verify a repository.  Doesn't sound
particularly pleasant, but if commutation and patch-parsing were made fast
enough, it could be feasible, especially since we wouldn't often need to
verify an entire repository, and many patches won't cost anything close to
O(N^2) to verify, so the prefactor may be small.

One could also store a set of hashes for each patch, starting with a
minimal-context hash, but then followed by (optionally) a set of hashes in
the context of specific tags.  This could reduce the worst-case cost of
verifying a given patch to O(n^2) where where n is the number of patches
between tags, but wouldn't give us a unique hash for each patch.  The
unique hash is quite appealing, but if it's not feasible, it's infeasible.

To be practical, the tag-context hashes would probably need to be combined
with a unique-hash scheme for tags, which requires that you have unique
hashes for each of the patches in the tag, which I think means your
minimal-context hash.  (It also requires sorting the patches in the tag,
but that's trivial.)  So I can only imagine tag-context hashes as an
optimization for certain work-flows, in that it could speed up verified
pulls considerably, but wouldn't help at all with the cost of recording a
hashed patch.  But recording a unique-hashed patch is "only" O(N^2)
anyways...
-- 
David Roundy
http://www.darcs.net

_______________________________________________
darcs-users mailing list
[email protected]
http://www.abridgegame.org/mailman/listinfo/darcs-users

Reply via email to