On Tue, Nov 08, 2005 at 04:18:40PM +0100, Juliusz Chroboczek wrote: > The trouble with that is that Darcs plays the commutation game, which > invalidates any form of end-to-end hash. The obvious solution would > be to compute hashes in a minimal context, but I'm not sure whether > that would be computationally feasible.
I think I see. You mean to store the patches as always, but to compute (and check) their hashes in a minimal context? That's a clever idea. Somehow I've been caught up on the "signed patch bundle" idea, which requires a new repository format, and didn't think of the idea of storing patches in their current format, but commuting them into another context for hashing. To create a unique hash for a patch, ideally we'd also want to be sure to hash some sort of canonical patch representation. I'm thinking in terms of the repo-format ideas, that we'd need to ensure that for each patch type there's a canonical hash. It should cost at worst O(N^2) to verify a single patch (assuming no exponentially expensive commutes) where N is the number of patches in the repository. That makes it O(N^3) to verify a repository. Doesn't sound particularly pleasant, but if commutation and patch-parsing were made fast enough, it could be feasible, especially since we wouldn't often need to verify an entire repository, and many patches won't cost anything close to O(N^2) to verify, so the prefactor may be small. One could also store a set of hashes for each patch, starting with a minimal-context hash, but then followed by (optionally) a set of hashes in the context of specific tags. This could reduce the worst-case cost of verifying a given patch to O(n^2) where where n is the number of patches between tags, but wouldn't give us a unique hash for each patch. The unique hash is quite appealing, but if it's not feasible, it's infeasible. To be practical, the tag-context hashes would probably need to be combined with a unique-hash scheme for tags, which requires that you have unique hashes for each of the patches in the tag, which I think means your minimal-context hash. (It also requires sorting the patches in the tag, but that's trivial.) So I can only imagine tag-context hashes as an optimization for certain work-flows, in that it could speed up verified pulls considerably, but wouldn't help at all with the cost of recording a hashed patch. But recording a unique-hashed patch is "only" O(N^2) anyways... -- David Roundy http://www.darcs.net _______________________________________________ darcs-users mailing list [email protected] http://www.abridgegame.org/mailman/listinfo/darcs-users
