Daniel Shahaf wrote:
> Julian Foad wrote:
> > Hi Paul. I'm +1 on the concept that implementing content hashes in
> > Subversion would be useful. I think if we were designing Subversion today,
> > the question would be "Why on earth wouldn't we design in a Merkle tree
> > content hash?" as it is obviously (to those who have already thought about
> > it) useful for these sorts of operation, for people building functionality
> > on top of Subversion.
> I appreciate that that's your opinion, but I'm going to play devil's
> advocate and question it.
> The only operation one can do with a content hash is compare it to
> another content hash. Our API already has an object with this property:
> svn_fs_id_t. The equality relation of node-rev ids is a refinement of
> the equality relation of content hashes: equal node-rev ids imply equal
> content hashes, but the converse is not true.
> What would content hashes provide that comparing node-rev ids would not?
1. A node-rev id only exists for a tree that has been committed to the
repository: there is no way to generate a node-rev id for an external tree of
content client-side. Note what Paul Hammant wrote about the use case:
"I need to compare to a *local* representation of the same tree that's not under
subversion control"
2. As you point out, equal content does not imply equal node-rev ids. The large
doc-string above svn_fs_id_t says it this way:
"note: Commonly, a node revision will have the same content as some other node
revisions in the same node and in different nodes. [...]"
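To make both points concrete, here is a minimal sketch of a Merkle tree
content hash computed over a plain directory on disk. It is purely
illustrative: the function names (tree_hash, make_tree), the choice of SHA-1
and the byte layout fed to the hash are all assumptions of mine, not an
existing or proposed Subversion format or API.

```python
import hashlib
import os
import tempfile


def tree_hash(path):
    """Return a hex SHA-1 Merkle hash of the file or directory at `path`.

    Hypothetical scheme, for illustration only: a file hashes its bytes,
    a directory hashes the sorted (name, child hash) pairs of its entries.
    """
    h = hashlib.sha1()
    if os.path.isdir(path):
        h.update(b"dir\0")
        for name in sorted(os.listdir(path)):
            child = tree_hash(os.path.join(path, name))
            h.update(name.encode("utf-8") + b"\0" + child.encode("ascii") + b"\n")
    else:
        h.update(b"file\0")
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
    return h.hexdigest()


def make_tree(root, files):
    """Create files under `root` from a {relative path: bytes} mapping."""
    for rel, data in files.items():
        full = os.path.join(root, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as f:
            f.write(data)


if __name__ == "__main__":
    files = {"trunk/README": b"hello\n", "trunk/src/main.c": b"int x;\n"}
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        # Two unrelated trees with the same content: no shared history,
        # no repository involved, yet the hashes are comparable.
        make_tree(a, files)
        make_tree(b, files)
        print(tree_hash(a) == tree_hash(b))
        # Changing one file changes the hash of every ancestor directory.
        make_tree(b, {"trunk/README": b"changed\n"})
        print(tree_hash(a) == tree_hash(b))
```

The hash depends only on names and bytes, so it can be computed for a tree
that was never committed (point 1), and two trees with equal content get
equal hashes however they came to exist (point 2). A node-rev id offers
neither property.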
Thanks for questioning. That draws out some important points.
Regards,
- Julian