Dear Nico Williams: Thanks for the reference! Very cool.
What I would most want is for ZFS (and every other filesystem) to maintain a Merkle Tree over the file data with a good secure hash. Whenever a change to a file is made, the filesystem can update the Merkle Tree this with mere O(log(N)) work in the size of the file plus O(N) work in the size of the change. For a modern filesystem like ZFS which is already maintaining a checksum tree the *added* cost of maintaining the secure hash Merkle Tree could be minimal. Then, the filesystem should make this Merkle Tree available to applications through a simple query. This would enable applications—without needing any further in-filesystem code—to perform a Merkle Tree sync, which would range from "noticeably more efficient" to "dramatically more efficient" than rsync or zfs send. :-) Of course it is only more efficient because we're treating the maintenance of the secure-hash Merkle Tree as free. There are two senses in which this is legitimate and it is almost free: 1. Since the values get maintained persistently over the file's lifetime then the total computation required is approximately O(N) where N is the total size of all deltas that have been applied to the file in its life. (Let's just drop the logarithmic part for now, because see 2. below.) Compare this to the cost of doing a fast, insecure CRC over the whole file such as in rsync. The cost of that is O(N) * K where N is the (then current) size of the file and K is the number of times you run rsync on that file. The extreme case is if the file hasn't changed. Then for the application-level code to confirm that the file on this machine is the same as the file on that machine, it merely has to ask the filesystem for the root hash on each machine and transmit that root hash over the network. This is optimally fast compared to rsync, and unlike "zfs send|recv" it is optimally fast whenever the two files are identical even if they have both changed since the last time they were synced. 2. Since the modern, sophisticated filesystem like ZFS is maintaining a tree of checksums over the data *anyway* you can piggy-back this computation onto that work, avoiding any extra seeks and minimizing extra memory access. In fact, ZFS itself can actually use SHA-256 for the checksum tree, which would make it almost provide exactly what I want, except for: 2. a. From what I've read, nobody uses the SHA-256 configuration in ZFS because it is too computationally expensive, so they use an insecure checksum (fletcher2/4) instead. 2. b. I assume the shape of the resulting checksum tree is modified by artifacts of the ZFS layout instead of being a simple canonical shape. This is a show-stopper for this use case because if the same file data exists on a different system, and some software on that system computes a Merkle Tree over the data, it might come out with different hashes than the ZFS checksum tree, thus eliminating all of the performance benefits of this approach. But, if ZFS could be modified to fix these problems or if a new filesystem would add a feature of maintaining a canonical, reproducible Merkle Tree, then it might be extremely useful. Thanks to Brian Warner and Dan Shoutis for discussions about this idea. Regards, Zooko _______________________________________________ cryptography mailing list cryptography@randombit.net http://lists.randombit.net/mailman/listinfo/cryptography