michael schuster wrote:
Roland Rambau wrote:
gang,

actually a simpler version of that idea would be a "zcp":

if I just cp a file, I know that all blocks of the new file
will be duplicates; so the cp could take full advantage for
the dedup without a need to check/read/write any actual data

I think they call it 'ln' ;-) and that even works on ufs.

Michael
+1

More and more it sounds like an optimization that will either

A. not add much over dedup

or

B. have value only in specific situations - and completely misbehave in other situations (even the same situations after passage of time)

Why not just make a special-purpose application (completely user-land) for it? I know, 'ln' is a distant relative of this idea, but 'ln' is POSIX and people know what to expect. What you'd practically need to do is whip up a vfs layer that exposes the underlying blocks of a filesystem and possibly names them by their SHA256 or MD5 hash. Then you'd need (another?) vfs abstraction that allows 'virtual' files to be assembled from these blocks in multiple independent chains.

I know there is already a fuse implementation of the first vfs driver (the name escapes me, but I think it was something like chunkfs[1]) and one could at least whip up a reasonable read-only proof of concept of the second part.
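
To make the two layers concrete, here is a rough user-land sketch of the idea, without any vfs or fuse involved. It is only an illustration under my own assumptions: GNU coreutils (split, sha256sum), a fixed 4 KiB block size, and the made-up names store_file and assemble_file; none of this is how chunkfs or ZFS actually does it. The first function plays the role of the block layer (blocks named by their SHA256 hash), the second assembles a 'virtual' file from a chain of hashes.

```shell
#!/bin/sh
# Sketch only: a user-land, content-addressed block store.
# Assumptions (not from the thread): GNU coreutils (split, sha256sum),
# a 4 KiB block size, and the hypothetical names store_file/assemble_file.
set -eu

BLOCK_SIZE=4096

# store_file STORE FILE MANIFEST
# Splits FILE into fixed-size blocks, saves each block in STORE under its
# SHA256 hash (identical blocks collapse into one copy), and writes the
# ordered list of hashes (the "chain") to MANIFEST.
store_file() {
    store=$1; file=$2; manifest=$3
    tmp=$(mktemp -d)
    mkdir -p "$store"
    : > "$manifest"
    # -a 6 gives fixed-width suffixes, so the glob below sorts in block order
    split -a 6 -b "$BLOCK_SIZE" "$file" "$tmp/blk."
    for blk in "$tmp"/blk.*; do
        [ -e "$blk" ] || continue   # empty input produces no blocks
        hash=$(sha256sum "$blk" | cut -d' ' -f1)
        [ -e "$store/$hash" ] || cp "$blk" "$store/$hash"
        echo "$hash" >> "$manifest"
    done
    rm -rf "$tmp"
}

# assemble_file STORE MANIFEST
# Streams the 'virtual' file to stdout by chaining blocks in manifest order.
assemble_file() {
    store=$1; manifest=$2
    while read -r hash; do
        cat "$store/$hash"
    done < "$manifest"
}
```

In this toy model the "zcp" from the quoted mail is just a copy of the manifest: after store_file /tmp/store big.img big.manifest, running cp big.manifest copy.manifest creates a second, independent chain over the same blocks without reading or writing any data blocks, and assemble_file /tmp/store copy.manifest reproduces the original contents.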

The reason _I_ wouldn't do that is that I'm already happy with e.g.:

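   mkfifo /var/run/my_part_collector
   # writer: re-feed the concatenated parts each time a reader opens the fifo
   (while true; do cat /local/data/my_part_* > /var/run/my_part_collector; done) &
   # reader: sees the parts as one continuous file
   wc -l /var/run/my_part_collector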

The equivalent of this could be (better) expressed in C, Perl or any language of your choice. I believe this is all POSIX.

[1] The reason this exists is obviously for backup and synchronization implementations: it makes it possible to back up files using rsync when the encryption key is not available to the backup process (with an ECB-mode crypto algorithm), and it should make it 'simple' to synchronize one's large monolithic files with e.g. Amazon S3 cloud storage, etc.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
