On Tue, Nov 18, 2014 at 05:30:48PM -0800, Preston Holmes wrote:
> I thought of the SWC audience when I recently came across…

On a tangential note: I mentioned IPFS [1] on #sciencelab a few weeks
ago, but this seems like a good time to mention it here.  IPFS aims
to provide Git-like content-addressable storage for everything.  It's
still a work in progress, but the features that should make it useful
for large datasets are:

* It gives you the option to break large files into lists of blobs and
  other lists [2,3], so you can deduplicate chunks that stay stable
  between versions (using the algorithm of your choice to chunk the
  file [3]).
* It's distributed, so you can host the large files (or chunks of
  them) somewhere with more space, and only check out what you need
  locally.
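
To make those two points concrete, here is a rough Python sketch of
the general idea.  It is not IPFS's actual object format or API: the
fixed 4-byte chunk size, the SHA-256 keys, the plain dict standing in
for a store, and all the function names are my own assumptions for
illustration.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, just for illustration


def chunk(data, size=CHUNK_SIZE):
    """Split data into fixed-size pieces (a stand-in for any chunker)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def add_file(store, data):
    """Store each chunk under its own hash; return the list of keys."""
    keys = []
    for piece in chunk(data):
        key = hashlib.sha256(piece).hexdigest()
        store[key] = piece  # identical chunks land on the same key
        keys.append(key)
    return keys


def read_file(store, keys):
    """Reassemble a file from its ordered list of chunk keys."""
    return b"".join(store[key] for key in keys)


def missing(local, keys):
    """Keys that would have to be fetched from somewhere else."""
    return {key for key in keys if key not in local}


# Hypothetical "remote" host with plenty of space.
remote = {}
v1 = add_file(remote, b"AAAABBBBCCCC")
v2 = add_file(remote, b"AAAABBBBDDDD")  # only the last chunk changed

# A local checkout that already holds every chunk of v1.
local = {key: remote[key] for key in v1}
to_fetch = missing(local, v2)
```

Here the two versions share their first two chunks, so `remote` holds
four chunks rather than six, and moving the local copy from v1 to v2
only needs the one chunk in `to_fetch`.  Fixed-size chunking only
deduplicates well when edits don't shift byte offsets; a
content-defined chunker would slot into `chunk` for files where data
is inserted or removed.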

Cheers,
Trevor

[1]: http://ipfs.io/
[2]: https://github.com/jbenet/ipfs/blob/f05ac1b41666b0134298fd1fe56d95c02a6ddb11/papers/ipfs-cap2pfs/ipfs-cap2pfs.tex#L748
[3]: https://github.com/jbenet/ipfs/blob/f05ac1b41666b0134298fd1fe56d95c02a6ddb11/papers/ipfs-cap2pfs/ipfs-cap2pfs.tex#L916

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
