Sure, and there are plenty of similar tools. But they all apply to a distributed environment (HPC, the cloud, whatever), not your laptop, which is what I was talking about.
On Thu, Mar 3, 2016 at 2:00 PM, W. Trevor King <[email protected]> wrote:
> On Thu, Mar 03, 2016 at 01:38:43PM -0700, Davide Del Vento wrote:
>> I know this is suboptimal, but I think that's the best you can do at
>> the moment (and that assumes that at least one dataset would fit in
>> your disk, which for climate datasets could be a generous
>> assumption).
>
> Depending on how you organize/access your data, IPFS [1] might be a
> good solution for distributing your data over multiple machines while
> still being able to easily access the subset you need from a single
> host.  For example, if your huge data is set up like
>
>   .
>   |-- 2014
>   |   `-- …
>   |-- 2015
>   |   `-- …
>   `-- 2016
>       `-- …
>
> IPFS would be good if you only needed one year at a time on the local
> disk.  It wouldn't be good if you needed January data across a range
> of years, unless someone had also set up an index by month:
>
>   .
>   |-- 01
>   |   `-- …
>   |-- 02
>   |   `-- …
>   |-- 03
>   …   `-- …
>
> The data is content-addressable, so 2014/01/some-data (via the first
> indexing scheme) and 01/2014/some-data (via the second indexing
> scheme) would both use the same local object for the ‘some-data’ leaf.
>
> And while there are plans to build Git-like version control onto IPFS,
> I don't think anyone has gotten around to that yet.  With the current
> version, you get immutable Merkle hashes that uniquely identify your
> data [2], but you don't have commit objects linking those snapshots
> together.
>
> Anyhow, IPFS is still pretty new and fluxy, so I wouldn't trust it as
> the sole location of important data, but folks who are bumping up
> against data management issues might want to give it a spin.
>
> Cheers,
> Trevor
>
> [1]: https://ipfs.io/
> [2]: https://en.wikipedia.org/wiki/Merkle_tree
>
> --
> This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
> For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

_______________________________________________
Discuss mailing list
[email protected]
http://lists.software-carpentry.org/mailman/listinfo/discuss_lists.software-carpentry.org
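
For anyone curious about the content-addressing point in the quoted message (the same ‘some-data’ leaf reachable through both a year-first and a month-first index), here is a minimal sketch of the idea. It is not IPFS's actual API or storage format: the ContentStore class, the plain SHA-256 keys, and the sample bytes are all assumptions made up for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical, not IPFS):
    each object is keyed by the SHA-256 digest of its bytes."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical bytes always hash to the same key, so a second
        # put() of the same data does not create a second object.
        self.objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self.objects[key]


store = ContentStore()
some_data = b"hypothetical January 2014 measurements"  # made-up payload

# Two independent indexes over the same leaf: year-first and month-first.
by_year = {"2014/01/some-data": store.put(some_data)}
by_month = {"01/2014/some-data": store.put(some_data)}

# Both paths resolve to the same key, and only one object is stored.
same_object = by_year["2014/01/some-data"] == by_month["01/2014/some-data"]
print(same_object, len(store.objects))
```

Adding the month-first index costs only the extra path-to-hash entries, not a second copy of the data, which is why a second indexing scheme is cheap once the leaves are already stored.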
