Sure, and there are many similar tools. Still, these apply to a
distributed environment (HPC, the cloud, whatever), not to your
laptop, which is what I was talking about.

On Thu, Mar 3, 2016 at 2:00 PM, W. Trevor King <[email protected]> wrote:
> On Thu, Mar 03, 2016 at 01:38:43PM -0700, Davide Del Vento wrote:
>> I know this is suboptimal, but I think that's the best you can do at
>> the moment (and that assumes that at least one dataset would fit in
>> your disk, which for climate datasets could be a generous
>> assumption).
>
> Depending on how you organize/access your data, IPFS [1] might be a
> good solution for distributing your data over multiple machines while
> still being able to easily access the subset you need from a single
> host.  For example, if your huge data is set up like
>
>   .
>   |-- 2014
>   |   `-- …
>   |-- 2015
>   |   `-- …
>   `-- 2016
>       `-- …
>
> IPFS would be good if you only needed one year at a time on the local
> disk.  It wouldn't be good if you needed January data across a range
> of years, unless someone had also set up an index by month:
>
>   .
>   |-- 01
>   |   `-- …
>   |-- 02
>   |   `-- …
>   |-- 03
>   …   `-- …
>
> The data is content-addressable, so 2014/01/some-data (via the first
> indexing scheme) and 01/2014/some-data (via the second indexing
> scheme) would both use the same local object for the ‘some-data’ leaf.
>
> And while there are plans to build Git-like version control onto IPFS,
> I don't think anyone has gotten around to that yet.  With the current
> version, you get immutable Merkle hashes that uniquely identify your
> data [2], but you don't have commit objects linking those snapshots
> together.
>
> Anyhow, IPFS is still pretty new and fluxy, so I wouldn't trust it as
> the sole location of important data, but folks who are bumping up
> against data management issues might want to give it a spin.
>
> Cheers,
> Trevor
>
> [1]: https://ipfs.io/
> [2]: https://en.wikipedia.org/wiki/Merkle_tree
>
> --
> This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
> For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
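
To make the content-addressing point in the quoted message concrete,
here is a minimal sketch in Python.  It is a toy, not IPFS: it assumes
plain SHA-256 hex digests as keys, a dict as the object store, and JSON
listings as directory objects, none of which is IPFS's actual format.
The names put, put_tree and resolve and the payload are made up for
illustration.  It only shows why the 'some-data' leaf is stored once
even when it is reachable through both a by-year and a by-month index.

```python
import hashlib
import json


def put(store, data):
    """Store bytes under the hex SHA-256 of their content; return that key."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data  # identical content always maps to the same key
    return key


def put_tree(store, tree):
    """Store a nested dict of {name: bytes-or-dict}; return the root key."""
    entries = {}
    for name, node in tree.items():
        if isinstance(node, dict):
            entries[name] = put_tree(store, node)
        else:
            entries[name] = put(store, node)
    # A directory is itself an object: a sorted name -> child-key listing,
    # so its key depends on the keys of everything below it (Merkle-style).
    listing = json.dumps(entries, sort_keys=True).encode("utf-8")
    return put(store, listing)


def resolve(store, root, path):
    """Walk a slash-separated path from a root key; return the leaf's key."""
    key = root
    for part in path.split("/"):
        key = json.loads(store[key])[part]
    return key


store = {}
some_data = b"placeholder measurements"  # hypothetical payload

# Two index layouts over the same leaf, as in the quoted example.
by_year = put_tree(store, {"2014": {"01": {"some-data": some_data}}})
by_month = put_tree(store, {"01": {"2014": {"some-data": some_data}}})

leaf_a = resolve(store, by_year, "2014/01/some-data")
leaf_b = resolve(store, by_month, "01/2014/some-data")

print(leaf_a == leaf_b)     # both paths reach the same stored object
print(by_year == by_month)  # the two roots are different snapshots
```

In this toy the two root keys identify two immutable snapshots, but
nothing records how one snapshot relates to another, which is the
missing "commit object" piece mentioned in the quoted message.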

_______________________________________________
Discuss mailing list
[email protected]
http://lists.software-carpentry.org/mailman/listinfo/discuss_lists.software-carpentry.org
