On Friday, October 14, 2016 at 1:00:35 PM UTC-4, Páll Haraldsson wrote:

> On Thursday, October 13, 2016 at 7:49:51 PM UTC, cdm wrote:
>>
>> from CloudArray.jl:
>>
>> "If you are dealing with big data, i.e., your RAM memory is not enough
>> to store your data, you can create a CloudArray from a file."
>>
>> https://github.com/gsd-ufal/CloudArray.jl#creating-a-cloudarray-from-a-file
>
> Good to know, and seems cool (like CatViews.jl). Indexes could need to be
> bigger than 32-bit this way, even for 2D.
>
> But has anyone worked with arrays of more than 70 terabytes, which would
> otherwise have been a limitation?
In my previous life, yes, all the time. They were n-dimensional sorted associative array structures, stored on disk and cached in shared memory so that all processes could access them simultaneously (but that was in Caché ObjectScript, not Julia). This is something we (Dynactionize.com, the Belgian startup I'm at) are working on making easy to do in Julia, using Aerospike and Ceph to handle storing the data in a distributed fashion on SSDs and disk, but it may be a while before we reach in Julia the sorts of scale I'd been accustomed to with data in Caché (handling healthcare data, bank and insurance company data, sensor data, all sorts of different applications, even the ESA satellite data).

> Anyone know the biggest (or just big, over 2 GB) one-dimensional array
> people are working with?
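For anyone who hasn't used Caché globals: the n-dimensional sorted associative arrays I mentioned can be sketched as a mapping from tuples of subscripts to values, traversable in subscript order. This is just a toy in-memory model (the `Global` type and `ordered` function are mine, invented for illustration); the real thing is disk-backed, shared across processes, and far more sophisticated.

```julia
# Toy model of a Caché-style "global": an n-dimensional sorted
# associative array, keyed by tuples of subscripts of mixed types.
struct Global
    data::Dict{Tuple,Any}
end
Global() = Global(Dict{Tuple,Any}())

# Variadic indexing: g[s1, s2, ...] addresses one node of the tree.
Base.setindex!(g::Global, v, subs...) = (g.data[subs] = v)
Base.getindex(g::Global, subs...) = g.data[subs]

# Keys in sorted (lexicographic) order, analogous to walking a
# global with $ORDER in Caché ObjectScript.
ordered(g::Global) = sort!(collect(keys(g.data)))

g = Global()
g["patient", 42, "name"] = "Alice"
g["patient", 7, "name"]  = "Bob"
for k in ordered(g)
    println(k, " => ", g[k...])
end
```

Sorted traversal over tuple keys is what makes range scans and ordered iteration cheap; on disk the same idea is typically realized as a B-tree keyed on the concatenated subscripts.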