On Friday, October 14, 2016 at 1:00:35 PM UTC-4, Páll Haraldsson wrote:

> On Thursday, October 13, 2016 at 7:49:51 PM UTC, cdm wrote:
>>
>> from CloudArray.jl:
>>
>> "If you are dealing with big data, i.e., your RAM memory is not enough
>> to store your data, you can create a CloudArray from a file."
>>
>> https://github.com/gsd-ufal/CloudArray.jl#creating-a-cloudarray-from-a-file
>
> Good to know, and seems cool (like CatViews.jl). Indexes could need to be
> bigger than 32-bit this way, even for 2D.
>
> But has anyone worked with arrays of more than 70 terabytes, which would
> otherwise have been a limitation?
In my previous life, yes, all the time. They were n-dimensional sorted associative array structures, stored on disk and cached in shared memory so that all processes could access them simultaneously (but that was in Caché ObjectScript, not Julia). This is something we (Dynactionize.com, the Belgian startup I'm at) are working on making easy to do in Julia, using Aerospike and Ceph to handle storing the data in a distributed fashion on SSDs and disk, but it may be a while before we reach in Julia the sorts of scale I'd been accustomed to with data in Caché (handling healthcare data, bank and insurance company data, sensor data, all sorts of different applications, even the ESA satellite data).

> Anyone know the biggest (or just big, over 2 GB) one-dimensional array
> people are working with?
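For anyone who hasn't used Caché globals: the n-dimensional sorted associative arrays I mentioned can be sketched as a mapping from tuples of subscripts to values, traversable in subscript order. This is just a toy in-memory model (the `Global` type and `ordered` function are mine, invented for illustration); the real thing is disk-backed, shared across processes, and far more sophisticated.

```julia
# Toy model of a Caché-style "global": an n-dimensional sorted
# associative array, keyed by tuples of subscripts of mixed types.
struct Global
    data::Dict{Tuple,Any}
end
Global() = Global(Dict{Tuple,Any}())

# Variadic indexing: g[s1, s2, ...] addresses one node of the tree.
Base.setindex!(g::Global, v, subs...) = (g.data[subs] = v)
Base.getindex(g::Global, subs...) = g.data[subs]

# Keys in sorted (lexicographic) order, analogous to walking a
# global with $ORDER in Caché ObjectScript.
ordered(g::Global) = sort!(collect(keys(g.data)))

g = Global()
g["patient", 42, "name"] = "Alice"
g["patient", 7, "name"]  = "Bob"
for k in ordered(g)
    println(k, " => ", g[k...])
end
```

Sorted traversal over tuple keys is what makes range scans and ordered iteration cheap; on disk the same idea is typically realized as a B-tree keyed on the concatenated subscripts.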