What about using a tuple of distributed vectors/arrays as a table subclass, or using Dagger for an out-of-core lazy array?
Then it can be loaded into a distributed array for linear algebra.

On Thursday, September 29, 2016 at 4:33:21 AM UTC-4, Milan Bouchet-Valat wrote:
> We're not completely there yet, but with Query.jl and StructuredQueries.jl, combined with JuliaDB/JuliaData packages, one should be able to work on out-of-memory data sets as efficiently as (or more efficiently than) e.g. SAS. The high-level API is the same whether you work on a DataFrame or on an external database.
>
> There's also OnlineStats.jl for computing statistics without loading the full data set in memory at once.
>
> Regards
>
> On Wednesday, September 28, 2016 at 15:48 -0700, Juan wrote:
> > Yes, but you can only do simple things such as summaries, or use functions implemented in those special packages. So far you can do linear regression, but you can't do more complex things such as mixed-effects regression, or use Stan or any other generic Bayesian package.
> >
> > The same goes for Spark: you can only use predefined functions, very simple ones, or create your own by hand, but it's very difficult to program something like lme4 from scratch.
> >
> > > Hi,
> > >
> > > I don't know Julia, but in R you don't need to load all data into memory; just like SAS, you can read off disk. In R, the proprietary Revolution Analytics R (I think working with Hortonworks/Cloudera and Hadoop and YARN, which I saw a demonstration of at the R user group here in Ottawa, Canada), Revolution R's other proprietary methods, and bigmemory (http://cran.r-project.org/web/packages/bigmemory/index.html and http://www.bigmemory.org/) can handle more data.
> > >
> > > I don't know if there is a Julia package for YARN. I know little of Hadoop and YARN (and am not really interested in Java), so I suggest you contact someone at Hortonworks or Revolution R.
> > >
> > > Here is a discussion on large-size data.
> > > https://groups.google.com/forum/#!topic/julia-stats/eqYT85_vUlg
> > >
> > > Regards,
> > > Ramesh
> > >
> > > On Tue, Aug 5, 2014 at 10:42 AM, Michael Smith <[email protected]> wrote:
> > > > All,
> > > >
> > > > Are there currently any solutions in Julia to handle larger-than-memory datasets in a way similar to how you work with a DataFrame?
> > > >
> > > > The reason I'm asking is that R has the limitation that you need to fit all your data into memory. On the other hand, SAS (while being quite different) does not have this limitation.
> > > >
> > > > In the age of "big data" this can be quite an advantage.
> > > >
> > > > Of course, you can "patch" this situation, e.g. in R you can use the ff or bigmemory packages, or use SQL.
> > > >
> > > > But my point is that it is bolted on, and you need to spend extra mental loops switching between, say, data.frame and ff, instead of focusing on your data problem at hand. This is a clear advantage of SAS, where you don't have to do that. So I'm wondering how this is handled in Julia.
> > > >
> > > > Thanks,
> > > >
> > > > M
> > > >
> > > > P.S.: I do not intend to start a flame war, e.g. whether R or SAS or Julia is better. I'm just interested to find out whether such a solution exists in Julia (I haven't found any, but maybe I overlooked something). And if no such solution exists, given that Julia is still young, evolving, and malleable (in a positive sense), it might make sense to think about it.
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google Groups "julia-stats" group.
> > > > To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> > > > For more options, visit https://groups.google.com/d/optout.
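
A footnote on the point quoted above about computing statistics without loading the full data set in memory at once: the idea can be illustrated with a one-pass mean and variance (Welford's update). This is only a rough sketch in Python, chosen as a neutral illustration; it is not the OnlineStats.jl API, and the names `RunningStats` and `stream_stats`, as well as the in-memory chunks standing in for blocks read from disk, are made up for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningStats:
    """One-pass mean/variance accumulator (Welford's algorithm)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Fold one observation into the running summary.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; undefined for fewer than two observations.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def stream_stats(chunks: Iterable[Iterable[float]]) -> RunningStats:
    """Consume the data chunk by chunk; only one chunk is held at a time."""
    stats = RunningStats()
    for chunk in chunks:
        for x in chunk:
            stats.update(x)
    return stats


# Hypothetical usage: each generated list stands in for a block read from disk.
chunks = ([float(i) for i in range(start, start + 1000)]
          for start in range(0, 10000, 1000))
result = stream_stats(chunks)
```

Memory use is bounded by the chunk size rather than the data set size, which is why summaries fit this pattern easily. Models such as mixed-effects regression need more than a fixed-size running summary, which is the limitation raised earlier in the thread.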
