I agree: we need a platform that can work transparently with data of any size, at least larger than memory, and ideally distributed across several computers.
Solutions such as Spark are not complete; they only offer the basic building blocks for constructing something else. We don't just need to compute summaries, as we do with databases; we need to be able to perform all operations on big data, such as multiplying two big matrices, fitting mixed-effects models, running MCMC, etc. Packages such as bigmemory or ff let you do some simple things, but they cannot be used by other packages like lme4.
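To illustrate the kind of transparent larger-than-memory access being asked for, here is a minimal sketch (my own example, not from this thread, and in Python rather than Julia or R for brevity) that scans a file of float64 values through a memory map, so the OS pages data in on demand and the whole dataset never has to fit in RAM at once:

```python
import mmap
import os
import struct
import tempfile

# Write 1 million float64 values (the integers 0..n-1) to a file on
# disk, as a stand-in for a dataset too large to hold in memory.
n = 1_000_000
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    for start in range(0, n, 100_000):
        chunk = range(start, min(start + 100_000, n))
        f.write(struct.pack(f"{len(chunk)}d", *chunk))

# Memory-map the file and sum it in fixed-size blocks: only the block
# being read is resident, so the working set stays small regardless of
# the file's total size.
total = 0.0
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        block = 100_000 * 8  # bytes per block (8 bytes per float64)
        for off in range(0, len(mm), block):
            count = min(block, len(mm) - off) // 8
            vals = struct.unpack_from(f"{count}d", mm, off)
            total += sum(vals)

os.remove(path)
print(total)  # sum of 0..999999 == 499999500000.0
```

The point of the thread is precisely that this kind of block-wise access should be invisible to downstream code: a model-fitting routine should see an ordinary matrix, not have to write its own paging loop as above.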
