On Wed, Sep 7, 2011 at 11:33 AM, Zheng, Xin (NIH) [C] <[email protected]> wrote:
> I come from R and have grown tired of its performance, so I am looking for
> another language. How does J perform when analyzing big data (GB or even
> TB)? Could anyone give a rough idea?
The answer depends on your machine, your computations, and your operating
system. As a very rough first approximation, assume a factor of 5 memory
overhead on the data structure being manipulated (since you typically need
intermediate results). And assume that if your calculation requires swap you
will pay roughly a factor of 1000 in extra time (though how slow depends on
how swap is implemented on your machine).

For large calculations I usually like breaking things up into blocks that
easily fit into memory (blocks of 1e6 data elements often work fine). You
will probably want to use memory-mapped files for large data structures. I
have never tried TB files in J.

You may want to consider kx.com's interpreter instead of J if you routinely
work with data of that size -- their user community routinely works on very
large data sets. Expect to pay a lot of money, though, if you go that route.

Below are a couple of rough sketches of the blocking and mapping ideas.
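To make the blocking advice concrete, here is a sketch (mine, for
illustration, not from the original exchange) that counts lines in a large
file by reading it in fixed-size blocks, so resident memory stays bounded no
matter how big the file is. It assumes the standard library verbs fread (in
its indexed 'file';start,len form) and fsize; the file name 'big.log' is
made up.

   NB. blkcount: count lines in a file, reading 1e6 bytes at a time
   blkcount =: 3 : 0
    blk =. 1e6                          NB. block size in bytes
    sz  =. fsize y                      NB. total file size in bytes
    c   =. 0
    for_i. i. >. sz % blk do.           NB. one iteration per block
     c =. c + +/ LF = fread y ; (i*blk) , blk <. sz - i*blk
    end.
    c
   )

   blkcount 'big.log'

The same shape works for numeric work: read a block, reduce it, carry the
partial result forward, and only the per-block intermediates ever exist at
once.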
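For the memory-mapped route, a minimal sketch assuming the jmf script
distributed with J; the verb names and argument order here are from memory,
so check the jmf script in your installation before relying on them:

   require 'jmf'
   map_jmf_ 'data';'bigfile.dat'   NB. 'data' now refers to the file contents
   NB. ... index into data one block at a time, as above ...
   unmap_jmf_ 'data'

The win is that the interpreter pages in only the pieces you touch, so the
file never has to fit in memory all at once.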
-- Raul