* Skylar Thompson ([email protected]) [100213 14:30]:
> Michael Steeves wrote:
> > R is memory intensive. It's not multithreaded, so you typically don't
> > need the cores. We've got a system with 16 cores and 64G of RAM, and R
> > jobs will routinely eat up all 64G of memory, and leave 15 CPUs
> > sitting idle, at least on the Linux side of things.
> >
> > Whatever you get should have some space for memory expansion, even if
> > you start it out small.
>
> While R doesn't have threading support, it does have a package that
> provides MPI bindings that can make use of those extra CPUs. Obviously
> you need to have a problem that is parallelizable to take advantage of it.
It's also got bindings for snow and NetWorkSpaces. Unfortunately, in all
cases you'll need to either figure out how to partition your data so that
you can do your analysis against smaller sets of data (I'm pretty sure
that's the way that things like randomForest get parallelized), or else
rewrite the functions so that they can run in a distributed fashion.
(A rough sketch of the partitioning approach is below the sig.)

-Mike

--
Michael Steeves ([email protected])
Key fingerprint = 5DF4 CA31 1C6E 0C2E 60EA 704C 31A9 0B28 3AF0 2699
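For concreteness, here's roughly what that partitioning looks like with
snow plus randomForest's combine(): each worker grows a share of the
trees, and the pieces get merged afterward. This is an untested sketch,
not something from the thread; the 4-worker socket cluster, the built-in
iris data, and the tree counts are just placeholders.

    library(snow)            # could also be an MPI-backed cluster via Rmpi
    library(randomForest)

    cl <- makeCluster(4, type = "SOCK")   # one R worker per core (or per host)

    # Each worker loads randomForest and grows its share of the trees.
    clusterEvalQ(cl, library(randomForest))
    forests <- clusterApply(cl, rep(125, 4), function(ntree)
        randomForest(Species ~ ., data = iris, ntree = ntree))

    # Merge the partial forests back into a single 500-tree model.
    rf <- do.call(combine, forests)

    stopCluster(cl)

The same pattern applies if it's the data rather than the tree count that
gets split up, as long as there's a sensible way to combine the per-chunk
results at the end.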
