[EMAIL PROTECTED] writes:

> Hello,
>
> At the moment I am doing quite a lot of regression, especially
> logistic regression, on 20000 or more records with 30 or more
> factors, using the "step" function to search for the model with the
> smallest AIC. This takes a lot of time on this 1.8 GHz Pentium
> box. Memory does not seem to be such a big problem: not much
> swapping is going on, and CPU usage is at or close to 100%. What
> would be the most cost-effective way to speed this up? The obvious
> way would be to get a machine with a faster processor (3 GHz plus),
> but I wonder whether it might instead be better to run a
> dual-processor machine or something like that; this at least looks
> like a problem R should be able to parallelise, though I don't know
> whether it does.
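[Editorial note: a minimal sketch of the workflow described above, assuming a hypothetical data frame `dat` with a binary response `y` and the remaining columns as predictors:]

```r
## Fit the full logistic regression model
fit <- glm(y ~ ., data = dat, family = binomial)

## Stepwise search (both directions) for the model with the smallest AIC;
## trace = FALSE suppresses the per-step printout
best <- step(fit, direction = "both", trace = FALSE)
summary(best)
```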
Is this floating point bound? (When you say 30 factors, does that mean 30
parameters, or factors representing a much larger number of groups?)

If it is integer bound, I don't think you can do much better than
increasing CPU speed and -- note -- memory bandwidth (look for
large-cache systems and a fast front-side bus).

To increase floating point performance, you might consider the option
of using an optimized BLAS (see the Windows FAQ 8.2 and/or the
"R Installation and Administration" manual), such as ATLAS; this in
turn may be multithreaded and make use of multiple CPUs or multi-core
CPUs.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                      FAX: (+45) 35327907

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
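[Editorial note: one rough way to check whether the fit is floating-point bound, and hence whether the optimized-BLAS suggestion above would help, is to time a pure linear-algebra operation of comparable size; the dimensions below are hypothetical:]

```r
## 20000 rows x 100 columns of random data, roughly the scale described
x <- matrix(rnorm(2e6), nrow = 20000)

## Time a matrix cross-product (t(x) %*% x); an optimized BLAS such as
## ATLAS can speed this up substantially and may use multiple CPUs
system.time(crossprod(x))
```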
