I'm trying to read in datasets with roughly 150,000 rows and 600 features. I wrote a function using scan() to read them in (I have a 4GB Linux machine) and it works like a charm. Unfortunately, converting the scanned list into a data.frame with as.data.frame() causes memory usage to explode (it can go from 300MB for the scanned list to 1.4GB for a data.frame of just 30,000 rows), and it fails claiming it cannot allocate memory, even though it is still nowhere near the 3GB per-process limit on my Linux box (the message is "unable to allocate vector of size 522K").
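In case it helps, here is a minimal sketch of what my reading code looks like (the file name, separator and column count below are placeholders; the real file has ~600 numeric columns):

    ## scan() with a list template reads one list component per column
    read.big <- function(file, ncols = 600) {
        dat <- scan(file, what = as.list(rep(0, ncols)), sep = "\t")
        names(dat) <- paste("V", seq_len(ncols), sep = "")
        dat
    }

    x  <- read.big("mydata.txt")   # the scanned list, roughly 300MB
    df <- as.data.frame(x)         # this is where memory usage blows up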
So I have three questions:

1) Why is it failing even though there seems to be enough memory available?
2) Why does converting the list into a data.frame cause memory usage to explode? Am I using as.data.frame() wrongly? Should I be using some other command?
3) All the model fitting packages seem to want a data.frame as their input. If I cannot convert my list into a data.frame, what can I do? Is there any way around this?

Many thanks!
Nawaaz
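P.S. For context on question 3, the kind of call I eventually want to make is along these lines (the formula and model are purely illustrative):

    fit <- lm(V1 ~ ., data = df)   # or glm(), etc. -- they all expect a data.frame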
