Prof Brian Ripley wrote:
> On Tue, 15 May 2007, Lorenzo Isella wrote:
>
>> Dear All,
>> I hope I am not bumping into a FAQ, but so far my online search has
>> been fruitless.
>> I need to read a data file using R. I am using the (I think)
>> standard command:
>>
>> data_150 <- read.table("y_complete06000", header=FALSE)
>>
>> where y_complete06000 is a 6000-by-40 table of numbers.
>> I am puzzled that R takes several minutes to read this file.
>> At first I thought it might be due to its shape, but even
>> re-expressing and saving the matrix as a 1D array does not help.
>> It is not a small file, but not a huge one either (it amounts to
>> about 5 MB of text).
>> Is there anything I can do to speed up the file reading?
>
> You could try reading the help page or the 'R Data Import/Export'
> manual. Both point out things like
>
>     'read.table' is not the right tool for reading large matrices,
>     especially those with many columns: it is designed to read _data
>     frames_ which may have columns of very different classes. Use
>     'scan' instead.
>
> On the other hand, I am surprised at several minutes, but as you
> haven't even told us your OS, it is hard to know what to expect. My
> Linux box took 3 secs for a 6000x40 matrix with read.table, 0.8 sec
> with scan.
>
> If it is 40 rows and 6000 columns, that might explain it:
> x <- as.data.frame(matrix(rnorm(40*6000), 6000))
> write.table(x, file="xx.txt")
> system.time(y <- read.table("xx.txt"))
>    user  system elapsed
>   1.229   0.007   1.250
> write.table(t(x), file="xx.txt")
> system.time(y <- read.table("xx.txt"))
>    user  system elapsed
>  92.986   0.188  93.912
>
> However, this is still not _several_ minutes, and it is on my laptop,
> which is not particularly fast.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
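For completeness, the scan-based read that the manual (and Ripley) recommend could look like the sketch below. It is self-contained for illustration, so it first writes out a small numeric table rather than using the poster's y_complete06000 file; the small dimensions here (6 x 4) stand in for the 6000 x 40 of the original post.

```r
# Write a small all-numeric table, then read it back with scan(),
# which avoids read.table()'s per-column class handling.
m <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)
write.table(m, file = "yy.txt", row.names = FALSE, col.names = FALSE)

# scan() returns a single numeric vector in the file's row-major order,
# so byrow = TRUE restores the original shape.
y <- matrix(scan("yy.txt", what = numeric(), quiet = TRUE),
            nrow = 6, ncol = 4, byrow = TRUE)

stopifnot(isTRUE(all.equal(y, m, check.attributes = FALSE)))
```

If a data frame is really needed, read.table itself can also be sped up considerably by telling it the column classes up front, e.g. `read.table("yy.txt", colClasses = "numeric")`, since guessing classes is a large part of its cost.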