Hi all,
I have large compressed tab-delimited text files, and
I am trying to write an efficient function to read them.
I am using gzfile() and readLines():
zz <- gzfile("exampl.txt.gz", "r")  # compressed file
system.time(temp1 <- readLines(zz))
close(zz)
This works fast and creates a vector of strings.
The problem is parsing the result: if I use strsplit, it takes longer than
decompressing the file manually, reading it with scan, and deleting it.
Can anybody recommend an efficient way of parsing a large vector (~200,000
entries)?
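To make the comparison concrete, here is a minimal sketch of the two approaches on a tiny made-up file. The file contents, the three-column layout, and the variable names are assumptions for illustration only, not my real data. The last part passes the gzfile() connection straight to scan(), which I include only as a possible variant of the scan route that avoids writing a decompressed copy to disk; I have not timed it on the real files.

```r
# Hypothetical stand-in for the real data: 3 lines, 3 tab-separated fields each.
path <- tempfile(fileext = ".txt.gz")
con <- gzfile(path, "w")
writeLines(c("a\t1\tx", "b\t2\ty", "c\t3\tz"), con)
close(con)

# Approach 1: read all lines, then split each line on tabs.
zz <- gzfile(path, "r")
lines <- readLines(zz)
close(zz)
# fixed = TRUE treats "\t" as a literal string rather than a regular expression.
fields <- strsplit(lines, "\t", fixed = TRUE)
mat <- do.call(rbind, fields)  # character matrix, one row per input line

# Approach 2 (variant): let scan() read fields directly from the connection.
zz <- gzfile(path, "r")
flat <- scan(zz, what = character(), sep = "\t", quiet = TRUE)
close(zz)
# Assumes every line has the same number of fields (3 in this made-up file).
mat2 <- matrix(flat, ncol = 3, byrow = TRUE)

unlink(path)  # remove the temporary file
```

Wrapping each approach in system.time() on the real file would show which one is faster there.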
Dmitriy