On Sun, Feb 26, 2012 at 09:39:46AM -0800, Rui Barradas wrote:
> Hello,
>
> > The first step, before creating a row-by-row loop, is to find out
> > how many rows there are in the txt file without loading it into R, to
> > avoid memory problems.
> >
> > Does anyone know a specific function for this?
> >
>
> I don't believe there's a specific function.
As stated, the OP does not need to know the number of lines in the file to
solve the problem. However, if you do want to know it, I'd suggest the
wc command rather than writing a function in R to accomplish this.
wc is part of GNU coreutils:

$ wc -l foo.csv
1138200 foo.csv
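
To illustrate the first point, that the row count is not needed up front, here is a minimal sketch of chunk-by-chunk processing that simply stops when the connection runs dry. The helper name process_in_chunks and its arguments are hypothetical, and it assumes a file with no header line and one record per physical line (no newlines embedded in quoted fields, no chunk consisting only of blank lines).

```r
# Hypothetical helper: apply `fun` to successive chunks of a delimited file.
# Assumes no header line and exactly one record per physical line.
process_in_chunks <- function(filename, fun, sep = ",", chunk_size = 5000) {
    con <- file(filename, open = "rt")
    on.exit(close(con))
    repeat {
        # readLines() continues from the current position of the open connection
        lines <- readLines(con, n = chunk_size)
        if (length(lines) == 0) break          # nothing left to read
        chunk <- read.table(text = lines, sep = sep, stringsAsFactors = FALSE)
        fun(chunk)
        if (length(lines) < chunk_size) break  # a short chunk is the last one
    }
    invisible(NULL)
}
```

Only chunk_size rows are held in memory at any time, so the total number of rows never has to be known before the loop starts.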
> If you want to know how many rows there are in a txt file, try this
> function.
>
> numTextFileLines <- function(filename, header=FALSE, sep=",", nrows=5000){
>     tc <- file(filename, open="rt")
>     on.exit(close(tc))
>     if(header){
>         # cnames: column names (not used)
>         cnames <- read.table(file=tc, sep=sep, nrows=1,
>                              stringsAsFactors=FALSE)
>         # cnames <- as.character(cnames)
>     }
>     n <- 0
>     while(TRUE){
>         x <- tryCatch(read.table(file=tc, sep=sep, nrows=nrows),
>                       error=function(e) e)
>         if (any(grepl("no lines available", unclass(x))))
>             break
>         if(nrow(x) < nrows){
>             n <- n + nrow(x)
>             break
>         }
>         n <- n + nrows
>     }
>     n
> }
But hey, programming in R is fun, so why not?
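
If the goal is only a line count, a shorter variant is possible by counting raw lines with readLines() instead of parsing each chunk with read.table(). This is a sketch under assumptions, not a drop-in replacement for the function quoted above: the name count_lines is hypothetical, and it counts physical lines, so a header line and any blank lines are included in the total.

```r
# Hypothetical helper: count physical lines without loading the whole file.
count_lines <- function(filename, chunk_size = 10000) {
    con <- file(filename, open = "rt")
    on.exit(close(con))
    n <- 0
    repeat {
        got <- length(readLines(con, n = chunk_size))
        n <- n + got
        if (got < chunk_size) break  # fewer lines than requested: end of file
    }
    n
}
```

For a file with a header, subtract one from the result to get the number of data rows.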
--
Hans Ekbrand