Besides mismatched quotes, another problem I have had with a file is an illegal character (0x1A in this case) that was treated as an end-of-file marker while reading.
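One way to check a file for that kind of character before blaming memory is to scan it in binary mode, where 0x1A is just another byte. The sketch below is only an illustration: `count_bytes` and its `chunk_size` argument are hypothetical names of my own, not part of R, and the demo file is a tiny made-up example rather than the file from the original post.

```r
# Hypothetical helper (not part of base R): read a file in binary mode,
# chunk by chunk, and count 0x1A (Ctrl-Z) bytes and newline (0x0A) bytes.
count_bytes <- function(path, chunk_size = 1e6) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  n_ctrl_z <- 0
  n_newline <- 0
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0) break          # end of file reached
    n_ctrl_z <- n_ctrl_z + sum(chunk == as.raw(0x1A))
    n_newline <- n_newline + sum(chunk == as.raw(0x0A))
  }
  c(ctrl_z = n_ctrl_z, newline = n_newline)
}

# Made-up demo file: three tab-delimited lines with a stray 0x1A byte
# sitting in front of the third line.
demo <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("a\tb\n1\t2\n"), as.raw(0x1A), charToRaw("3\t4\n")), demo)

counts <- count_bytes(demo)
print(counts)
```

A nonzero `ctrl_z` count would point at this problem. The `newline` count can also be compared with `nrow(d)` (plus one for the header line) to see how many records went missing. For the mismatched-quote case, rereading with `read.delim(filename, as.is=TRUE, quote="")` and comparing the row counts is a similar quick check.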
On Wed, Oct 6, 2010 at 3:15 AM, Earl F. Glynn <efgl...@gmail.com> wrote:
> I am trying to read a tab-delimited 1.25 GB file of 4,115,119 records each
> with 52 fields.
>
> I am using R 2.11.0 on a 64-bit Windows 7 machine with 8 GB memory.
>
> I have tried the two following statements with the same results:
>
> d <- read.delim(filename, as.is=TRUE)
>
> d <- read.delim(filename, as.is=TRUE, nrows=4200000)
>
> I have tried starting R with this parameter but that changed nothing:
> --max-mem-size=6GB
>
> Everything appeared to have worked fine until I studied frequency counts of
> the fields and realized data were missing.
>
> > dim(d)
> [1] 3388444 52
>
> R read 3,388,444 records and missed 726,754 records. There were no error
> messages or exceptions. I plotted a chart using the data and later
> discovered not all the data were represented in the chart.
>
> R didn't just read the first 3,388,444 records and quit.
>
> Here's what I believe happened (based on frequency counts of the first field
> in the data.frame from R, and independently from another source):
> * R read the first 1,866,296 records and then skipped 419,340 records.
> * Next, R read 1,325,552 records and skipped 307,414 records.
> * R read the last 196,596 records without any problems.
>
> Questions:
>
> Is there some memory-related parameter that I should adjust that might
> explain the observed details above?
>
> Shouldn't read.delim catch this failure instead of being silent about
> dropping data?
>
> Thanks for any help with this.
>
> Earl F Glynn
> Overland Park, KS
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?