First of all, thank you very much for creating, maintaining and updating this package! Discovering "fread" and the data.table package has made my life a lot easier.
I'm using fread to read large (2-4 GB) CSV files for subsequent RMySQL bulk loads. Since the computer I use is somewhat memory-limited, I decided to read each file in chunks, using skip and nrows. As I go through the file in a for loop, each individual read takes a bit longer than the previous one on average; I'm guessing fread parses through the file line by line to reach the skip location. Is there any way to make fread "remember" where the last read ended, for use in the next iteration? I would guess it would speed up my reads from minutes to seconds.

Also, should I worry that reusing the same data.table in a for loop causes memory issues?

Many thanks,
Serban Tanasa, Ph.D.
Senior Analyst
Latinum Network
(o) (240) 482-8259
(f) (240) 482-8265
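
P.S. For reference, the chunked read I describe looks roughly like the sketch below. It is a minimal, self-contained version rather than my actual script: the temporary CSV, `n_rows`, `chunk_size` and the variable names are placeholders I made up for illustration, and the bulk-load step is only indicated by a comment.

```r
library(data.table)

# Hypothetical stand-in for the real 2-4 GB file: a tiny CSV in a temp location.
csv_path <- tempfile(fileext = ".csv")
fwrite(data.table(id = 1:25, value = (1:25) * 2), csv_path)

n_rows     <- 25L   # total data rows, assumed known in advance (placeholder)
chunk_size <- 10L   # rows per read (placeholder)

# Read only the header once so every chunk can be given the same column names.
col_names <- names(fread(csv_path, nrows = 0L))

chunk_rows  <- integer(0)  # rows seen per chunk, kept only for checking
value_total <- 0

for (start in seq(0L, n_rows - 1L, by = chunk_size)) {
  # skip counts lines from the top of the file:
  # 1 header line + the data rows already read in earlier iterations.
  chunk <- fread(csv_path, skip = start + 1L, nrows = chunk_size, header = FALSE)
  setnames(chunk, col_names)

  # ... the RMySQL bulk load of `chunk` would go here ...

  chunk_rows  <- c(chunk_rows, nrow(chunk))
  value_total <- value_total + sum(chunk$value)
}
```

Because skip is always given relative to the top of the file, each iteration presumably has to pass over all the earlier lines again, which would match the slowdown I am seeing. The same `chunk` variable is reassigned on every pass, which is the reuse I am asking about.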
