Hi datatablers,

Feedback and bug reports much appreciated :

=====
New function fread(), a fast and friendly file reader.
* header, skip, nrows, sep and colClasses are all auto detected.
* integers>2^31 are detected and read natively as bit64::integer64.
* accepts filenames, URLs and "A,B\n1,2\n3,4" directly
* new implementation entirely in C
* with a 50MB .csv, 1 million rows x 6 columns :
    read.csv("test.csv")                                   # 30-60 sec
    read.table("test.csv",<all known tricks, known nrows>) #    10 sec
    fread("test.csv")                                      #     3 sec
* airline data: 658MB csv (7 million rows x 29 columns)
    read.table("2008.csv",<all known tricks, known nrows>) #   360 sec
    fread("2008.csv")                                      #    50 sec
See ?fread. Many thanks to Chris Neff and Garrett See for ideas,
discussions and beta testing.
=====
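
To try the auto detection and the direct-text input without any file on disk, something like the following should do. It is only a minimal sketch: the inline data and the names DT and big are made up for illustration, and it assumes data.table 1.8.7 or later is installed (plus bit64 if you want the large ids to print properly).

```r
library(data.table)

# A string containing "\n" is treated as the data itself, not as a filename.
DT <- fread("A,B\n1,2\n3,4")
print(DT)
print(sapply(DT, class))   # header and column types are detected, no arguments needed

# Values beyond the 32-bit integer range (made-up ids for illustration).
# They are read as integer64; install bit64 so that they print sensibly.
big <- fread("id,x\n5000000000,1\n5000000001,2")
print(class(big$id))
```

The same call accepts a filename or a URL in place of the string.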

1.8.7 is passing checks on Unix and Windows (but not Mac yet) :

  install.packages("data.table", repos="http://R-Forge.R-project.org")
  require(data.table)
  ?fread
  fread("your biggest baddest file")
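
If you don't have a big file to hand, a rough way to compare timings yourself is sketched below. The file is generated on the fly, so the column names, the row count n and the temporary path are all assumptions for illustration rather than the test files quoted above, and the timings will vary by machine.

```r
library(data.table)

set.seed(1)
n <- 1e5   # hypothetical size; raise towards 1e6 to approach the 1 million row test
df <- data.frame(a = sample(n),
                 b = runif(n),
                 c = sample(letters, n, replace = TRUE))

# Write a throwaway csv to a temporary location.
f <- tempfile(fileext = ".csv")
write.csv(df, f, row.names = FALSE)

# Time both readers on the same file, with default arguments.
t_base  <- system.time(d1 <- read.csv(f))
t_fread <- system.time(d2 <- fread(f))

# Elapsed seconds for each reader.
print(c(read.csv = t_base[["elapsed"]], fread = t_fread[["elapsed"]]))

# Sanity check: both readers should see the same shape.
stopifnot(nrow(d1) == nrow(d2), ncol(d1) == ncol(d2))
unlink(f)
```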

Oddly, R-Forge appears to be compiling Win64 with -O2 optimization
rather than -O3 (Win32 is built with -O3 as expected). -O3 has some
optimizations that fread may benefit from, so until that is resolved on
R-Forge the speedups might not be as great on Win64 unless you compile
the package yourself. Either way, I'd be interested to hear how you get on.

Season's greetings!

Matthew


_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
