I have been reading large text files with D's std.csv module and have found it slow compared to R's read.table function, which is not known to be particularly fast. Here I am reading Fannie Mae mortgage acquisition data, available (after registering) at http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html:

D Code:

import std.array;
import std.csv;
import std.datetime.stopwatch; // StopWatch lives here in current D releases
import std.file;
import std.stdio;
import std.typecons;

// All 22 columns of the pipe-delimited acquisition file, read as strings
alias row_type = Tuple!(string, string, string, string, string, string,
    string, string, string, string, string, string, string, string,
    string, string, string, string, string, string, string, string);

void main(){
  auto sw = StopWatch(AutoStart.yes);
  // Slurp the whole file into memory, then parse every record eagerly.
  auto buffer = std.file.readText("Acquisition_2009Q2.txt");
  auto records = csvReader!row_type(buffer, '|').array;
  sw.stop();
  writeln("Time (s): ", sw.peek.total!"msecs" / 1000.0);
}

Time (s): 13.478
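
One way to see where the 13 s goes is to time the file read and the CSV parse separately. A minimal sketch, assuming the same file and setup (Repeat is just shorthand for the same 22-column tuple as above):

import std.array;
import std.csv;
import std.datetime.stopwatch;
import std.file;
import std.meta : Repeat;
import std.stdio;
import std.typecons;

alias row_type = Tuple!(Repeat!(22, string)); // 22 string columns

void main(){
  auto sw = StopWatch(AutoStart.yes);
  auto buffer = std.file.readText("Acquisition_2009Q2.txt");
  writeln("read (s): ", sw.peek.total!"msecs" / 1000.0);

  sw.reset(); // restart the clock so parsing is measured on its own
  auto records = csvReader!row_type(buffer, '|').array;
  writeln("parse (s): ", sw.peek.total!"msecs" / 1000.0);
}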

R Code:

system.time(x <- read.table("Acquisition_2009Q2.txt", sep = "|", colClasses = rep("character", 22)))
   user  system elapsed
  7.810   0.067   7.874


R takes about half as long to read the file, even though both programs read the data into an equivalent representation (all 22 columns as strings/characters). Am I doing something incorrect here?
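
For reference, a hand-rolled splitter gives a rough lower bound on raw parsing cost. This is a minimal sketch only: it does no quote handling or per-row field-count validation, unlike csvReader:

import std.algorithm : map, splitter;
import std.array : array, split;
import std.datetime.stopwatch;
import std.file : readText;
import std.stdio;

void main(){
  auto sw = StopWatch(AutoStart.yes);
  // Naive pipe-splitting of each line; a trailing newline yields one
  // empty row at the end, which is fine for a timing baseline.
  auto rows = readText("Acquisition_2009Q2.txt")
      .splitter('\n')
      .map!(line => line.split('|'))
      .array;
  sw.stop();
  writeln(rows.length, " rows in ", sw.peek.total!"msecs" / 1000.0, " s");
}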
