On Sunday, 4 June 2017 at 15:59:03 UTC, Jesse Phillips wrote:
On Sunday, 4 June 2017 at 06:15:24 UTC, H. S. Teoh wrote:
[...]
OK, I took you up on that. I'm still skeptical:
LDC2 -O3 -release -enable-cross-module-inlining
std.csv: 12487 msecs
fastcsv (no gc): 1376 msecs
csvslicing: 3039 msecs
That looks like about 10 times faster to me. The slicing
version failed because of \r\n line endings (I guess multi-part
separators are broken), so I changed the data file to get
an execution time.
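For reference, the ratios behind that estimate can be checked with a few lines of Python. This is only arithmetic on the three timings quoted above; the `timings_ms` dictionary and the `speedup` helper are names made up for this sketch, not part of any of the benchmarked libraries.

```python
# Timings in milliseconds, copied from the benchmark output above.
timings_ms = {
    "std.csv": 12487,
    "fastcsv (no gc)": 1376,
    "csvslicing": 3039,
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` ran than `baseline`."""
    return timings_ms[baseline] / timings_ms[candidate]


for name in ("fastcsv (no gc)", "csvslicing"):
    print(f"{name}: {speedup('std.csv', name):.1f}x faster than std.csv")
```

On these numbers fastcsv comes out at roughly 9x and csvslicing at roughly 4x relative to std.csv.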
Anyway, I'm not trying to claim fastcsv isn't good at what it
does; all I'm trying to point out is that std.csv is doing more
work than these faster CSV parsers. And I don't even want to claim
that std.csv is better because of that work; it actually
appears that doing validation was a mistake.
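To make the "more work" point concrete, here is a rough Python sketch of the general trade-off. It does not use std.csv, fastcsv, or csvslicing; `split_fields` and `parse_fields` are hypothetical helpers, and Python's standard `csv` module merely stands in for a parser that handles and checks quoting.

```python
import csv
import io


def split_fields(line):
    """Slice on commas only: no quote handling and no validation."""
    return line.split(",")


def parse_fields(line):
    """Parse one record with quote handling.

    strict=True makes the csv module raise csv.Error on bad CSV input
    instead of accepting it silently.
    """
    reader = csv.reader(io.StringIO(line), strict=True)
    return next(reader)


row = 'a,"b,c",d'
print(split_fields(row))  # the quoted field is cut at its embedded comma
print(parse_fields(row))  # the quoted field stays in one piece
```

The plain split is cheaper because it never inspects quotes, which is also why it gets the quoted field wrong.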
In case you have time, it would be very interesting to compare it
with other state-of-the-art tools like paratext:
http://www.wise.io/tech/paratext