On Thursday, 21 January 2016 at 09:39:30 UTC, data pulverizer wrote:
> I have been reading large text files with D's csv file reader and have found it slow compared to R's read.table function, which is not known to be particularly fast.

FWIW - I've been implementing a few programs that manipulate delimited files, e.g. tab-delimited ones. These are simpler than CSV files because there is no escaping inside the data. I've been trying to do this in relatively straightforward ways, e.g. using byLine rather than byChunk (the goal being to explore the power of the D standard libraries).
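
To make that concrete, here's a minimal sketch of the style (not one of the actual tools; it assumes a tab-delimited file is passed as the first argument and prints the second field of each line):

import std.stdio;
import std.algorithm : splitter;

void main(string[] args)
{
    auto file = File(args[1]);
    foreach (line; file.byLine)        // yields char[]; the buffer is reused
    {
        auto fields = line.splitter('\t');
        if (!fields.empty)
        {
            fields.popFront();         // skip the first field
            if (!fields.empty)
                writeln(fields.front); // print the second field
        }
    }
}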

I've gotten significant speed-ups in a couple different ways:
* DMD libraries 2.068+ - byLine is dramatically faster
* LDC 0.17 (alpha) - based on DMD 2.068, and faster than the DMD compiler
* Avoid utf-8 to dchar conversion - this conversion often occurs silently when working with ranges, but is generally not needed when manipulating data (see the sketch after this list)
* Avoid unnecessary string copies, e.g. don't gratuitously convert char[] to string
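
To illustrate the last two points (again just a sketch, assuming tab-delimited input on stdin): byLine hands back char[] slices into a reused buffer rather than freshly allocated strings, and byCodeUnit keeps range algorithms stepping through UTF-8 code units instead of silently decoding each element to dchar:

import std.stdio;
import std.algorithm : count, splitter;
import std.utf : byCodeUnit;

void main()
{
    size_t totalFields;
    foreach (line; stdin.byLine)  // char[] slice; no per-line string copy
    {
        // byCodeUnit wraps the slice so splitter and count walk UTF-8
        // code units directly, avoiding the implicit dchar decoding.
        totalFields += line.byCodeUnit.splitter('\t').count;
    }
    writeln("total fields: ", totalFields);
}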

At this point the performance of the utilities I've been writing is quite good. They don't have direct equivalents among other tools (such as GNU coreutils), so a head-to-head comparison isn't appropriate, but generally they seem quite competitive without my needing to do my own buffer or memory management. And they are dramatically faster than the same tools written in Perl (which I had been happy with).

--Jon
