I just took a look at making byLine faster. It took less than one evening:

https://github.com/D-Programming-Language/phobos/pull/3089

I confess I am a bit disappointed with the leadership being unable to delegate this task to a trusty lieutenant in the community. A bug has been open on this for a long time. It gets discussed here regularly, with wrong conclusions about performance bottlenecks ("we must redo D's I/O because FILE* is killing it!") drawn from unverified assumptions. And the techniques used to get a marked improvement in the diff above are trivial fare for any software engineer. The following factors each had a significant impact on speed:

* On OS X (which I happened to test with) getdelim() exists but wasn't being used. I made the implementation use it.

* There was one call to fwide() per line read. I used simple caching (a stream's width cannot be changed once set, making it a perfect candidate for caching).

(As an aside, there was some unreachable code in ByLineImpl.empty, which didn't impact performance but was overdue for removal.)

* For each line read there was a call to malloc() and one to free(). I set things up so that the buffer used for reading is reused, simply by making the buffer static.

* assumeSafeAppend() was unnecessarily called once per line read. Its removal led to a whopping 35% improvement on top of everything else. I'm not sure what it does, but boy does it take its sweet time. Maybe someone should look into it.
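
To make the first three items concrete, here is a minimal sketch in plain C. It is not the code from the pull request: line_reader, reader_init, reader_next and reader_free are hypothetical names made up for illustration, it assumes a POSIX system where getdelim() is available, and it keeps the reused buffer in a struct rather than in a static variable. The assumeSafeAppend() item is specific to D and is not covered.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getdelim() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <wchar.h>

/* Hypothetical reader state; the names are illustrative, not from Phobos. */
typedef struct {
    FILE  *fp;
    char  *buf;         /* allocated by getdelim(), reused for every line */
    size_t cap;         /* capacity of buf, maintained by getdelim() */
    int    orientation; /* cached fwide() result; 0 means not yet known */
} line_reader;

static void reader_init(line_reader *r, FILE *fp)
{
    assert(r != NULL && fp != NULL);
    r->fp = fp;
    r->buf = NULL;
    r->cap = 0;
    r->orientation = 0;
}

/* Reads one line into r->buf. Returns its length, terminator included,
   or -1 at end of file, on error, or for a wide-oriented stream. */
static ssize_t reader_next(line_reader *r, char terminator)
{
    assert(r != NULL && r->fp != NULL);
    /* Query the orientation only until it has been set. Once set it
       cannot change, so the cached value is used from then on. */
    if (r->orientation == 0)
        r->orientation = fwide(r->fp, 0);
    if (r->orientation > 0)
        return -1; /* wide streams are outside the scope of this sketch */
    /* getdelim() grows buf with realloc() only when a line does not fit,
       so there is no malloc()/free() pair per line. */
    return getdelim(&r->buf, &r->cap, terminator, r->fp);
}

static void reader_free(line_reader *r)
{
    free(r->buf);
    r->buf = NULL;
    r->cap = 0;
}
```

A caller would loop on reader_next(&r, '\n') until it returns -1 and then call reader_free(&r). Because the buffer is reused, a line that must outlive the next call has to be copied by the caller.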

Destroy.


Andrei
