On Wednesday, 23 March 2016 at 18:10:05 UTC, ParticlePeter wrote:
Thanks Simen,
your tokenCounter is inspirational, for the rest I'll take some time for testing.

My pleasure. :) Testing it on your example data shows it to work there. However, as stated above, the documentation says it's undefined, so future changes (even optimizations and bugfixes) to Phobos could make it stop working:

"This predicate must be an equivalence relation, that is, it must be reflexive (pred(x,x) is always true), symmetric (pred(x,y) == pred(y,x)), and transitive (pred(x,y) && pred(y,z) implies pred(x,z)). If this is not the case, the range returned by chunkBy may assert at runtime or behave erratically."
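
For contrast, here is a minimal sketch of a predicate that does satisfy those requirements. The input lines are made up for illustration (they are not your data), and "both blank or both non-blank" is just one example of a relation that is reflexive, symmetric and transitive:

```d
import std.algorithm.iteration : chunkBy, map;
import std.array : array;

void main()
{
    // Hypothetical input, not the data from this thread.
    auto lines = ["a", "b", "", "c", "", "", "d"];

    // Two lines are equivalent when both are blank or both are non-blank.
    auto groups = lines.chunkBy!((x, y) => (x.length == 0) == (y.length == 0));

    // Materialized here only so the result can be checked;
    // chunkBy and its sub-ranges are lazy.
    auto result = groups.map!(g => g.array).array;
    assert(result == [["a", "b"], [""], ["c"], ["", ""], ["d"]]);
}
```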

But some additional thoughts from my side:
I get all the lines of the file into one range. Calling array on it should give me an array, but how would I use find to get an index into this array? With the indices I could slice up the array into four slices, no allocation required. If there is no easy way to just get an index instead of a range, I would try to use something like the tokenCounter to find all the indices.

The chunkBy example should not allocate. chunkBy itself is lazy, as are its sub-ranges. No copying of string contents is performed. So unless you have very specific reasons to use slicing, I don't see why chunkBy shouldn't be good enough.
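
If you do want indices, std.algorithm.searching.countUntil returns an index rather than a range. A rough sketch, assuming the lines are already in an array; the "[data]" marker and the contents are hypothetical:

```d
import std.algorithm.searching : countUntil;

void main()
{
    // Hypothetical lines; the "[data]" marker is an assumption.
    string[] lines = ["header", "x", "[data]", "1", "2"];

    // countUntil with a predicate returns the index of the first
    // matching element, or -1 if there is none.
    auto idx = lines.countUntil!(l => l == "[data]");
    assert(idx == 2);

    // Slices refer to the original array; no elements are copied.
    auto head = lines[0 .. idx];
    auto tail = lines[idx + 1 .. $];
    assert(head == ["header", "x"]);
    assert(tail == ["1", "2"]);
    assert(head.ptr is lines.ptr);
}
```

Real code should check for -1 before slicing.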

Full disclosure:
There is a malloc call in RefCounted, which is used for optimization purposes when chunkBy is called on a forward range. When chunkBy is called on an array, that's a 6-word allocation (24 bytes on 32-bit, 48 bytes on 64-bit), happening once. There are no other dependencies that allocate.

Such is the beauty of D. :)

--
  Simen
