On 3/23/15 10:43 AM, rumbu wrote:
On Monday, 23 March 2015 at 15:00:07 UTC, John Colvin wrote:
What would be really great would be a performance test suite for
phobos. D is reaching a point where "It'll probably be fast because we
did it right" or "I remember it being fast-ish 3 years ago when I
wrote a small toy test" isn't going to cut it. Real data is needed,
with comparisons to other languages where possible.
I ran the same test in C# using a 30MB plain ASCII text file. Compared
to the fastest method proposed by Andrei, the results are not the best:
D:
readText.representation.count!(c => c == '\n') - 428 ms
byChunk(4096).joiner.count!(c => c == '\n') - 1160 ms
C#:
File.ReadAllLines.Length - 216 ms;
Win64, D 2.066.1; optimizations were turned on in both cases.
The .NET code is clearly not performance oriented
(http://referencesource.microsoft.com/#mscorlib/system/io/file.cs,675b2259e8706c26),
so I suspect that the .NET runtime is performing some optimizations under
the hood.
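For anyone who wants to reproduce the two D measurements, a timing harness could look roughly like the sketch below. Several things here are assumptions and not taken from the post: the helper names viaReadText and viaByChunk are made up, the byChunk call is assumed to have been made on a std.stdio.File, and std.datetime.stopwatch needs a newer compiler than the 2.066.1 used for the numbers above. The second measurement also benefits from a warm file cache, so treat single runs with suspicion.

```d
import std.algorithm : count, joiner;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.file : readText;
import std.stdio : File, writefln;
import std.string : representation;

// Hypothetical helper: read the whole file, then count '\n' bytes.
size_t viaReadText(string path)
{
    return path.readText.representation.count!(c => c == '\n');
}

// Hypothetical helper: read in 4096-byte chunks and count '\n' bytes.
// Assumes the byChunk variant was run on a std.stdio.File.
size_t viaByChunk(string path)
{
    return File(path).byChunk(4096).joiner.count!(c => c == '\n');
}

void main(string[] args)
{
    if (args.length < 2) return; // no input file given

    auto sw = StopWatch(AutoStart.yes);
    auto n1 = viaReadText(args[1]);
    writefln("readText: %s newlines, %s ms", n1, sw.peek.total!"msecs");

    sw.reset(); // the stopwatch keeps running, restarting from zero
    auto n2 = viaByChunk(args[1]);
    writefln("byChunk:  %s newlines, %s ms", n2, sw.peek.total!"msecs");
}
```

Running each variant several times and reporting the minimum would give more trustworthy numbers than one pass.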
At this point it comes down to the performance of std.algorithm.count,
which could and should be improved. This code runs 2.5x faster than
count and brings it into the range of wc -l, which is probably near
the lower bound achievable:
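import core.stdc.string : memchr;
import std.file : readText;
import std.string : representation;

size_t linect; // running newline count
auto bytes = args[1].readText.representation;
for (auto p = bytes.ptr, lim = p + bytes.length;; )
{
    // memchr returns null once no '\n' remains in [p, lim)
    auto r = cast(immutable(ubyte)*) memchr(p, '\n', lim - p);
    if (!r) break;
    ++linect;
    p = r + 1; // resume just past the newline found
}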
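To try the loop above outside of main, it can be wrapped in a function, roughly as sketched here. The name countNewlines is made up for illustration and is not a Phobos function; the comparison against std.algorithm.count is only a sanity check that both approaches agree, not a benchmark.

```d
import core.stdc.string : memchr;
import std.algorithm : count;
import std.string : representation;

// Hypothetical helper (not part of Phobos): counts '\n' bytes using memchr.
size_t countNewlines(const(ubyte)[] bytes)
{
    size_t n;
    auto p = bytes.ptr;
    const lim = p + bytes.length;
    while (p < lim)
    {
        // memchr returns null when no '\n' remains in [p, lim)
        auto r = cast(const(ubyte)*) memchr(p, '\n', lim - p);
        if (r is null) break;
        ++n;
        p = r + 1; // resume just past the newline found
    }
    return n;
}

void main()
{
    auto text = "one\ntwo\nthree\n".representation;
    // Both approaches must agree on the result.
    assert(countNewlines(text) == text.count!(c => c == '\n'));
    assert(countNewlines(text) == 3);
    assert(countNewlines(null) == 0); // an empty input has no newlines
}
```

The explicit p < lim check means an empty slice never reaches memchr, so a null pointer is never passed to it.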
Would anyone want to put some work into accelerating count?
Andrei