On Wednesday, 12 October 2016 at 13:53:03 UTC, Andrei Alexandrescu wrote:
So we've had a good run with making popFront smaller. In ASCII microbenchmarks with ldc, the speed is indistinguishable from s = s[1 .. $]. Smaller functions make sure that the impact on instruction cache in larger applications is not high.

Now it's time to look at the end-to-end cost of autodecoding. I wrote this simple microbenchmark:

=====
import std.range;

alias myPopFront = std.range.popFront;
alias myFront = std.range.front;

void main(string[] args) {
    import std.algorithm, std.array, std.stdio;
    char[] line = "0123456789".dup.repeat(50_000_000).join;
    ulong checksum;
    if (args.length == 1)
    {
        while (line.length) {
            version(autodecode)
            {
                checksum += line.myFront;
                line.myPopFront;
            }
            else
            {
                checksum += line[0];
                line = line[1 .. $];
            }
        }
        version(autodecode)
            writeln("autodecode ", checksum);
        else
            writeln("bytes ", checksum);
    }
    else
        writeln("overhead");
}
=====

On my machine, with "ldc2 -release -O3 -enable-inlining" I get something like 0.54s overhead, 0.81s with no autodecoding, and 1.12s with autodecoding.

Your mission, should you choose to accept it, is to define a combination front/popFront that reduces the gap.


Andrei

If we are to support Unicode, this will only work really efficiently with some state on the stack.
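
One possible reading of "state on the stack", as a rough sketch only: keep a single index in a local variable and let one helper both return the current code point and advance past it, instead of reslicing the array on every step. The names frontAndPop and checksumOf are made up for illustration and are not part of Phobos; std.utf.decode is the existing Phobos function used for the non-ASCII path. I have not benchmarked this against the numbers above.

```d
import std.stdio : writeln;
import std.utf : decode;

// Hypothetical helper (not in Phobos): returns the code point starting
// at s[i] and advances i past it, so front and popFront are one step.
dchar frontAndPop(const(char)[] s, ref size_t i)
{
    immutable c = s[i];
    if (c < 0x80)
    {
        // ASCII fast path: a single code unit, nothing to decode.
        ++i;
        return c;
    }
    // Multi-byte sequence: std.utf.decode advances i itself and
    // throws UTFException on invalid input.
    return decode(s, i);
}

// Hypothetical driver mirroring the checksum loop of the benchmark.
ulong checksumOf(const(char)[] s)
{
    ulong sum;
    size_t i; // the only loop state, a local that can live in a register
    while (i < s.length)
        sum += frontAndPop(s, i);
    return sum;
}

void main()
{
    writeln(checksumOf("0123456789"));
    writeln(checksumOf("h\u00E9llo")); // contains one two-byte sequence
}
```

The slice itself is never modified here; only the index changes, which is the assumption this sketch makes about what "state" would mean.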
