Re: stride in slices

Meta via Digitalmars-d Mon, 04 Jun 2018 20:16:46 -0700

On Monday, 4 June 2018 at 23:08:17 UTC, Ethan wrote:

On Monday, 4 June 2018 at 18:11:47 UTC, Steven Schveighoffer wrote:
BTW, do you have cross-module inlining on?
Just to drive this point home.

https://run.dlang.io/is/nrdzb0
Manually implemented stride and fill with everything forced inline. Otherwise, the original code is unchanged.
17 ms, 891 μs, and 6 hnsecs
15 ms, 694 μs, and 1 hnsec
15 ms, 570 μs, and 9 hnsecs
My new stride outperformed std.range stride, and the manual for-loop. And, because the third test uses the new stride, it also benefited. But interestingly runs ever so slightly faster...


Just as an aside:

    ...

    pragma( inline ) @property length() const { return range.length / strideCount; }
    pragma( inline ) @property empty() const { return currFront > range.length; }
    pragma( inline ) @property ref Elem front() { return range[ currFront ]; }

    pragma( inline ) void popFront() { currFront += strideCount; }
    ...

    pragma( inline ) auto stride( Range )( Range r, int a )
    ...

    pragma( inline ) auto fill( Range, Value )( Range r, Value v )
    ...

pragma(inline), without any argument, does not force inlining. It actually does nothing; it just specifies that the "implementation's default behaviour" should be used. You have to annotate with pragma(inline, true) to force inlining (https://dlang.org/spec/pragma.html#inline).
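
To make the difference concrete, here is a minimal sketch of the three forms of the pragma. The function names (addDefault, addForced, addNever) are made up for illustration and are not from the code linked above.

```d
import std.stdio : writeln;

// Hypothetical helpers, for illustration only.

// No argument: the implementation's default behaviour is used,
// so the compiler's usual inlining heuristics decide.
pragma(inline)
int addDefault(int a, int b) { return a + b; }

// true: asks the compiler to always inline this function.
pragma(inline, true)
int addForced(int a, int b) { return a + b; }

// false: asks the compiler to never inline this function.
pragma(inline, false)
int addNever(int a, int b) { return a + b; }

void main()
{
    // All three compute the same thing; only code generation may differ.
    writeln(addDefault(1, 2) + addForced(3, 4) + addNever(5, 6));
}
```

The pragma never changes the result of a call, so the only ways to see its effect are to look at the generated code or to measure timings, as in the numbers in this thread.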

When I change all the pragma(inline) to pragma(inline, true), there is a non-trivial speedup:


14 ms, 517 μs, and 9 hnsecs
13 ms, 110 μs, and 1 hnsec
13 ms, 199 μs, and 9 hnsecs

There are further reductions using ldc-beta:

14 ms, 520 μs, and 4 hnsecs
13 ms, 87 μs, and 2 hnsecs
12 ms, 938 μs, and 8 hnsecs
