Re: Why is D slower than LuaJIT?

Andrei Alexandrescu Thu, 23 Dec 2010 07:25:39 -0800

On 12/22/10 4:04 PM, Andreas Mayer wrote:

To see what performance advantage D would give me over using a scripting 
language, I made a small benchmark. It consists of this code:

    auto L = iota(0.0, 10000000.0);
    auto L2 = map!"a / 2"(L);
    auto L3 = map!"a + 2"(L2);
    auto V = reduce!"a + b"(L3);


It runs in 281 ms on my computer.

The same code in Lua (using LuaJIT) runs in 23 ms.

That's about 10 times faster. I would have expected D to be faster. Did I do 
something wrong?

The first Lua version uses a simplified design. I thought maybe that is unfair 
to ranges, which are more complicated. You could argue ranges have more 
features and do more work. To make it fair, I made a second Lua version of the 
above benchmark that emulates ranges. It is still 29 ms fast.

The full D version is here: http://pastebin.com/R5AGHyPx
The Lua version: http://pastebin.com/Sa7rp6uz
Lua version that emulates ranges: http://pastebin.com/eAKMSWyr

Could someone help me solving this mystery?

Or is D, unlike I thought, not suitable for high performance computing? What 
should I do?

I wrote a new test bench and got 41 ms for the baseline and 220 ms for
the code based on map and iota. (Surprisingly, the extra work didn't
affect the run time, which suggests the loop is dominated by the counter
increment and test.) Then I took out the cache in map and got 136 ms.
Finally, I replaced the use of iota with iota2 and got performance equal
to that of handwritten code. Code below.

I decided to check in the map cache removal. We discussed it a fair
amount among Phobos devs. I have no doubts caching might help in certain
cases, but it does lead to surprising performance loss for simple cases
like the one tested here. See
http://www.dsource.org/projects/phobos/changeset/2231
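
To illustrate where the cost of a cache in map can come from, here is a minimal sketch. It is not the Phobos source: PlainMap, CachedMap, plainMap, cachedMap, halve, addTwo and total are hypothetical names made up for this example. The cached variant is assumed to store the mapped value on construction and on every popFront, which adds an emptiness check and a store per element; the uncached variant simply forwards to the function on each call to front.

```d
import std.range : iota;
import std.stdio : writeln;

// Hypothetical element functions mirroring map!"a / 2" and map!"a + 2".
double halve(double a) { return a / 2; }
double addTwo(double a) { return a + 2; }

// Hypothetical uncached map: front recomputes fun on every call.
struct PlainMap(alias fun, R)
{
    R input;
    @property bool empty() { return input.empty; }
    @property auto front() { return fun(input.front); }
    void popFront() { input.popFront(); }
}

// Hypothetical cached map: every popFront refills the cache, which
// costs an extra emptiness check and a store per element.
struct CachedMap(alias fun, R)
{
    R input;
    typeof(fun(R.init.front)) cache;

    this(R r) { input = r; fill(); }
    private void fill() { if (!input.empty) cache = fun(input.front); }

    @property bool empty() { return input.empty; }
    @property auto front() { return cache; }
    void popFront() { input.popFront(); fill(); }
}

// Convenience constructors so the range type is inferred.
auto plainMap(alias fun, R)(R r) { return PlainMap!(fun, R)(r); }
auto cachedMap(alias fun, R)(R r) { return CachedMap!(fun, R)(r); }

// Sums any input range of doubles, standing in for reduce!"a + b".
double total(R)(R r)
{
    double acc = 0;
    foreach (x; r)
        acc += x;
    return acc;
}

void main()
{
    // Both pipelines compute the same sum; they differ only in the
    // per-element bookkeeping done by the map layers.
    writeln(total(plainMap!addTwo(plainMap!halve(iota(0.0, 4.0)))));
    writeln(total(cachedMap!addTwo(cachedMap!halve(iota(0.0, 4.0)))));
}
```

When each element is read exactly once, as in the benchmark, the cache saves no calls to the mapped function, so the extra bookkeeping in the cached variant is pure overhead. A cache of this kind could only pay off when front is read several times per element and the mapped function is expensive.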

If the other Phobos folks approve, I'll also specialize iota for
floating point numbers to be a forward range and defer the decision on
defining a "randomAccessIota" for floating point numbers to later. That
would complete the library improvements pointed to by this test, leaving
further optimization to compiler improvements. Thanks Andreas for
starting this.



Andrei

import std.algorithm;
import std.stdio;
import std.range;
import std.traits;

struct Iota2(N, S) if (isFloatingPoint!N && isNumeric!S) {
    private N start, end, current;
    private S step;
    this(N start, N end, S step)
    {
        this.start = start;
        this.end = end;
        this.step = step;
        current = start;
    }
    /// Range primitives
    @property bool empty() const { return current >= end; }
    /// Ditto
    @property N front() { return current; }
    /// Ditto
    alias front moveFront;
    /// Ditto
    void popFront()
    {
        assert(!empty);
        current += step;
    }
    @property Iota2 save() { return this; }
}

auto iota2(B, E, S)(B begin, E end, S step)
if (is(typeof((E.init - B.init) + 1 * S.init)))
{
    return Iota2!(CommonType!(Unqual!B, Unqual!E), S)(begin, end, step);
}

void main(string[] args) {
    double result;
    auto limit = 10_000_000.0;
     if (args.length > 1) {
        writeln("iota");
        auto L = iota2(0.0, limit, 1.0);
        auto L2 = map!"a / 2"(L);
        auto L3 = map!"a + 2"(L2);
        result = reduce!"a + b"(L3);
    } else {
        writeln("baseline");
        result = 0.0;
        for (double i = 0; i != limit; ++i) {
            result += (i / 2) + 2;
        }
    }
    writefln("%f", result);
}
