On 6/10/17 5:00 AM, Honey wrote:
On Friday, 9 June 2017 at 23:10:28 UTC, Ali Çehreli wrote:
You would get the exact performance if you implemented e.g. with
pointers. Your test has been very valuable for exposing an
embarrassing performance issue. :)

I can confirm that, after changing implementation to the following,
performance is on par. It is not immediatetly clear to me how

 r[r[0 .. i].getTransitionIndex!Less(r[i]) .. i + 1]

would look like if written idiomatically.

I wrote it like this, which confirms that it's indeed bringToFront (I tried your getTransitionIndex function, but that didn't change timings at all):

void insertionSort(alias Less, Range)(Range r)
if (hasLength!Range && isRandomAccessRange!Range && hasSlicing!Range)
{
    foreach (immutable i; 1 .. r.length)
    {
auto ubElem = i - r[0 .. i].assumeSorted!Less.upperBound(r[i]).length;
        r[ubElem .. i+1].rotateRight;
    }
}

On my mac, it's completing in about 4.4 seconds, whereas the C++ version completes in around 3.

Optimized your rotateRight a bit to get better performance when we can use memmove:

void rotateRight(Range)(Range r)
    if (hasLength!Range && isRandomAccessRange!Range && hasSlicing!Range)
{
    auto t = r[$ - 1];
    static if(isDynamicArray!Range)
    {
        import core.stdc.string;
        memmove(r.ptr + 1, r.ptr, (r.length - 1) * r[0].sizeof);
    }
    else
    {
        foreach_reverse (i; 0 .. r.length - 1)
        {
            r[i + 1] = r[i];
        }
    }
    r[0] = t;
}


Brings it to exactly c++ performance :)

I'll update the bug report.

-Steve

Reply via email to