Re: My ACCU 2016 keynote video available online

Jens Müller via Digitalmars-d-announce Thu, 19 May 2016 01:16:13 -0700

On Monday, 16 May 2016 at 13:46:11 UTC, Andrei Alexandrescu wrote:

Uses D for examples, showcases Design by Introspection, and rediscovers a fast partition routine. It was quite well received. https://www.youtube.com/watch?v=AxnotgLql0k
Andrei


Nice presentation.

The code applying the sentinel optimization assumes mutability of the input. That needs to be checked for. That's fine for partition because that is assumed to be in-place. But for other algorithms it's not so obvious. It's sad that the optimization works only for non-const input. It is in conflict with the advice to make input const if the function doesn't change it. This makes the optimization less likely to be applicable. One might, though, relax the const requirement to mean "the input is identical at return of the function to its beginning". But that's a different story, I guess. Coming up with another implementation might also work, using chain or so. But typically the sentinel optimization assumes mutability.
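
To make the mutability point concrete, here is a minimal sketch of a sentinel-based linear search that checks for mutable elements in its template constraint. The name sentinelFind is hypothetical (it is not from the talk or from Phobos), and the constraint via std.traits.isMutable is just one way the check could be expressed:

```d
import std.traits : isMutable;

// Hypothetical helper, not from the talk or from Phobos.
// Returns the index of the first element equal to `needle`,
// or a.length if there is none.
size_t sentinelFind(T)(T[] a, T needle)
    if (isMutable!T) // the sentinel trick writes into `a`, so T must be mutable
{
    if (a.length == 0) return 0;
    immutable last = a.length - 1;
    T saved = a[last];          // remember the element that gets overwritten
    a[last] = needle;           // plant the sentinel in the last slot
    size_t i = 0;
    while (a[i] != needle) ++i; // no bounds test: the sentinel stops the scan
    a[last] = saved;            // restore, so the input is unchanged on return
    // Stopped on the sentinel slot: a hit only if the saved element matched.
    if (i == last && saved != needle) return a.length;
    return i;
}
```

A call such as sentinelFind([3, 1, 4, 1, 5], 4) compiles, while passing a const(int)[] is rejected at compile time. The input is identical on return, but it still has to be mutable in between, which is exactly the conflict with the const advice.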

I didn't get the idea behind sentinels for sparse dot product. I picked the smallest of the last elements (so you need bidirectional ranges) and fix up as needed. For gdc I get a speedup (baseline over new implementation) of 1.2 in the best case and >1.0 in the worst case. On average it's about 1.1, I would say. I expected more. How would you approach sentinels with the sparse dot product? Can you elaborate the idea from the video? I didn't get it.

The baseline (dot1 in the graphs) is the straightforward version:

---
size_t i = 0, j = 0;
double sum = 0;
while (i < a.length && j < b.length)
{
    if (a[i].index < b[j].index) i++;
    else if (a[i].index > b[j].index) j++;
    else
    {
        assert(a[i].index == b[j].index);
        sum += a[i].value * b[j].value;
        i++;
        j++;
    }
}
return sum;
---
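
For comparison with the baseline, here is a rough sketch of one way to read "pick the smallest of the last elements". It is not the code that was measured for the plots, and not necessarily the idea from the video. The Entry struct and the name dotSentinel are assumptions for illustration, and the indices are assumed to be strictly increasing within each vector. The vector with the smaller last index drives the loop, and the last element of the other vector acts as a natural sentinel, so only one bound is tested per iteration and nothing is written to the input:

```d
// Assumed element layout; the original code only shows .index and .value.
struct Entry { size_t index; double value; }

// Hypothetical sketch: sparse dot product testing a single loop bound.
// Assumes indices are strictly increasing within each vector.
double dotSentinel(const(Entry)[] a, const(Entry)[] b)
{
    if (a.length == 0 || b.length == 0) return 0;
    // Make `a` the vector whose last index is not larger than b's last index.
    if (a[$ - 1].index > b[$ - 1].index)
    {
        auto t = a; a = b; b = t;
    }
    size_t i = 0, j = 0;
    double sum = 0;
    // While i is valid, a[i].index <= b[$ - 1].index, so j only moves past
    // b's last element in the step that also moves i past a's last element.
    while (i < a.length)
    {
        if (a[i].index < b[j].index) i++;
        else if (a[i].index > b[j].index) j++;
        else
        {
            sum += a[i].value * b[j].value;
            i++;
            j++;
        }
    }
    return sum;
}
```

This variant needs access to the last elements (hence bidirectional ranges in the general case) but works on const input. It only removes one comparison from the loop condition, so a modest speedup would not be surprising.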

BTW the effects vary greatly for different compilers.

For example with dmd the optimized version is slowest. The baseline is best. Weird. With gdc the optimized is best and gdc's code is always faster than dmd's code. With ldc it's really strange. Slower than dmd. I assume I'm doing something wrong here.

Used compiler flags:

dmd v2.071.0
-wi -dw -g -O -inline -release -noboundscheck

gdc (crosstool-NG 203be35 - 20160205-2.066.1-e95a735b97) 5.2.0
-Wall -g -O3 -fomit-frame-pointer -finline-functions -frelease -fno-bounds-check -ffast-math

ldc (0.15.2-beta2) based on DMD v2.066.1 and LLVM 3.6.1
-wi -dw -g -O3 -enable-inlining -release -boundscheck=off

Am I missing some flags?

I uploaded my plots.
- running time https://www.scribd.com/doc/312951947/Running-Time
- speed up https://www.scribd.com/doc/312951964/Speedup

*Disclaimer*
I hope most of this makes sense but take it with a grain of salt.

Jens

PS

It seems the mailing list interface does not work. I cannot send replies anymore via mail. I wrote Brad Roberts but no answer yet.
