https://github.com/dlang/dmd/pull/6176

I'm happy to report that DMD has (finally!) gotten a significant new optimization! Specifically, it now 'slices' a two-register-wide aggregate (such as a dynamic array, which is a length and a pointer) into two register-sized variables, which enables much better enregistering.

Given the code:

void foo(int[] a, int[] b, int[] c) {
    foreach (i; 0 .. a.length)
        a[i] = b[i] + c[i];
}

the inner loop formerly compiled to:

LA:             mov     EAX,018h[ESP]
                mov     EDX,010h[ESP]
                mov     ECX,[EBX*4][EAX]
                add     ECX,[EBX*4][EDX]
                mov     ESI,020h[ESP]
                mov     [EBX*4][ESI],ECX
                inc     EBX
                cmp     EBX,01Ch[ESP]
                jb      LA

and now compiles to:

L1A:            mov     ECX,[EBX*4][EDI]
                add     ECX,[EBX*4][ESI]
                mov     0[EBX*4][EBP],ECX
                inc     EBX
                cmp     EBX,EDX
                jb      L1A
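
To make the transformation concrete, here is a rough sketch of what the slicing amounts to if you did it by hand at the source level. This is only an illustration: the name fooSliced and the local variables are made up for the example, and the actual work happens in the compiler's backend, not as a source rewrite. It assumes b and c are at least as long as a, since indexing through a raw pointer is not bounds checked.

```d
// Hand-written illustration only. fooSliced, n, pa, pb and pc are
// hypothetical names; the compiler does the equivalent internally.
void fooSliced(int[] a, int[] b, int[] c) {
    // Each int[] is a (length, pointer) pair occupying two registers'
    // worth of space. Pull out the halves the loop actually uses so
    // each one is a plain register-sized variable.
    size_t n = a.length;
    int* pa = a.ptr;
    int* pb = b.ptr;
    int* pc = c.ptr;

    // The loop now touches only register-sized values, so nothing
    // needs to be reloaded from the stack on each iteration.
    // Assumes b.length >= n and c.length >= n (no bounds checks here).
    foreach (i; 0 .. n)
        pa[i] = pb[i] + pc[i];
}
```

Once the length and pointer are separate variables, the register allocator can treat each one on its own, which is what the second listing shows: the three pointers and the length stay in registers for the whole loop.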

I've been wanting to do this for years, and finally got around to it. (I also thought of a simpler way to implement it, which helped a lot.)

Further work will go into widening the set of cases this applies to.
