Dear D-ers,

I enjoyed reading some details of incorporating AVX into math code
from Johan Engelen's programming blog post:

http://johanengelen.github.io/ldc/2016/10/11/Math-performance-LDC.html

Basically, one can use the LDC compiler to generate AVX code, nice!

While playing with some variants of his example code, I realized
that there are issues I do not understand. For example, the following
code successfully produces AVX instructions:

```d
// File here is called dotFirst.d
import ldc.attributes : fastmath;

// @fastmath applies to the declaration that follows it, here dot().
@fastmath
double dot(double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length) {
        s += a[i] * b[i];
    }
    return s;
}

double[8] x = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];
double[8] y = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];

void main()
{
    double z = 0.0;
    z = dot(x, y);
}
```
If we run:

ldc2 -c -output-s -O3 -release dotFirst.d -mcpu=haswell
echo "Results of grep ymm dotFirst.s:"
grep ymm dotFirst.s

The "grep" shows a number of vector instructions, such as:

**vfmadd132pd     160(%rcx,%rdi,8), %ymm5, %ymm1**

However, subtle changes in the code (such as moving the dot product function to a module, or even moving the array declarations to before
the dot product function) make the AVX instructions disappear!

```d
import ldc.attributes : fastmath;
@fastmath

double[8] x = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];
double[8] y = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];

double dot( double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length) {
...

```
Now a grep will not find a single **ymm**.
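
One thing that may be worth checking here (an assumption on my part, not something I have confirmed in the generated assembly): a D attribute written without a colon or braces applies only to the declaration that follows it. In the reordered file that would attach `@fastmath` to `x` rather than to `dot`. Below is a minimal sketch of the reordered file with the attribute placed directly on the function. The file name `dotSecond.d` is hypothetical.

```d
// Hypothetical file name: dotSecond.d (reordered variant; requires LDC)
import ldc.attributes : fastmath;

double[8] x = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];
double[8] y = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7];

// Attribute placed directly on the function, so it cannot attach to x.
@fastmath
double dot(double[] a, double[] b)
{
    double s = 0.0;
    foreach (size_t i; 0 .. a.length) {
        s += a[i] * b[i];
    }
    return s;
}

void main()
{
    // The static arrays convert implicitly to slices at the call site.
    double z = dot(x, y);
}
```

Rerunning the same `ldc2` and `grep ymm` commands on this file should show whether the vector instructions come back.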

It is understood that ldc needs proper alignment to be able to do the vector
instructions...

**But my question is:** how is proper alignment guaranteed, and most importantly, how is it guaranteed across code that uses modules? (There are also related stack alignment
issues: 16 bytes?)

Best Regards,
James


PS: I have come across scattered (and sometimes contradictory) bits of information on
AVX/SIMD for D. Is there a canonical source for vector info?




