I have found several ways to "fix" the latest issue, but they all boil
down to never passing an __m128d value on the call stack.  For instance,
change

static __m128d
__attribute__((noinline, unused))
test (__m128d s1, __m128d s2)

to

static __m128d test (__m128d s1, __m128d s2)

and the program works.  Similarly, change the function to

static __m128d __attribute__((noinline)) test (__m128d *s1, __m128d *s2)
{
  return _mm_add_pd (*s1, *s2);
}

and it also works.
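
For anyone who wants to try the pointer-passing variant in isolation,
here is a minimal sketch.  It is not the original test program: it
assumes the GCC/Clang vector extensions, uses a stand-in type v2d in
place of __m128d and plain + in place of _mm_add_pd so that it builds
without <emmintrin.h>, and the names v2d, test_ptr and add_pairs are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for __m128d, two doubles in a 16-byte vector. */
typedef double v2d __attribute__((vector_size (16)));

/* Takes pointers, so no vector value is passed as an argument. */
static v2d __attribute__((noinline))
test_ptr (const v2d *s1, const v2d *s2)
{
  assert (s1 != NULL && s2 != NULL);
  return *s1 + *s2;             /* element-wise add */
}

/* Hypothetical helper: adds two pairs of doubles through test_ptr().  */
static void
add_pairs (const double a[2], const double b[2], double out[2])
{
  v2d v1, v2, r;

  memcpy (&v1, a, sizeof v1);   /* load without assuming alignment of a */
  memcpy (&v2, b, sizeof v2);
  r = test_ptr (&v1, &v2);
  memcpy (out, &r, sizeof r);   /* store the two result lanes */
}
```

Calling add_pairs with {1.0, 2.0} and {10.0, 20.0} should leave
{11.0, 22.0} in out.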

Things I tried in order to force 16-byte stack alignment, none of which
worked:

1  -mstackrealign
2  -mpreferred-stack-boundary=4
3  -mincoming-stack-boundary=4
4  options 2 and 3 together
5  options 1, 2, and 3 together
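
One rough way to see what each of the options above does is to print
where the by-value arguments end up inside the callee.  The sketch below
is only a diagnostic under assumptions: v2d is again a stand-in for
__m128d using the GCC/Clang vector extensions, is_aligned_16 and
report_args are made-up names, and the compiler may copy an argument to
a local slot before its address is taken, so an "aligned" result does
not prove the incoming stack slot was aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: stand-in for __m128d, two doubles in a 16-byte vector. */
typedef double v2d __attribute__((vector_size (16)));

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
static int
is_aligned_16 (const void *p)
{
  return ((uintptr_t) p % 16) == 0;
}

/* Prints the addresses of the by-value arguments as the callee sees
   them and returns 1 if both are 16-byte aligned.  */
static int __attribute__((noinline))
report_args (v2d s1, v2d s2)
{
  int ok;

  assert (sizeof (v2d) == 16);  /* sanity check on the stand-in type */
  ok = is_aligned_16 (&s1) && is_aligned_16 (&s2);
  printf ("s1 at %p, s2 at %p: %s\n", (void *) &s1, (void *) &s2,
          ok ? "aligned" : "MISALIGNED");
  return ok;
}
```

Building this once per option set and calling report_args from main
gives a quick side-by-side comparison of the flags.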

I guess the bigger question is: why can an __m128d be passed on the call
stack reliably when -msse2 is specified, but not otherwise?  If the
compiler cannot do this reliably, shouldn't it emit an error or warning?

Thanks,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
