------- Comment #5 from rguenth at gcc dot gnu dot org 2009-06-24 11:23 -------
There are aliasing issues with your code / the intrinsics implementation:

  float data[4] = {1, 2, 3, 4};
  ...
  r = _mm_castpd_ps(_mm_load_sd((double*)(from)));

ends up loading from float data via a pointer to double. That is invalid
unless the intrinsic specifies that it should work, in which case the
implementation had better include counter-measures here:

  data[0] ={v} 1.0e+0;
  data[1] ={v} 2.0e+0;
  data[2] ={v} 3.0e+0;
  data[3] ={v} 4.0e+0;
  from.92_14 = (const double *) &data[0];
  D.25303_15 = *from.92_14;
  D.25304_16 = {D.25303_15, 0.0};
  r_17 = VIEW_CONVERT_EXPR<__m128>(D.25304_16);

I would suggest using

  typedef double __attribute__((may_alias)) aliased_double;
  r = _mm_castpd_ps(_mm_set_sd (*(aliased_double *)from));

instead.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40537
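
For illustration only, the may_alias suggestion in this comment can be sketched without the SSE intrinsics. The helper names load_low_as_double and load_low_memcpy are hypothetical and not part of the bug report. The sketch assumes GCC or Clang (may_alias is a compiler extension) and a source pointer that is suitably aligned for a double access.

```c
#include <string.h>

/* GCC/Clang extension: accesses through this type are allowed to alias
   objects of any other type, much like accesses through char. */
typedef double __attribute__((may_alias)) aliased_double;

/* Hypothetical helper: read the first 8 bytes of a float array as one
   double. Assumes `from` is aligned well enough for a double load. */
static double load_low_as_double(const float *from)
{
    return *(const aliased_double *)from;
}

/* Hypothetical helper: the same read done with memcpy, which is valid
   in standard C and needs no alignment assumption. */
static double load_low_memcpy(const float *from)
{
    double d;
    memcpy(&d, from, sizeof d);
    return d;
}
```

Both helpers should produce the same bit pattern. The result of the first can then be handed to _mm_set_sd as in the suggestion, so that the compiler no longer sees a plain double load from float storage.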