http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55829



Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-01-04
     Ever Confirmed|0                           |1



--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2013-01-04 03:38:40 UTC ---

Confirmed.  Here is a more reduced testcase:

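/* Not part of the original comment: __m128d as defined in GCC's
   emmintrin.h, so the testcase compiles standalone.  */
typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));

extern double p2[];
extern double ck[];
int chk_pd(void);

int sse3_test (void)
{
  int i = 0;
  int fail = 0;
  __m128d t1 = (__m128d){*p2, 0};
  __m128d t2 = __builtin_ia32_shufpd (t1, t1, 0);
  double p10 = p2[0];
  for (; i < 80; i += 1)
    {
      ck[0] = p10;
      __builtin_ia32_storeupd (p2, t2);
      fail += chk_pd ();
    }
}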



--- CUT ---

Note that the first dump that differs with -fno-expensive-optimizations is the ira dump.

Also note that if we replace t1/t2 with:

  __m128d t2 = (__m128d){*p2, *p2};

it works.  The difference between the two is:

(insn 17 13 7 2 (set (reg/v:V2DF 65 [ t2 ])
        (vec_concat:V2DF (reg:DF 80 [ D.1764 ])
            (reg:DF 80 [ D.1764 ]))) t6.c:11 1467 {*vec_concatv2df}
     (nil))

(insn 10 9 5 2 (set (reg/v:V2DF 63 [ t2 ])
        (vec_duplicate:V2DF (reg:DF 62 [ D.1756 ]))) t6.c:9 1466 {vec_dupv2df}
     (nil))



Note that these two RTL expressions compute exactly the same value.  Maybe we
should convert a vec_concat of the same value into a vec_duplicate, but that
is a different issue altogether and would make this ICE latent.
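
To illustrate why the two forms are interchangeable at the source level, here is
a minimal sketch.  It does not reproduce the ICE.  It assumes GCC/Clang generic
vector extensions; the names v2df, shufpd_model, dup_via_shuffle,
dup_via_concat and same_value are hypothetical and appear nowhere in the
testcase, and shufpd_model is a plain C model of the SHUFPD lane selection
rather than the real builtin:

```c
/* Generic two-double vector; stands in for __m128d (assumption).  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Portable model of SHUFPD: lane 0 is picked from a by bit 0 of imm,
   lane 1 is picked from b by bit 1 of imm.  */
v2df shufpd_model (v2df a, v2df b, int imm)
{
  v2df r;
  r[0] = a[imm & 1];
  r[1] = b[(imm >> 1) & 1];
  return r;
}

/* Form used in the reduced testcase: {x, 0} shuffled with itself, imm 0.  */
v2df dup_via_shuffle (double x)
{
  v2df t1 = { x, 0 };
  return shufpd_model (t1, t1, 0);
}

/* Form that works: build {x, x} directly.  */
v2df dup_via_concat (double x)
{
  v2df t2 = { x, x };
  return t2;
}

/* Returns 1 when both forms yield the same two lanes.  */
int same_value (double x)
{
  v2df a = dup_via_shuffle (x);
  v2df b = dup_via_concat (x);
  return a[0] == b[0] && a[1] == b[1];
}
```

With an immediate of 0 both lanes come from lane 0 of t1, so the shuffle form
is a duplicate of x, the same value the {x, x} initializer builds.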
