https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53100
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|middle-end                  |tree-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-08-06
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So in the #if 0 case we get:

  x_13 = (long int) a_12(D);
  y_15 = (long int) b_14(D);
  z_17 = (long int) c_16(D);
  t_19 = (long int) d_18(D);
  u_21 = (long int) e_20(D);
  v_23 = (long int) f_22(D);
  _1 = z_17 - x_13;
  _3 = v_23 - y_15;
  _5 = _1 w* _3;

Notice the w* (a widening multiply).

With the original case we get a plain multiply instead:

  x_9 = (__int128) a_8(D);
  y_11 = (__int128) b_10(D);
  z_13 = (__int128) c_12(D);
  t_15 = (__int128) d_14(D);
  u_17 = (__int128) e_16(D);
  v_19 = (__int128) f_18(D);
  _1 = z_13 - x_9;
  _2 = v_19 - y_11;
  _3 = _1 * _2;

If we had simplified/shortened:

  ((__int128) c_12(D)) - ((__int128) a_8(D))

to:

  (__int128)(long)(((unsigned long) c_12(D)) - ((unsigned long) a_8(D)))

we might have optimized this.

There might be another bug about having a PLUS_EXPR variant with wrapping
semantics rather than undefined overflow, which would help with the casting
issue.
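
To make the suggested shortening concrete, here is a minimal C sketch of the two forms of the subtraction. It rests on assumptions that are not in the comment itself: the parameters are taken to be plain int (the dump does not show their type), the target is LP64 so that long is 64 bits, and the compiler is GCC or a compatible one that provides __int128 and wraps on out-of-range unsigned-to-signed conversion. The function names diff_wide, diff_narrow and check_pair are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: LP64 target, so the difference of two ints fits in long. */
_Static_assert(sizeof(long) == 8, "sketch assumes a 64-bit long");

/* Wide form, as in the original dump: extend first, subtract in __int128. */
static __int128 diff_wide(int c, int a)
{
    return (__int128)c - (__int128)a;
}

/* Shortened form: subtract in unsigned long, where wraparound is well
   defined, reinterpret the result as signed long, then extend. */
static __int128 diff_narrow(int c, int a)
{
    unsigned long d = (unsigned long)c - (unsigned long)a;
    return (__int128)(long)d;
}

/* Hypothetical helper: aborts if the two forms disagree for one pair. */
static void check_pair(int c, int a)
{
    assert(diff_wide(c, a) == diff_narrow(c, a));
}
```

Under those assumptions the two forms agree for every pair of int inputs, including the extremes such as check_pair(INT_MAX, INT_MIN), because the true difference always fits in 64 bits. The subtraction is done in unsigned long rather than long so that the narrowed arithmetic never relies on signed overflow, which is the same concern the comment raises about wrapping versus undefined overflow.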