[Bug middle-end/80399] Premature optimization with unsigned

2021-12-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80399

Andrew Pinski changed:

           What    |Removed |Added
----------------------------------------------------
      Known to work|        |10.2.0, 12.0, 9.3.0
      Known to fail|        |7.3.0, 8.5.0
           Keywords|        |needs-bisection
           Severity|normal  |enhancement

--- Comment #4 from Andrew Pinski ---
Seems to have been fixed in GCC 9.

[Bug middle-end/80399] Premature optimization with unsigned

2017-04-12 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80399

wilco at gcc dot gnu.org changed:

           What    |Removed |Added
----------------------------------------------------
                 CC|        |wilco at gcc dot gnu.org

--- Comment #3 from wilco at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #2)
> Related case (but I know it goes down a different path) is:
> struct ss
> {
>   int aa;
>   int s;
> };
> 
> int
> f(int a, struct ss *rn, int i)
> {
>   return rn[i-1].s == a;
> }
> 
> Which shows up in SPEC INT.

This works for me (unlike similar cases where we fail to use loads with
offsets):

add x1, x1, x2, sxtw 3
ldr w1, [x1, -4]
cmp w1, w0
cset w0, eq
ret

That's the best possible code: when you use a[i-1], a[i] and a[i+1], you don't
want three address computations but a single shared one, with the loads/stores
using simple immediate offsets.
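
A minimal sketch of the neighbouring-access pattern described above, reusing
struct ss from comment #2. The function name sum_neighbours is hypothetical,
and the offsets in the comments assume a 4-byte int with no padding in the
struct; whether a given compiler actually shares the address computation
depends on the target and optimization level.

```c
struct ss
{
  int aa;
  int s;
};

/* Hypothetical example: three neighbouring elements read through one base. */
int
sum_neighbours(const struct ss *rn, int i)
{
  /* One address computation: rn + i * sizeof(struct ss).  */
  const struct ss *p = &rn[i];

  /* Three loads at constant byte offsets from p: -4, +4 and +12,
     assuming sizeof(int) == 4 and sizeof(struct ss) == 8.  */
  return p[-1].s + p[0].s + p[1].s;
}
```

The -4 offset for p[-1].s matches the "ldr w1, [x1, -4]" in the output above.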

[Bug middle-end/80399] Premature optimization with unsigned

2017-04-12 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80399

--- Comment #2 from Andrew Pinski ---
Related case (but I know it goes down a different path) is:
struct ss
{
  int aa;
  int s;
};

int
f(int a, struct ss *rn, int i)
{
  return rn[i-1].s == a;
}

Which shows up in SPEC INT.

[Bug middle-end/80399] Premature optimization with unsigned

2017-04-12 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80399

Richard Biener changed:

           What    |Removed     |Added
----------------------------------------------------
             Status|UNCONFIRMED |NEW
   Last reconfirmed|            |2017-04-12
     Ever confirmed|0           |1

--- Comment #1 from Richard Biener ---
Huh, something folds (t + 4294967295) * 16 to (t + 268435455) * 16.

Certainly "interesting".  I suppose we're trying to make constants cheaper,
but that's quite premature in this particular case.

Looks like it's iterating: extract_muldiv distributes the multiplication to
t * 16 + 4294967295 * 16 and fold_plusminus_mult_expr undoes it, until it
arrives at a point where the extract_muldiv iteration gives up.
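
A small sketch of why the two forms are interchangeable, assuming t is a
32-bit unsigned value (the original test case is not quoted in this thread,
so the functions below are hypothetical). 4294967295 is -1 modulo 2^32, and
multiplying by 16 discards the top four bits of the addend, so
4294967295 * 16 and 268435455 * 16 both wrap to 4294967280.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Addend as it appears before folding: 4294967295 == 0xFFFFFFFF.  */
uint32_t
scaled_original(uint32_t t)
{
  return (t + UINT32_C(4294967295)) * 16u;
}

/* Addend as reported after folding: 268435455 == 0x0FFFFFFF.  */
uint32_t
scaled_folded(uint32_t t)
{
  return (t + UINT32_C(268435455)) * 16u;
}

/* Spot-check that both forms agree modulo 2^32 on a few sample values.  */
void
check_fold_equivalence(void)
{
  const uint32_t samples[] = { 0u, 1u, 2u, 12345u,
                               UINT32_C(0x10000000), UINT32_MAX };

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert(scaled_original(samples[i]) == scaled_folded(samples[i]));
}
```

The results are identical, but the folded form no longer looks like
(t - 1) * 16, which is the shape later address-mode selection would want to
see.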