[Bug rtl-optimization/95862] Failure to optimize usage of __builtin_mul_overflow to small __int128-based check

2020-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95862

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:b13dacdfb315675803982ad5a3098f7b55e6357a

commit r11-5369-gb13dacdfb315675803982ad5a3098f7b55e6357a
Author: Jakub Jelinek 
Date:   Wed Nov 25 17:25:36 2020 +0100

testsuite: Rename test to avoid typo in its name [PR95862]

2020-11-25  Jakub Jelinek  

PR rtl-optimization/95862
* gcc.dg/builtin-artih-overflow-5.c: Renamed to ...
* gcc.dg/builtin-arith-overflow-5.c: ... this.

[Bug rtl-optimization/95862] Failure to optimize usage of __builtin_mul_overflow to small __int128-based check

2020-11-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95862

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Jakub Jelinek  ---
Fixed on the trunk.

[Bug rtl-optimization/95862] Failure to optimize usage of __builtin_mul_overflow to small __int128-based check

2020-11-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95862

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:049ce9d233e2d865dc81a5042b1c28ee21d1c9d8

commit r11-5366-g049ce9d233e2d865dc81a5042b1c28ee21d1c9d8
Author: Jakub Jelinek 
Date:   Wed Nov 25 15:42:38 2020 +0100

middle-end: __builtin_mul_overflow expansion improvements [PR95862]

The following patch adds some improvements for __builtin_mul_overflow
expansion.
One optimization is for the u1 * u2 -> sr case, as documented we normally
do:
 u1 * u2 -> sr
res = (S) (u1 * u2)
ovf = res < 0 || main_ovf (true)
where main_ovf (true) stands for jump on unsigned multiplication overflow.
If we know that the most significant bits of both operands are clear (such
as when they are zero extended from something smaller), we can
emit better code by handling it like s1 * s2 -> sr, i.e. just jump on
overflow after signed multiplication.
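
A minimal sketch of the u1 * u2 -> sr case at the source level, assuming a
32-bit int and GCC's type-generic __builtin_mul_overflow.  The function names
are hypothetical and are not taken from the patch or its testcase; the second
function only illustrates operands whose MSB is known clear because they are
zero extended from 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* General u1 * u2 -> sr case: unsigned operands, signed result. */
bool mul_uu_to_s(uint32_t u1, uint32_t u2, int32_t *res)
{
    return __builtin_mul_overflow(u1, u2, res);
}

/* Operands zero extended from 16 bits: bit 31 of each is known to be 0,
   so viewing them as signed values does not change them and the check
   is the same as a plain s1 * s2 -> sr overflow check. */
bool mul_zext16_to_s(uint16_t a, uint16_t b, int32_t *res)
{
    return __builtin_mul_overflow((int32_t) a, (int32_t) b, res);
}
```

For operands that fit in 16 bits, both functions should report the same
overflow flag and store the same result.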

Another two cases are s1 * s2 -> ur or s1 * u2 -> ur.  If we know the minimum
precision needed to encode all values of both arguments summed together
is smaller than or equal to the destination precision (such as when the two
arguments are sign (or zero) extended from half precision types), we know
that overflow happens iff one argument is negative and the other argument is
positive (not zero), because even if both have maximum possible values, the
maximum is still representable (e.g. for char * char -> unsigned short
0x7f * 0x7f = 0x3f01 and for char * unsigned char -> unsigned short
0x7f * 0xffU = 0x7e81).  As the result is unsigned, all negative results
do overflow, but they are also representable if we consider the result signed
- all of them have the MSB set.  So it is more efficient to just
do the normal multiplication in that case and compare the result, considered
as a signed value, against 0; if it is smaller, overflow happened.
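
The s1 * s2 -> ur reasoning can be checked exhaustively for the
char * char -> unsigned short example.  This is only a source-level sketch,
assuming 8-bit char, 16-bit short and a wider int; the function names are
hypothetical and the code is not the RTL expansion the patch emits.

```c
#include <stdbool.h>

/* Reference: s1 * s2 -> ur through the builtin. */
bool mul_ss_to_u_builtin(signed char a, signed char b, unsigned short *res)
{
    return __builtin_mul_overflow(a, b, res);
}

/* Cheaper form: do the normal multiplication and report the sign bit
   of the result, viewed as a signed 16-bit value, as the overflow flag. */
bool mul_ss_to_u_signbit(signed char a, signed char b, unsigned short *res)
{
    int prod = a * b;           /* promoted to int, range [-16256, 16384] */
    *res = (unsigned short) prod;
    return (short) prod < 0;    /* prod always fits in a 16-bit short */
}

/* Compare both forms over every pair of signed char values. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int a = -128; a <= 127; a++)
        for (int b = -128; b <= 127; b++) {
            unsigned short r1, r2;
            bool o1 = mul_ss_to_u_builtin((signed char) a, (signed char) b, &r1);
            bool o2 = mul_ss_to_u_signbit((signed char) a, (signed char) b, &r2);
            if (o1 != o2 || r1 != r2)
                mismatches++;
        }
    return mismatches;
}
```

Under the stated assumptions count_mismatches() is expected to return 0,
since the product is negative exactly when the unsigned result overflows.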

And the get_min_precision change is to improve the char to short handling.
There we have in the IL
  _2 = (int) arg_1(D);
a promotion to int, coming from the C integer promotions of a char or
unsigned char argument, and the caller adds a NOP_EXPR cast to short or
unsigned short.  get_min_precision used to punt on the narrowing cast, as it
handled only widening casts, but we can handle narrowing casts fine too, by
recursing on the narrowing cast operand and using its result only if it ends
up with a smaller minimal precision, which means the sign bits (or zero bits)
are duplicated in all the bits above the narrowing conversion and in at
least one bit below that.

2020-10-25  Jakub Jelinek  

PR rtl-optimization/95862
* internal-fn.c (get_min_precision): For narrowing conversion, recurse
on the operand and if the operand precision is smaller than the
current one, return that smaller precision.
(expand_mul_overflow): For s1 * u2 -> ur and s1 * s2 -> ur cases
if the sum of minimum precisions of both operands is smaller or equal
to the result precision, just perform normal multiplication and
set overflow to the sign bit of the multiplication result.  For
u1 * u2 -> sr if both arguments have the MSB known zero, use
normal s1 * s2 -> sr expansion.

* gcc.dg/builtin-artih-overflow-5.c: New test.

[Bug rtl-optimization/95862] Failure to optimize usage of __builtin_mul_overflow to small __int128-based check

2020-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95862

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Created attachment 49618
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49618&action=edit
gcc11-pr95862.patch

Untested fix.

[Bug rtl-optimization/95862] Failure to optimize usage of __builtin_mul_overflow to small __int128-based check

2020-11-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95862

Jakub Jelinek  changed:

   What|Removed |Added

  Component|tree-optimization   |rtl-optimization
   Last reconfirmed||2020-11-24
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #2 from Jakub Jelinek  ---
I don't really see why it would need to do the 128-bit multiplication at all,
it can just do ((int64_t) a * b) < 0 (aka ((uint64_t) a * b) >> 63).
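
A sketch of the equivalence suggested in this comment.  The original
reproducer is not quoted in this thread, so the shape below (int32_t operands
a and b, a uint64_t result) is an assumption, and all function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed reproducer shape: signed 32-bit operands, unsigned 64-bit result. */
bool ovf_builtin(int32_t a, int32_t b)
{
    uint64_t r;
    return __builtin_mul_overflow(a, b, &r);
}

/* The product of two 32-bit values always fits in int64_t, so the unsigned
   result overflows exactly when the signed product is negative. */
bool ovf_signed_compare(int32_t a, int32_t b)
{
    return ((int64_t) a * b) < 0;
}

/* Same test written as extracting bit 63 of the wrapped unsigned product. */
bool ovf_shift(int32_t a, int32_t b)
{
    return (((uint64_t) a * b) >> 63) != 0;
}
```

With these assumptions all three functions should agree for every pair of
inputs, and none of them needs a 128-bit multiplication.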