[Bug middle-end/48580] missed optimization: integer overflow checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #25 from Andrew Pinski --- We have this now: if (tmp.3_3 > 0) goto ; [59.00%] else goto ; [41.00%] [local count: 633507679]: _10 = _12 == 0; [local count: 1073741824]: # iftmp.2_5 = PHI <_10(3), 0(2)> I suspect what we could do is in isel change that to: iftmp.2_5 = tmp.3_3 > 0 ? _12 == 0 : 0; (which itself gets optimized to: iftmp.2_5 = (tmp.3_3 > 0) & (_12 == 0) Which should produce better code at least.
[Bug middle-end/48580] missed optimization: integer overflow checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 Andrew Pinski changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org --- Comment #24 from Andrew Pinski --- Mine. I had a patch which improves the "== 0" to just ^1. Let me see if I can finish up.
[Bug middle-end/48580] missed optimization: integer overflow checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=59708 CC||pinskia at gcc dot gnu.org --- Comment #23 from Andrew Pinski --- Also the builtins have been in GCC since GCC 5; r5-4844, PR 59708.
[Bug middle-end/48580] missed optimization: integer overflow checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #22 from Andrew Pinski --- For the original testcase in comment #0 we produce (in GCC 11+): movl%edi, %eax mull%esi seto%dl xorl%r8d, %r8d movzbl %dl, %edx testl %eax, %eax jle .L1 testl %edx, %edx sete%r8b .L1: movl%r8d, %eax ret --- CUT I have a patch which I think improves the code even more. The gimple level looks like this correctly: x.0_1 = (unsigned int) x_6(D); y.1_2 = (unsigned int) y_7(D); _11 = .MUL_OVERFLOW (x.0_1, y.1_2); tmp_8 = REALPART_EXPR <_11>; tmp.3_3 = (int) tmp_8; if (tmp.3_3 > 0) goto ; [59.00%] else goto ; [41.00%] [local count: 633507680]: _12 = IMAGPART_EXPR <_11>; _10 = _12 == 0; [local count: 1073741824]: # iftmp.2_5 = PHI <_10(3), 0(2)> Notice no divide. The _12 == 0 part really should just _12 ^ 1. After my patch (which I need to finish up) we get: movl%edi, %eax mull%esi seto%dl xorl%r8d, %r8d movzbl %dl, %edx xorl$1, %edx testl %eax, %eax cmovg %edx, %r8d movl%r8d, %eax ret Which should be exactly what you wanted or very close. There looks to be a few micro-optimizations needed still really.
[Bug middle-end/48580] missed optimization: integer overflow checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #21 from Martin von Gagern Martin.vGagern at gmx dot net --- (In reply to myself from comment #15) (In reply to comment #7) […] built-in operations where you can just say multiply two (signed) values, check whether the result fits in 31-bit unsigned and set an overflow flag accordingly. Would be easier to read, easier to maintain, and less difficult to detect. Sounds like an overall win. I'm very much in favor of builtins like these. I'd rather make the overflow indicator the return value, and transfer the result to a location indicated via a pointer. Cross reference: bug #59708 is going in this direction, motivated by a precedence set by clang. (In reply to Zack Weinberg from comment #18) I just want the code in the original bug report to be optimized. I suggested duping bug #49467 there instead, in which case this issue here could return to the originally requested goal of detecting idioms without relying on builtins.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #20 from Marc Glisse glisse at gcc dot gnu.org --- (In reply to Zack Weinberg from comment #5) Addendum: what would *you* describe as the correct C idiom for ensuring that the product of two signed integers was positive and did not overflow the range of a same-sized signed integer, assuming nothing about either multiplicand? The most natural way to do it depends on whether you have access to a wider type (and may rely on modular casts to signed integer types as guaranteed by gcc). For instance if you want a non-negative result that fits an int: return x * (unsigned long long)(y) = __INT_MAX__; or if you only care about signed overflow and not the sign of the result: long long l = x * (long long)(y); return l == (int) l; The value of the overflow flags after a multiplication doesn't seem modeled in i386.md currently (apart from clobbered), so it won't be used, but the code generated is not too horrible.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 Martin von Gagern Martin.vGagern at gmx dot net changed: What|Removed |Added CC||Martin.vGagern at gmx dot ||net --- Comment #15 from Martin von Gagern Martin.vGagern at gmx dot net 2013-02-02 14:03:12 UTC --- (In reply to comment #7) […] built-in operations where you can just say multiply two (signed) values, check whether the result fits in 31-bit unsigned and set an overflow flag accordingly. Would be easier to read, easier to maintain, and less difficult to detect. Sounds like an overall win. I'm very much in favor of builtins like these. (In reply to comment #9) arith_mul_signed_check (a, b, 32, ARITH_WRAP, overflow_p) I'd rather make the overflow indicator the return value, and transfer the result to a location indicated via a pointer. I.e. overflow_p = arith_mul_signed_check (a, b, 32, ARITH_WRAP, a_times_b) One reason is that one usually wants to check for overflow BEFORE using the result, i.e. the common use case would likely use the above as a condition of some branching construct. A second reason is that this would allow for a flavor which will not modify a_times_b in case of an overflow, which might be useful for some scenarios, particularly if the result is stored in the same location as one of the operands, thus overwriting the operand. So an unchecked a*=b could be transformed into the checked construct overflow_p = arith_mul_signed_check (a, b, 32, ARITH_CAREFUL, a) (In reply to comment #14) The trouble is that the nature of an operation is more a property of the operation than of the type I agree. The main point of all of this is optimization. And in terms of optimization, one would want to examine some flag immediately after an operation setting that flag. One would act upon the flag, and then discard it. If the information were part of the type, one would require extra storage to save those flags, which would lead to a change in the size of these types, which in turn would severely impact many other things.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #16 from Jeffrey Walton noloader at gmail dot com 2013-02-02 17:01:55 UTC --- (In reply to comment #15) I agree. The main point of all of this is optimization. And in terms of optimization, one would want to examine some flag immediately after an operation setting that flag. One would act upon the flag, and then discard it. I somewhat disagree. A program must be correct; it should be secure; and it can be efficient. I'm interested in correct and secure. If a program silently overflows, its surely not correct. If an adversary can do something with the errant result, its not secure either. What's the point in doing something wrong but doing it quickly? Jeff
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #17 from Martin von Gagern Martin.vGagern at gmx dot net 2013-02-02 18:54:43 UTC --- (In reply to comment #16) I somewhat disagree. A program must be correct; it should be secure; and it can be efficient. I'm interested in correct and secure. If a program silently overflows, its surely not correct. I'm not talking about silently ignoring overflows, quite the contrary. Always doing the correct thing leads to arbitrary size integers. Checking all (signed) arithmetic leads to -ftrapv. Checking some arithmetic might perhaps be achieved with the signalling types from comment #12, although semantics for mixed types might be problematic. The non-signalling versions will only improve things if one actually checks the additional information after the operation, which might easily be forgotten. Checking individual operations could also (and in my opinion better) be achieved with builtins, and in this case a warning could be issued if the return value indicating the overflow is ignored. Builtins might even allow using specific overflow semantics for code otherwise compiled with -ftrapv, thus increasing the usability of that flag.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #18 from Zack Weinberg zackw at panix dot com 2013-02-02 21:59:37 UTC --- I find it a little disappointing that what should have been a straightforward additional optimization has gotten totally derailed into bikeshedding of an enormous class of builtins which seem unlikely ever to gain any traction. I just want the code in the original bug report to be optimized. That would be enough.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #19 from Martin von Gagern Martin.vGagern at gmx dot net 2013-02-02 22:08:09 UTC --- Bug 49467 asked about builtins, and got duped here, so small wonder people wanting a builtin-colored bikeshed like I do end up here...
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #12 from jules at gcc dot gnu.org 2011-10-05 12:41:55 UTC --- I don't much like the idea of using builtins for operations as fundamental as integer arithmetic. How about this as a straw-man suggestion: adding new qualifiers for fat integers-with-flags, somewhat in the spirit of the embedded-C fractional/saturating types? So you might have: int x, y; void foo (void) { _Flagged int res; res = (_Flagged int) x + y; if (_Carry (res)) printf (sum carried\n); if (_Overflow (res)) printf (sum overflowed\n); } this avoids problems with global state, and allows for the programming style which (I vaguely get the impression that) people seem to want -- performing a normal integer operation, then querying for carry/overflow flags afterwards. These types wouldn't be allowed to cross ABI boundaries (although of course you could use the magic builtins _Carry/_Overflow to extract values to pass to functions), and _Overflow would only be allowed for signed types. You could also have _Borrow for subtraction maybe (whose meaning would be the inverse of _Carry). Signalling variants could look like, e.g.: void bar (void) { int res; res = (_Signalling _Flagged int) x + y; } Things would get awkward if you tried to use these new constructs in more-complicated expressions, I suppose. Anyway, just an idea.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #13 from jules at gcc dot gnu.org 2011-10-05 13:05:47 UTC --- Coming to think of it, if _Sat were allowed on plain integers too, a _Flagged _Sat int could also be queried for saturation using a similar mechanism, like: int foo (_Sat int x, _Sat int y) { return _Saturated ((_Sat _Flagged) x + y); } I'm probably getting ahead of myself :-).
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #14 from joseph at codesourcery dot com joseph at codesourcery dot com 2011-10-05 15:19:01 UTC --- On Wed, 5 Oct 2011, jules at gcc dot gnu.org wrote: I don't much like the idea of using builtins for operations as fundamental as integer arithmetic. How about this as a straw-man suggestion: adding new qualifiers for fat integers-with-flags, somewhat in the spirit of the embedded-C fractional/saturating types? So you might have: The trouble is that the nature of an operation is more a property of the operation than of the type - and the proliferation of types for what should be operations on normal types is much of the problem with what the embedded-C TR does. You could have pragmas to say that a cumulative flag for a particular scope goes in a particular variable, and that normal operations have particular overflow semantics in that scope, maybe.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 Joseph S. Myers jsm28 at gcc dot gnu.org changed: What|Removed |Added CC||noloader at gmail dot com --- Comment #11 from Joseph S. Myers jsm28 at gcc dot gnu.org 2011-06-20 09:46:59 UTC --- *** Bug 49467 has been marked as a duplicate of this bug. ***
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2011.04.13 12:11:47 CC||rguenth at gcc dot gnu.org Component|rtl-optimization|middle-end Ever Confirmed|0 |1 --- Comment #8 from Richard Guenther rguenth at gcc dot gnu.org 2011-04-13 12:11:47 UTC --- I too would see it as two pieces, pattern matching on trees to produce a canonical builtin call and RTL expansion of that for optimal target code generation (where I don't know whether we can do better than using UNSPECs). Note that usually you also want to use the result of the multiplication (if it didn't overflow), and using just a single multiplication might be even more complicated (if we need to go the UNSPEC way). For the latter reasons I think the builtins should be sth like __builtin_smul_ovfl_p (multiplication-result, op0, op1), thus pass in the multiplication result and keep the multiplication itself in the IL to also allow for regular optimizations to work on them. If the multiplication is just used as the builtin call argument expansion can get rid of it. The set of builtins with defined behavior is still useful, if not only to allow mixing of wrapv/trapv/... operations in C source. But it isn't exactly the form I'd like the operations to reside in the IL. Pattern-matching the multiplication overflow code can be tricky because of the various ways the handling of zero operands can be implemented. Implementing this entirely on the RTL side seems very tricky to me.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 --- Comment #9 from joseph at codesourcery dot com joseph at codesourcery dot com 2011-04-13 12:45:35 UTC --- On Wed, 13 Apr 2011, rguenth at gcc dot gnu.org wrote: For the latter reasons I think the builtins should be sth like __builtin_smul_ovfl_p (multiplication-result, op0, op1), thus pass in the multiplication result and keep the multiplication itself in the IL to also allow for regular optimizations to work on them. If the multiplication is just used as the builtin call argument expansion can get rid of it. My inclination is that we probably want to separate the API for users from the built-in functions or other operations used internally - possibly providing a header gccarith.h with a supported API and saying the built-in functions are subject to change and should not be used directly in source code. Maybe (inspired by the style used on C1X stdatomic.h) one could have operations like arith_mul_signed (a, b, 32, ARITH_SAT) that multiplies a and b (integers to infinite precision, producing an infinite precision result), saturates the result to 32-bit signed and returns an implementation-defined type (signed, two's complement, at least 32 bits), or arith_mul_signed_check (a, b, 32, ARITH_WRAP, overflow_p) that stores an overflow flag in the provided _Bool *. Or the overflow handling (wrap, saturate, undefined, unspecified-value, trap) could be part of the macro/function name (putting it as a separate operand is inspired by the memory_order parameters in stdatomic.h). The _check versions with explicit overflow flags could be used with any overflow handling other than undefined behavior. (So if you don't care about what the value is when the overflow flag is set - if you'll just call some error handler - then you'd use unspecified-value as the overflow handling.) On divide by zero, both division and modulo operations would have undefined behavior except for the trap case - but modulo -1 would always be defined, unlike for C INT_MIN % -1 (I'm thinking of these operations as all being defined on their own by producing an infinite precision result first, but there might also be operations such as divmod producing multiple results). It may also make sense to expose different variations of division/modulo (rounding to 0, -infinity or +infinity; a true modulo operation, rounding to -infinity, is one of those things C doesn't make easy; GCC has these as TRUNC_DIV_EXPR, FLOOR_DIV_EXPR, CEIL_DIV_EXPR and corresponding *_MOD_EXPR). Starting with addition then increasing the set of operations gradually probably makes sense. Along with all the arithmetic operations you get range check operations - take a value (in any integer type), and check whether it is in the signed/unsigned range with a given number of bits, wrapping/saturating etc. and setting overflow flags as needed. These are just like the above but with one fewer operands. The explicit number of bits (required to be an integer constant expression) is deliberate because it is sometimes useful e.g. to saturate to 16 bits, but integer promotions in C make it rather fragile to rely on 16-bit operands remaining 16-bit rather than getting implicitly promoted to 32 bits. A consequence is that operations with strange numbers of bits can be written without involving bit-field types (since the result is allowed to have more than the given number of bits). I don't expect those to be very efficient in general, but note that ARM for example has SSAT and USAT instructions to saturate a value to a given number of bits (up to 32) and set a flag if it was out of range.
[Bug middle-end/48580] missed optimization: integer overflow checks
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48580 Steven Fuerst svfuerst at gmail dot com changed: What|Removed |Added CC||svfuerst at gmail dot com --- Comment #10 from Steven Fuerst svfuerst at gmail dot com 2011-04-13 17:44:22 UTC --- There are C and x86 assembly code fragments showing how to do signed and unsigned saturating arithmetic here: http://locklessinc.com/articles/sat_arithmetic/ Although, if there were new intrinsics that allowed direct access to the overflow and carry flags from arithmetic instructions, they would be quite useful.