[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

Andrew Pinski changed:

           What    |Removed |Added
----------------------------------------------------------------------------
         Resolution|---     |FIXED
   Target Milestone|---     |5.0
             Status|NEW     |RESOLVED

--- Comment #11 from Andrew Pinski ---
(In reply to Richard Biener from comment #10)
> (In reply to Martin Liška from comment #9)
> > Can the bug be marked as resolved?
>
> Did you check?  At -Os we now produce for A
>
>         movq    $0, -16(%rsp)
>         movl    $1, %eax
>         salq    $32, %rax
>         movq    %rax, -8(%rsp)
>         movq    -16(%rsp), %rax
>         xorl    %eax, %eax
>         ret

        movabs  $0x100000000,%rax

is 10 bytes while

        mov     $0x1,%eax
        shl     $0x20,%rax

is 9 bytes.  Counting instructions on x86_64 with -Os is not always the
best thing to do :).  Yes it is fixed and even improved.
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

Richard Biener changed:

           What    |Removed             |Added
----------------------------------------------------------------------------
   Last reconfirmed|2014-12-16 00:00:00 |2018-11-19

--- Comment #10 from Richard Biener ---
(In reply to Martin Liška from comment #9)
> Can the bug be marked as resolved?

Did you check?  At -Os we now produce for A

        movq    $0, -16(%rsp)
        movl    $1, %eax
        salq    $32, %rax
        movq    %rax, -8(%rsp)
        movq    -16(%rsp), %rax
        xorl    %eax, %eax
        ret

that's not folded (the shift).  It was worse, yes, but it's not fixed yet.
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

Martin Liška changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |marxin at gcc dot gnu.org

--- Comment #9 from Martin Liška ---
Can the bug be marked as resolved?
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

Andrew Pinski changed:

           What    |Removed |Added
----------------------------------------------------------------------------
           Severity|normal  |enhancement
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Author: jakub
Date: Wed Dec 17 09:29:12 2014
New Revision: 218812

URL: https://gcc.gnu.org/viewcvs?rev=218812&root=gcc&view=rev
Log:
	PR tree-optimization/64322
	* tree-vrp.c (extract_range_from_binary_expr_1): Attempt to derive
	range for RSHIFT_EXPR even if vr0 range is not VR_RANGE or is
	symbolic.

	* gcc.dg/tree-ssa/vrp95.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp95.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vrp.c
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, is this fix acceptable to the reporter?
The explanation in the combiner is that in the first testcase you have
multiple uses of the load of the 0x100000000L constant and therefore it is
not attempted to be combined with the second use (division); changing that
is undesirable I think, combine is already expensive as is.
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 17 Dec 2014, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322
>
> --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> So, is this fix acceptable to the reporter?
> The explanation in the combiner is that in the first testcase you have
> multiple uses of the load of the 0x100000000L constant and therefore it is
> not attempted to be combined with the second use (division); changing that
> is undesirable I think, combine is already expensive as is.

True, though eventually changing this just for constants (thus (const ...)
and CONST_INT and ...) might be worth the additional overhead.  I can
imagine targets that don't support (large) immediates being pessimized
very much otherwise.
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED |NEW
   Last reconfirmed|            |2014-12-16
     Ever confirmed|0           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, I rather wonder how the store to an otherwise unused value can affect
the code.  It doesn't on the GIMPLE level.  I suppose RTL somehow invalidly
thinks that after

  b = 0;
  c = 0;

b is still zero (so it only considers the two volatiles aliasing but doesn't
consider the b volatile).

It seems to be combine optimizing the case with c = 0, not sure why it
doesn't with c = 0x100000000L.  Ah, the large constant is split to a reg.
Difference:

-Successfully matched this instruction:
-(set (reg:DI 98)
-    (ashift:DI (reg:DI 94 [ D.1849 ])
-        (const_int 1 [0x1])))
-Successfully matched this instruction:
-(set (reg:DI 97 [ a ])
-    (const_int 0 [0]))
-allowing combination of insns 10, 11 and 12
-original costs 4 + 3 + 0 = 0
-replacement costs 6 + 4 = 10
-deferring deletion of insn with uid = 10.
-modifying insn i2    11: {r98:DI=r94:DI<<0x1;clobber flags:CC;}

Otherwise I agree with Jakub that VRP should be enhanced (it's weak with
handling non-integer VR_RANGEs for most codes).  But combine probably
exposes a RTL simplification, so I wonder if we can add a similar one
(after figuring out which one applies) on the GIMPLE level.

Confirmed (at -O2 both codes are bad).
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
For VRP I'm thinking of (completely untested):

--- gcc/tree-vrp.c.jj	2014-12-01 14:57:30.000000000 +0100
+++ gcc/tree-vrp.c	2014-12-16 10:17:27.543111649 +0100
@@ -2434,6 +2434,7 @@ extract_range_from_binary_expr_1 (value_
       && code != MAX_EXPR
       && code != PLUS_EXPR
       && code != MINUS_EXPR
+      && code != RSHIFT_EXPR
       && (vr0.type == VR_VARYING
	  || vr1.type == VR_VARYING
	  || vr0.type != vr1.type
@@ -2948,6 +2949,15 @@ extract_range_from_binary_expr_1 (value_
     {
       if (code == RSHIFT_EXPR)
	{
+	  /* Even if vr0 is VARYING or otherwise not usable, we can derive
+	     useful ranges just from the shift count.  E.g.
+	     x >> 63 for signed 64-bit x is always [-1, 0].  */
+	  if (vr0.type != VR_RANGE || symbolic_range_p (&vr0))
+	    {
+	      vr0.type = type = VR_RANGE;
+	      vr0.min = vrp_val_min (expr_type);
+	      vr0.max = vrp_val_max (expr_type);
+	    }
	  extract_range_from_multiplicative_op_1 (&vr, code, &vr0, &vr1);
	  return;
	}
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #3)
> For VRP I'm thinking of (completely untested):
>
> --- gcc/tree-vrp.c.jj	2014-12-01 14:57:30.000000000 +0100
> +++ gcc/tree-vrp.c	2014-12-16 10:17:27.543111649 +0100
> @@ -2434,6 +2434,7 @@ extract_range_from_binary_expr_1 (value_
>        && code != MAX_EXPR
>        && code != PLUS_EXPR
>        && code != MINUS_EXPR
> +      && code != RSHIFT_EXPR
>        && (vr0.type == VR_VARYING
> 	  || vr1.type == VR_VARYING
> 	  || vr0.type != vr1.type
> @@ -2948,6 +2949,15 @@ extract_range_from_binary_expr_1 (value_
>      {
>        if (code == RSHIFT_EXPR)
> 	{
> +	  /* Even if vr0 is VARYING or otherwise not usable, we can derive
> +	     useful ranges just from the shift count.  E.g.
> +	     x >> 63 for signed 64-bit x is always [-1, 0].  */
> +	  if (vr0.type != VR_RANGE || symbolic_range_p (&vr0))
> +	    {
> +	      vr0.type = type = VR_RANGE;
> +	      vr0.min = vrp_val_min (expr_type);
> +	      vr0.max = vrp_val_max (expr_type);
> +	    }

Yeah, that should work.  We should probably simply handle all operation
codes that do not explicitly handle non-simple VR_RANGEs by promoting all
operands that way (also handle the single-VR_UNDEFINED op case and
VR_VARYING generally that way).  The DIV and MOD_EXPR cases look like they
would benefit from that.

> 	  extract_range_from_multiplicative_op_1 (&vr, code, &vr0, &vr1);
> 	  return;
> 	}
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
> (In reply to Jakub Jelinek from comment #3)
> > For VRP I'm thinking of (completely untested):
> >
> > --- gcc/tree-vrp.c.jj	2014-12-01 14:57:30.000000000 +0100
> > +++ gcc/tree-vrp.c	2014-12-16 10:17:27.543111649 +0100
> > @@ -2434,6 +2434,7 @@ extract_range_from_binary_expr_1 (value_
> >        && code != MAX_EXPR
> >        && code != PLUS_EXPR
> >        && code != MINUS_EXPR
> > +      && code != RSHIFT_EXPR
> >        && (vr0.type == VR_VARYING
> > 	  || vr1.type == VR_VARYING
> > 	  || vr0.type != vr1.type
> > @@ -2948,6 +2949,15 @@ extract_range_from_binary_expr_1 (value_
> >      {
> >        if (code == RSHIFT_EXPR)
> > 	{
> > +	  /* Even if vr0 is VARYING or otherwise not usable, we can derive
> > +	     useful ranges just from the shift count.  E.g.
> > +	     x >> 63 for signed 64-bit x is always [-1, 0].  */
> > +	  if (vr0.type != VR_RANGE || symbolic_range_p (&vr0))
> > +	    {
> > +	      vr0.type = type = VR_RANGE;
> > +	      vr0.min = vrp_val_min (expr_type);
> > +	      vr0.max = vrp_val_max (expr_type);
> > +	    }
>
> Yeah, that should work.  We should probably simply handle all operation
> codes that do not explicitly handle non-simple VR_RANGEs by promoting all
> operands that way (also handle the single-VR_UNDEFINED op case and
> VR_VARYING generally that way).  The DIV and MOD_EXPR cases look like
> they would benefit from that.

DIV and MOD already handle it (DIV quite similarly to this).  And from the
list of codes that extract_range_from_binary_expr_1 handles, I think
RSHIFT_EXPR is the only one that (for certain VR_RANGEs of one argument)
can decrease a VR_VARYING into something narrower and didn't handle
arbitrary ranges of the other operand yet.
[Bug tree-optimization/64322] More optimize opportunity for constant folding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64322

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
First of all, I wonder why already VRP can't handle this.  To do so it
would need to add RSHIFT_EXPR to the codes that we handle different range
kinds for (like PLUS_EXPR, BIT_AND_EXPR etc.), and simply if the second
range is range_int_cst_p and the first range is not VR_RANGE, just pretend
the first range is VR_RANGE from min to max value.  Say on the x >> 63 from
the testcase it should figure out the VR is [-1, 0] and thus for double
that [-2, 0], and for division by 0x100000000L that it must be 0.