[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 Segher Boessenkool changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |segher at gcc dot gnu.org --- Comment #12 from Segher Boessenkool --- I have a patch.
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comment #11 from Segher Boessenkool --- > Basically combine is going funny. Hey, don't blame combine, it's all simplify-rtx's doing. If in simplify_binary_operation_1 you disable the blocks after /* Minimize the number of bits set in C1, i.e. C1 := C1 & ~C2. */ and after /* If we have (ior (and (X C1) C2)), simplify this by making C1 as small as possible if C1 actually changes. */ all works as you want (this is on rs6000, pretty much the same thing): === Trying 10, 11 -> 12: Failed to match this instruction: (set (reg:SI 131) (ior:SI (zero_extend:SI (mem:QI (reg/v/f:DI 128 [ string ]) [0 *string_9(D)+0 S1 A8])) (const_int 96 [0x60]))) Successfully matched this instruction: (set (reg:SI 130) (zero_extend:SI (mem:QI (reg/v/f:DI 128 [ string ]) [0 *string_9(D)+0 S1 A8]))) Successfully matched this instruction: (set (reg:SI 131) (ior:SI (reg:SI 130) (const_int 96 [0x60]))) allowing combination of insns 10, 11 and 12 original costs 8 + 4 + 4 = 16 replacement costs 8 + 4 = 12 deferring deletion of insn with uid = 10. modifying insn i211: r130:SI=zero_extend([r128:DI]) deferring rescan insn with uid = 11. modifying insn i312: r131:SI=r130:SI|0x60 REG_DEAD r130:SI deferring rescan insn with uid = 12. === giving lbz 9,0(3) ori 9,9,0x60 cmpwi 7,9,116 bne 7,.L5 etc.
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 Andrew Pinski changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |NEW Last reconfirmed||2016-09-25 Ever confirmed|0 |1 Severity|normal |enhancement --- Comment #10 from Andrew Pinski --- This is how we expand it on aarch64: (insn 10 9 11 (set (reg:QI 81) (mem:QI (reg/v/f:DI 80 [ string ]) [0 *string_9(D)+0 S1 A8])) t.c:4 -1 (nil)) (insn 11 10 12 (set (reg:SI 82) (ior:SI (subreg:SI (reg:QI 81) 0) (const_int 32 [0x20]))) t.c:4 -1 (nil)) (insn 12 11 13 (set (reg:SI 83) (zero_extend:SI (subreg:QI (reg:SI 82) 0))) t.c:4 -1 (nil)) (insn 13 12 14 (set (reg:CC 66 cc) (compare:CC (reg:SI 83) (const_int 116 [0x74]))) t.c:4 -1 (nil)) - (set (reg:SI 83) (ior:SI (and:SI (subreg:SI (mem:QI (reg/v/f:DI 80 [ string ]) [0 *string_9(D)+0 S1 A8]) 0) (const_int 223 [0xdf])) (const_int 32 [0x20]))) Notice how the and there is 223, but really that can be still a zero_extend. Basically combine is going funny. CUT Note for 33, orr does not accept 33 so combine does not see 33 and does not change the and part around the subreg.
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 --- Comment #9 from Julian Andres Klode--- (In reply to Andrew Pinski from comment #8) > This looks like missing removal of casts. > > Note in C, char_var|32 is really the same as ((int)char_var)|32 Well. The loads of the byte are already zero-extend loads. The current code is like saying: (int) ((int)char|32) :)
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 --- Comment #8 from Andrew Pinski --- This looks like missing removal of casts. Note in C, char_var|32 is really the same as ((int)char_var)|32
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 Florian Weimer changed: What|Removed |Added CC||fw at gcc dot gnu.org --- Comment #7 from Florian Weimer --- Something similar happens on x86_64 with -Os. The OR instruction operates on %eax instead of %al: : 0: 8a 07 mov(%rdi),%al 2: 83 c8 20or $0x20,%eax 5: 3c 74 cmp$0x74,%al (Byte-wise OR would be 0x0c 0x20, one byte shorter.)
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 --- Comment #6 from Julian Andres Klode--- (In reply to Andrew Pinski from comment #4) > Note this testcase needs to be improved as I have a patch which converts a > switch with just one case and a default into anew if statement. FWIW, The issue is exactly the same with if statements, like in: int TrieCase3(const char *string) { if((string[0] | 32) == 't') { if((string[1] | 32) == 'a') { if((string[2] | 32) == 'g') { return 42; } } } return -1; }
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 --- Comment #5 from Julian Andres Klode--- Created attachment 39678 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39678=edit ppc64le Hmm, AFAICT the same seems to happen on powerpc64le: lbz 9,0(3) # Load zero ori 9,9,0x20# ors in 32 rlwinm 9,9,0,0xff # zero extend value AFAICT cmpwi 7,9,116 So far tested: good: mipsel, x86_64, armhf (thumb2) bad: aarch64, powerpc64le
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org --- Comment #4 from Andrew Pinski --- Note this testcase needs to be improved as I have a patch which converts a switch with just one case and a default into anew if statement.
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 --- Comment #3 from Julian Andres Klode--- Created attachment 39677 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39677=edit arm (thumb2) output at -O2
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 --- Comment #2 from Julian Andres Klode--- Created attachment 39676 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39676=edit Aarch64 output at -O2
[Bug target/77729] aarch64 inserts unneeded uxtb after ldrb, orr ...32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729 --- Comment #1 from Julian Andres Klode--- Created attachment 39675 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39675=edit C reproducer