[Bug middle-end/78904] zero-extracts are not effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904 --- Comment #8 from uros at gcc dot gnu.org --- Author: uros Date: Mon Dec 26 19:00:47 2016 New Revision: 243929 URL: https://gcc.gnu.org/viewcvs?rev=243929=gcc=rev Log: PR target/78904 * config/i386/i386.md (addqi_ext_1): Canonicalize insn pattern w.r.t. zero_extract RTXes. (*addqi_ext_2): Ditto. (testqi_ext_ccno_0): Canonicalize expander w.r.t. zero_extract RTXes. (testqi_ext_1_ccno): Rename from testqi_ext_ccno_0. (*testqi_ext_0): Merge with *testqi_ext_1. (*testqi_ext_1): Canonicalize insn pattern w.r.t. zero_extract RTXes. Update corresponding splitter. (*testqi_ext_2): Canonicalize insn pattern w.r.t. zero_extract RTXes. (*andqi_ext_0): Merge with *andqi_ext_1. (andqi_ext_1): Canonicalize insn pattern w.r.t. zero_extract RTXes. Rename from *andqi_ext_1. Update corresponding splitter and peephole2 patterns. (*andqi_ext_1_cc): Rename from *andqi_ext_0_cc. (*andqi_ext_2): Canonicalize insn pattern w.r.t. zero_extract RTXes. (*qi_ext_0): Merge with *andqi_ext_1. (*qi_ext_1): Canonicalize insn pattern w.r.t. zero_extract RTXes. Update corresponding splitter. (*qi_ext_2): Canonicalize insn pattern w.r.t. zero_extract RTXes. (xorqi_cc_ext_1): Canonicalize expander w.r.t. zero_extract RTXes. (xorqi_ext_1_cc): Rename from xorqi_cc_ext_1. (*xorqi_cc_ext_1): Canonicalize insn pattern w.r.t. zero_extract RTXes. Update corresponding splitter. (*xorqi_ext_1_cc): Rename from *xorqi_cc_ext_1. (isinfxf2): Update calls to renamed expanders. (isinf2): Ditto. * config/i386/i386.c (ix86_expand_fp_compare): Ditto. (ix86_emit_fp_unordered_jump): Ditto. (ix86_emit_i387_round): Ditto. testsuite/ChangeLog: PR target/78904 * gcc.target/i386/pr78904.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr78904.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.md trunk/gcc/testsuite/ChangeLog
[Bug middle-end/78904] zero-extracts are not effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904 --- Comment #7 from Segher Boessenkool --- Ah, "high byte" registers are never a separate register in the i386 backend, I see. combine would need to combine four insns to arrive at your current pattern, but it doesn't try because it thinks they are too complex. If I change this code to count ZERO_EXTRACT in nshift as well, combine does try, but comes up with (set (zero_extract:SI (reg:SI 92 [ D.1801 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (subreg:SI (plus:QI (subreg:QI (zero_extract:SI (reg/v:SI 94 [ a ]) (const_int 8 [0x8]) (const_int 8 [0x8])) 0) (subreg:QI (zero_extract:SI (reg/v:SI 95 [ b ]) (const_int 8 [0x8]) (const_int 8 [0x8])) 0)) 0)) (so it does the plus as QImode).
[Bug middle-end/78904] zero-extracts are not effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904 --- Comment #6 from Uroš Bizjak --- A better testcase: --cut here-- struct S1 { char pad1; unsigned char val; short pad2; }; struct S1 test (struct S1 a, struct S1 b) { a.val += b.val; return a; } --cut here-- compiles with -O2 to: movl%edi, %eax movl%esi, %ecx movzbl %ch, %edx movzbl %ah, %esi addl%esi, %edx movb%dl, %ah with patched compiler: movl%edi, %eax movl%esi, %edx addb%ah, %dh movl%edx, %eax
[Bug middle-end/78904] zero-extracts are not effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904 --- Comment #5 from Uroš Bizjak --- Created attachment 40405 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40405=edit Prototype patch for addqi_ext_[1,2] patterns The prototype patch compiles the testcase to: movl%edi, %edx movl%esi, %eax xorw%di, %di addb%dh, %ah movzwl %ax, %eax orl %edi, %eax
[Bug middle-end/78904] zero-extracts are not effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904 --- Comment #4 from Uroš Bizjak --- (In reply to Segher Boessenkool from comment #3) > (In reply to Uroš Bizjak from comment #2) > > No, unfortunately the above is not a valid x86 insn. x86 has two-operand > > instructions, so output has to match one of the operands. > > But these are pseudos. Please note that subreg:QI (zero_extract:SI (reg/v:SI 94 [ a ]) (const_int 8 [0x8]) (const_int 8 [0x8])) 0) describes the high-part of a 16bit register, e.g. %ah, while (reg:QI 88 [ _2 ]) describes low-part, e.g. %al The pattern from the comment #1 models %al <- %ah + %bh, which is not a valid x86 insn. > > It looks that combine prefers: > > > >(subreg:SI (plus:QI > > and > >(subreg:QI (zero_extract:SI (... op ...) > > > > while - according to the existing x86 patterns - in the past the patterns > > were seemingly combined into: > > > > (plus:SI > > and > > (zero_extract:SI (... op ...) > > So when did this change? > > Combine in general prefers smaller modes. I don't know when, perhaps with the rewrite of extzv pattern, when mode was added to these patterns? Anyway, I see that some patterns, e.g. *cmpqi_ext_1 are written using (subreg:QI (zero_extract:SI ...)), which seems consistent with your observation in Comment #1. Following your explanation, it looks that: (set (zero_extract:SI (reg:SI 92 [ D.1804 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (subreg:SI (_op_:QI ... corresponds to setting of the highpart register (%ah), and consequently arith and logic patterns in i386.md have to be rewritten.
[Bug middle-end/78904] zero-extracts are not effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904 --- Comment #3 from Segher Boessenkool --- (In reply to Uroš Bizjak from comment #2) > No, unfortunately the above is not a valid x86 insn. x86 has two-operand > instructions, so output has to match one of the operands. But these are pseudos. > It looks that combine prefers: > >(subreg:SI (plus:QI > and >(subreg:QI (zero_extract:SI (... op ...) > > while - according to the existing x86 patterns - in the past the patterns > were seemingly combined into: > > (plus:SI > and > (zero_extract:SI (... op ...) So when did this change? Combine in general prefers smaller modes.
[Bug middle-end/78904] zero-extracts are not effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904 --- Comment #2 from Uroš Bizjak --- (In reply to Segher Boessenkool from comment #1) > === > Trying 10, 9 -> 11: > Failed to match this instruction: > (parallel [ > (set (reg:QI 88 [ _2 ]) > (plus:QI (subreg:QI (zero_extract:SI (reg/v:SI 94 [ a ]) > (const_int 8 [0x8]) > (const_int 8 [0x8])) 0) > (subreg:QI (zero_extract:SI (reg/v:SI 95 [ b ]) > (const_int 8 [0x8]) > (const_int 8 [0x8])) 0))) > (clobber (reg:CC 17 flags)) > ]) > === > > and then it later tries 10, 9 -> 11 and 9, 11 -> 13 etc., but you need > 11, 10, 9 -> 13 to match your insn pattern. > > Maybe you want a pattern for the result of 10, 9 -> 11? It is a valid > insn for i386 I think? No, unfortunately the above is not a valid x86 insn. x86 has two-operand instructions, so output has to match one of the operands. It looks that combine prefers: (subreg:SI (plus:QI and (subreg:QI (zero_extract:SI (... op ...) while - according to the existing x86 patterns - in the past the patterns were seemingly combined into: (plus:SI and (zero_extract:SI (... op ...)
[Bug middle-end/78904] zero-extracts are not effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904 Segher Boessenkool changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2016-12-22 Ever confirmed|0 |1 --- Comment #1 from Segher Boessenkool --- Before combine we have === (insn 9 7 10 2 (set (subreg:SI (reg:QI 97) 0) (zero_extract:SI (reg/v:SI 95 [ b ]) (const_int 8 [0x8]) (const_int 8 [0x8]))) "78904.c":12 105 {*extzvsi} (expr_list:REG_DEAD (reg/v:SI 95 [ b ]) (nil))) (insn 10 9 11 2 (set (subreg:SI (reg:QI 98 [ a$val ]) 0) (zero_extract:SI (reg/v:SI 94 [ a ]) (const_int 8 [0x8]) (const_int 8 [0x8]))) "78904.c":12 105 {*extzvsi} (nil)) (insn 11 10 26 2 (parallel [ (set (reg:QI 88 [ _2 ]) (plus:QI (reg:QI 97) (reg:QI 98 [ a$val ]))) (clobber (reg:CC 17 flags)) ]) "78904.c":12 214 {*addqi_1} (expr_list:REG_DEAD (reg:QI 98 [ a$val ]) (expr_list:REG_DEAD (reg:QI 97) (expr_list:REG_UNUSED (reg:CC 17 flags) (nil) (insn 13 12 14 2 (set (zero_extract:SI (reg:SI 92 [ D.1801 ]) (const_int 8 [0x8]) (const_int 8 [0x8])) (subreg:SI (reg:QI 88 [ _2 ]) 0)) "78904.c":14 109 {insvsi_1} (expr_list:REG_DEAD (reg:QI 88 [ _2 ]) (nil))) === and combine then does === Trying 10, 9 -> 11: Failed to match this instruction: (parallel [ (set (reg:QI 88 [ _2 ]) (plus:QI (subreg:QI (zero_extract:SI (reg/v:SI 94 [ a ]) (const_int 8 [0x8]) (const_int 8 [0x8])) 0) (subreg:QI (zero_extract:SI (reg/v:SI 95 [ b ]) (const_int 8 [0x8]) (const_int 8 [0x8])) 0))) (clobber (reg:CC 17 flags)) ]) === and then it later tries 10, 9 -> 11 and 9, 11 -> 13 etc., but you need 11, 10, 9 -> 13 to match your insn pattern. Maybe you want a pattern for the result of 10, 9 -> 11? It is a valid insn for i386 I think?