[Bug middle-end/78904] zero-extracts are not effective

2016-12-26 Thread uros at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #8 from uros at gcc dot gnu.org ---
Author: uros
Date: Mon Dec 26 19:00:47 2016
New Revision: 243929

URL: https://gcc.gnu.org/viewcvs?rev=243929=gcc=rev
Log:
PR target/78904
* config/i386/i386.md (addqi_ext_1): Canonicalize insn pattern w.r.t.
zero_extract RTXes.
(*addqi_ext_2): Ditto.
(testqi_ext_ccno_0): Canonicalize expander w.r.t. zero_extract RTXes.
(testqi_ext_1_ccno): Rename from testqi_ext_ccno_0.
(*testqi_ext_0): Merge with *testqi_ext_1.
(*testqi_ext_1): Canonicalize insn pattern w.r.t. zero_extract RTXes.
Update corresponding splitter.
(*testqi_ext_2): Canonicalize insn pattern w.r.t. zero_extract RTXes.
(*andqi_ext_0): Merge with *andqi_ext_1.
(andqi_ext_1): Canonicalize insn pattern w.r.t. zero_extract RTXes.
Rename from *andqi_ext_1.  Update corresponding splitter and
peephole2 patterns.
(*andqi_ext_1_cc): Rename from *andqi_ext_0_cc.
(*andqi_ext_2): Canonicalize insn pattern w.r.t. zero_extract RTXes.
(*qi_ext_0): Merge with *andqi_ext_1.
(*qi_ext_1): Canonicalize insn pattern w.r.t.
zero_extract RTXes.  Update corresponding splitter.
(*qi_ext_2): Canonicalize insn pattern w.r.t.
zero_extract RTXes.
(xorqi_cc_ext_1): Canonicalize expander w.r.t. zero_extract RTXes.
(xorqi_ext_1_cc): Rename from xorqi_cc_ext_1.
(*xorqi_cc_ext_1): Canonicalize insn pattern w.r.t. zero_extract RTXes.
Update corresponding splitter.
(*xorqi_ext_1_cc): Rename from *xorqi_cc_ext_1.
(isinfxf2): Update calls to renamed expanders.
(isinf2): Ditto.
* config/i386/i386.c (ix86_expand_fp_compare): Ditto.
(ix86_emit_fp_unordered_jump): Ditto.
(ix86_emit_i387_round): Ditto.

testsuite/ChangeLog:

PR target/78904
* gcc.target/i386/pr78904.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/i386/pr78904.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.md
trunk/gcc/testsuite/ChangeLog

[Bug middle-end/78904] zero-extracts are not effective

2016-12-22 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #7 from Segher Boessenkool  ---
Ah, "high byte" registers are never a separate register in the i386
backend, I see.

combine would need to combine four insns to arrive at your current
pattern, but it doesn't try because it thinks they are too complex.
If I change this code to count ZERO_EXTRACT in nshift as well, combine
does try, but comes up with

(set (zero_extract:SI (reg:SI 92 [ D.1801 ])
(const_int 8 [0x8])
(const_int 8 [0x8]))
(subreg:SI (plus:QI (subreg:QI (zero_extract:SI (reg/v:SI 94 [ a ])
(const_int 8 [0x8])
(const_int 8 [0x8])) 0)
(subreg:QI (zero_extract:SI (reg/v:SI 95 [ b ])
(const_int 8 [0x8])
(const_int 8 [0x8])) 0)) 0))

(so it does the plus as QImode).

[Bug middle-end/78904] zero-extracts are not effective

2016-12-22 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #6 from Uroš Bizjak  ---
A better testcase:

--cut here--
struct S1
{
  char pad1;
  unsigned char val;
  short pad2;
};

struct S1 test (struct S1 a, struct S1 b)
{
  a.val += b.val;

  return a;
}
--cut here--

compiles with -O2 to:

movl%edi, %eax
movl%esi, %ecx
movzbl  %ch, %edx
movzbl  %ah, %esi
addl%esi, %edx
movb%dl, %ah

with patched compiler:

movl%edi, %eax
movl%esi, %edx
addb%ah, %dh
movl%edx, %eax

[Bug middle-end/78904] zero-extracts are not effective

2016-12-22 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #5 from Uroš Bizjak  ---
Created attachment 40405
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40405=edit
Prototype patch for addqi_ext_[1,2] patterns

The prototype patch compiles the testcase to:

movl%edi, %edx
movl%esi, %eax
xorw%di, %di
addb%dh, %ah
movzwl  %ax, %eax
orl %edi, %eax

[Bug middle-end/78904] zero-extracts are not effective

2016-12-22 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #4 from Uroš Bizjak  ---
(In reply to Segher Boessenkool from comment #3)
> (In reply to Uroš Bizjak from comment #2)
> > No, unfortunately the above is not a valid x86 insn. x86 has two-operand
> > instructions, so output has to match one of the operands.
> 
> But these are pseudos.

Please note that

subreg:QI (zero_extract:SI (reg/v:SI 94 [ a ])
(const_int 8 [0x8])
(const_int 8 [0x8])) 0)

describes the high-part of a 16bit register, e.g. %ah, while

(reg:QI 88 [ _2 ])

describes low-part, e.g. %al

The pattern from the comment #1 models %al <- %ah + %bh, which is not a valid
x86 insn.

> > It looks that combine prefers:
> > 
> >(subreg:SI (plus:QI
> > and 
> >(subreg:QI (zero_extract:SI (... op ...)
> > 
> > while - according to the existing x86 patterns - in the past the patterns
> > were seemingly combined into:
> > 
> > (plus:SI
> > and
> >   (zero_extract:SI (... op ...)
> 
> So when did this change?
> 
> Combine in general prefers smaller modes.

I don't know when, perhaps with the rewrite of extzv pattern, when mode was
added to these patterns?

Anyway, I see that some patterns, e.g. *cmpqi_ext_1 are written using

(subreg:QI (zero_extract:SI ...)), which seems consistent with your observation
in Comment #1. Following your explanation, it looks that:

(set (zero_extract:SI (reg:SI 92 [ D.1804 ])
(const_int 8 [0x8])
(const_int 8 [0x8]))
(subreg:SI (_op_:QI ...

corresponds to setting of the highpart register (%ah), and consequently arith
and logic patterns in i386.md have to be rewritten.

[Bug middle-end/78904] zero-extracts are not effective

2016-12-22 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #3 from Segher Boessenkool  ---
(In reply to Uroš Bizjak from comment #2)
> No, unfortunately the above is not a valid x86 insn. x86 has two-operand
> instructions, so output has to match one of the operands.

But these are pseudos.

> It looks that combine prefers:
> 
>(subreg:SI (plus:QI
> and 
>(subreg:QI (zero_extract:SI (... op ...)
> 
> while - according to the existing x86 patterns - in the past the patterns
> were seemingly combined into:
> 
>   (plus:SI
> and
> (zero_extract:SI (... op ...)

So when did this change?

Combine in general prefers smaller modes.

[Bug middle-end/78904] zero-extracts are not effective

2016-12-22 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

--- Comment #2 from Uroš Bizjak  ---
(In reply to Segher Boessenkool from comment #1)
> ===
> Trying 10, 9 -> 11:
> Failed to match this instruction:
> (parallel [
> (set (reg:QI 88 [ _2 ]) 
> (plus:QI (subreg:QI (zero_extract:SI (reg/v:SI 94 [ a ])
> (const_int 8 [0x8])
> (const_int 8 [0x8])) 0)
> (subreg:QI (zero_extract:SI (reg/v:SI 95 [ b ])
> (const_int 8 [0x8])
> (const_int 8 [0x8])) 0)))
> (clobber (reg:CC 17 flags))
> ])
> ===
> 
> and then it later tries 10, 9 -> 11 and 9, 11 -> 13 etc., but you need
> 11, 10, 9 -> 13 to match your insn pattern.
> 
> Maybe you want a pattern for the result of 10, 9 -> 11?  It is a valid
> insn for i386 I think?
No, unfortunately the above is not a valid x86 insn. x86 has two-operand
instructions, so output has to match one of the operands.

It looks that combine prefers:

   (subreg:SI (plus:QI
and 
   (subreg:QI (zero_extract:SI (... op ...)

while - according to the existing x86 patterns - in the past the patterns were
seemingly combined into:

(plus:SI
and
  (zero_extract:SI (... op ...)

[Bug middle-end/78904] zero-extracts are not effective

2016-12-22 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78904

Segher Boessenkool  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-12-22
 Ever confirmed|0   |1

--- Comment #1 from Segher Boessenkool  ---
Before combine we have

===
(insn 9 7 10 2 (set (subreg:SI (reg:QI 97) 0)
(zero_extract:SI (reg/v:SI 95 [ b ])
(const_int 8 [0x8])
(const_int 8 [0x8]))) "78904.c":12 105 {*extzvsi}
 (expr_list:REG_DEAD (reg/v:SI 95 [ b ])
(nil)))
(insn 10 9 11 2 (set (subreg:SI (reg:QI 98 [ a$val ]) 0)
(zero_extract:SI (reg/v:SI 94 [ a ])
(const_int 8 [0x8])
(const_int 8 [0x8]))) "78904.c":12 105 {*extzvsi}
 (nil))
(insn 11 10 26 2 (parallel [
(set (reg:QI 88 [ _2 ])
(plus:QI (reg:QI 97)
(reg:QI 98 [ a$val ])))
(clobber (reg:CC 17 flags))
]) "78904.c":12 214 {*addqi_1}
 (expr_list:REG_DEAD (reg:QI 98 [ a$val ])
(expr_list:REG_DEAD (reg:QI 97)
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)
(insn 13 12 14 2 (set (zero_extract:SI (reg:SI 92 [ D.1801 ])
(const_int 8 [0x8])
(const_int 8 [0x8]))
(subreg:SI (reg:QI 88 [ _2 ]) 0)) "78904.c":14 109 {insvsi_1}
 (expr_list:REG_DEAD (reg:QI 88 [ _2 ])
(nil)))
===

and combine then does

===
Trying 10, 9 -> 11:
Failed to match this instruction:
(parallel [
(set (reg:QI 88 [ _2 ]) 
(plus:QI (subreg:QI (zero_extract:SI (reg/v:SI 94 [ a ])
(const_int 8 [0x8])
(const_int 8 [0x8])) 0)
(subreg:QI (zero_extract:SI (reg/v:SI 95 [ b ])
(const_int 8 [0x8])
(const_int 8 [0x8])) 0)))
(clobber (reg:CC 17 flags))
])
===

and then it later tries 10, 9 -> 11 and 9, 11 -> 13 etc., but you need
11, 10, 9 -> 13 to match your insn pattern.

Maybe you want a pattern for the result of 10, 9 -> 11?  It is a valid
insn for i386 I think?