[Bug target/95632] Redundant zero extension

2022-12-27 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95632

Jeffrey A. Law  changed:

What        | Removed | Added
------------+---------+----------
Status      | NEW     | RESOLVED
Resolution  | ---     | FIXED

--- Comment #8 from Jeffrey A. Law  ---
Fixed on the trunk.

2022-12-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:2e886eef7f2b5aadb00171af868f0895b647c3a4

commit r13-4907-g2e886eef7f2b5aadb00171af868f0895b647c3a4
Author: Raphael Moreira Zinsly 
Date:   Tue Dec 27 18:29:25 2022 -0500

RISC-V: Produce better code with complex constants [PR95632] [PR106602]

gcc/ChangeLog:
        PR target/95632
        PR target/106602
        * config/riscv/riscv.md: New pattern to simulate complex
        const_int loads.

gcc/testsuite/ChangeLog:
        * gcc.target/riscv/pr95632.c: New test.
        * gcc.target/riscv/pr106602.c: New test.

2021-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs

Andrew Pinski  changed:

What      | Removed | Added
----------+---------+-------------
Severity  | normal  | enhancement

2020-06-16 Thread ubizjak at gmail dot com

--- Comment #6 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #5)
> (In reply to Mel Chen from comment #2)
> > Is it possible to pretend that we have a pattern that can match xor (reg:SI
> > 80), (reg:SI 72), 0xa001 in the combine pass?
> > Then, if the constant is too large to fit in the immediate field, it can be
> > split into two xors in the split pass.
> 
> Please note that the combine pass has its own (rather limited) splitter; it
> is documented in the second part of the "Defining How to Split Instructions"
> section. The example there deals with an instruction whose immediate is too
> large, and it looks similar to your problem.

Oh, I missed the discussion above. In this case, note that x86 implements
pre-reload splits; please see the patterns guarded by the
ix86_pre_reload_split condition.

2020-06-16 Thread ubizjak at gmail dot com

--- Comment #5 from Uroš Bizjak  ---
(In reply to Mel Chen from comment #2)
> Is it possible to pretend that we have a pattern that can match xor (reg:SI
> 80), (reg:SI 72), 0xa001 in the combine pass?
> Then, if the constant is too large to fit in the immediate field, it can be
> split into two xors in the split pass.

Please note that the combine pass has its own (rather limited) splitter; it is
documented in the second part of the "Defining How to Split Instructions"
section. The example there deals with an instruction whose immediate is too
large, and it looks similar to your problem.

2020-06-15 Thread wilson at gcc dot gnu.org

--- Comment #4 from Jim Wilson  ---
Created attachment 48737
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48737&action=edit
proof of concept patch for changing xor with a large constant

needs cleanup and testing to be useful

2020-06-15 Thread wilson at gcc dot gnu.org

--- Comment #3 from Jim Wilson  ---
It isn't possible to have patterns that match only in combine.  If we add a
pattern to accept (xor (reg) (large constant)) then it could match in any
optimization pass, and could prevent us from optimizing away redundant lui
instructions.

There is a representation issue here with constants.  If we split them early,
then optimizing redundant lui is easy.  If we split them late, then optimizing
redundant lui is hard.  There are also other optimizations that may be easy or
hard depending on whether constants are split early or late.  Currently, we
always split constants early, and changing that will have a major impact on the
code optimization, which may be good or bad, but more likely will be good for
some programs and bad for others.  I'd rather not change this as it will be a
major project to deal with the problems caused by the change.
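
To make "splitting constants early" concrete, here is a minimal Python sketch of
the usual RISC-V lui/addi decomposition of a 32-bit constant.  It is only an
illustration, not GCC's implementation, and the helper name split_lui_addi is
hypothetical.  For 0xa001 it yields the 0xa000 + 0x1 pair that shows up as
insns 8 and 9 in the combine dump quoted in comment #2.

```python
MASK32 = 0xFFFFFFFF

def split_lui_addi(c):
    """Split a 32-bit constant into (hi, lo) with hi + lo == c (mod 2**32).

    hi has its low 12 bits clear, so a single lui can load it.
    lo is a signed 12-bit immediate in [-2048, 2047], as addi expects.
    """
    c &= MASK32
    lo = c & 0xFFF
    if lo & 0x800:          # addi sign-extends its 12-bit immediate
        lo -= 0x1000
    hi = (c - lo) & MASK32  # compensate so that hi + lo == c
    return hi, lo
```

Once a constant is split like this, the lui half is an ordinary instruction
that later passes can share or delete, which is the property the paragraph
above wants to preserve.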

Hence my suggestion at RTL generation time to split xor with constants
differently.  I have a proof of concept patch for that, but it needs a lot of
cleanup to be useful, and a lot of testing to verify that it improves code more
often than it harms code.

As for ree, splitters that run after register allocation traditionally check
reload_completed, which is a global variable set near the end of the last
register allocation pass.  The split2 pass happens between reload and ree.
Maybe moving ree before split2 would help RISC-V, but it might hurt other
targets, or help some programs and hurt others.

2020-06-15 Thread bina2374 at gmail dot com

--- Comment #2 from Mel Chen  ---
(In reply to Jim Wilson from comment #1)
> We sign extend HImode constants as that is the natural thing to do to make
> arithmetic work.  This does mean that unsigned short logical operations need
> a zero extend after the operation which might otherwise be unnecessary. 
> This can't be handled at rtl generation time as we don't know if the
> constant will be used for arithmetic or logicals or signed or unsigned.  But
> maybe an optimization pass could go over the code and convert HImode
> constants to signed or unsigned as appropriate to reduce the number of
> sign/zero extend operations.  We have the ree pass that we might be able to
> extend to handle this.

Extending the ree pass is a good approach, but it currently seems to scan only
XXX_extend.  Because the zero_extend has been split into 2 shift instructions
before the ree pass, do we need to keep the zero_extend until the ree pass?  Or
is there another way to know that the shift pair was a zero_extend?
> 
> Handling this in combine requires a 4->3 splitter which is something combine
> doesn't do.  We could work around that by not splitting constants before
> combine, but that would be a major change and probably not beneficial, as we
> wouldn't be able to easily optimize the high part of the constants anymore.

I agree. This approach is a bit risky.
> 
> Another approach here might be to split the xor along with the constant.  If
> we generated something like
>   srli  a0,a0,1
>   xori  a0,a0,1
>   li    a5,-24576
>   xor   a0,a0,a5
> then we can optimize away the following zero extend with a 3->2 splitter
> which combine already supports via find_split_point.  We can still optimize
> the high part of the constant. Since the immediates are sign extended, if
> the low part of the immediate has the sign bit set, we would have to invert
> the high part of the immediate to get the right result.  At least I think
> that works; I haven't double checked it yet.  For or, this only works if the
> low part doesn't have the sign bit set.  For and, it only works if the low
> part does have the sign bit set.

I'm not sure how difficult it is to split one xor into two xors before the
combine pass, but I have another proposal.

The following is the combine dump:
Trying 8, 9, 10 -> 11:
    8: r79:SI=0xa000
    9: r78:SI=r79:SI+0x1
      REG_DEAD r79:SI
      REG_EQUAL 0xa001
   10: r77:SI=r72:SI^r78:SI
      REG_DEAD r78:SI
      REG_DEAD r72:SI
   11: r80:SI=zero_extend(r77:SI#0)
      REG_DEAD r77:SI
Failed to match this instruction:
(set (reg:SI 80)
    (xor:SI (reg:SI 72 [ _4 ])
        (const_int 40961 [0xa001])))

Is it possible to pretend that we have a pattern that can match xor (reg:SI
80), (reg:SI 72), 0xa001 in the combine pass?
Then, if the constant is too large to fit in the immediate field, it can be
split into two xors in the split pass.

2020-06-11 Thread wilson at gcc dot gnu.org

Jim Wilson  changed:

What             | Removed     | Added
-----------------+-------------+------------
Last reconfirmed |             | 2020-06-12
Status           | UNCONFIRMED | NEW
Ever confirmed   | 0           | 1

--- Comment #1 from Jim Wilson  ---
We sign extend HImode constants as that is the natural thing to do to make
arithmetic work.  This does mean that unsigned short logical operations need a
zero extend after the operation which might otherwise be unnecessary.  This
can't be handled at rtl generation time as we don't know if the constant will
be used for arithmetic or logicals or signed or unsigned.  But maybe an
optimization pass could go over the code and convert HImode constants to signed
or unsigned as appropriate to reduce the number of sign/zero extend operations.
 We have the ree pass that we might be able to extend to handle this.
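
The effect described above can be modelled with a small Python sketch.  It
assumes a 32-bit register for simplicity, and the helper names (sext16, zext16,
xor_ushort_const) are hypothetical.  It shows that xor with a sign-extended
HImode constant sets the upper register bits whenever bit 15 of the constant is
set, which is why a zero extend follows the operation.

```python
MASK32 = 0xFFFFFFFF

def sext16(v):
    """Sign-extend the low 16 bits of v to 32 bits."""
    v &= 0xFFFF
    return (v | 0xFFFF0000) if v & 0x8000 else v

def zext16(v):
    """Zero-extend the low 16 bits of v."""
    return v & 0xFFFF

def xor_ushort_const(x, c):
    """Model `x ^ c` for an unsigned short x held in a 32-bit register.

    Returns (raw, fixed): raw is the register after xor with the
    sign-extended constant, fixed is raw after the trailing zero extend.
    """
    raw = (zext16(x) ^ sext16(c)) & MASK32
    return raw, zext16(raw)
```

With c = 0xa001 the raw value has garbage in the upper half and the zero
extend is needed; with c = 0x2001 raw and fixed are identical, so the extend
would be redundant.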

Handling this in combine requires a 4->3 splitter which is something combine
doesn't do.  We could work around that by not splitting constants before
combine, but that would be a major change and probably not beneficial, as we
wouldn't be able to easily optimize the high part of the constants anymore.

Another approach here might be to split the xor along with the constant.  If we
generated something like
        srli  a0,a0,1
        xori  a0,a0,1
        li    a5,-24576
        xor   a0,a0,a5
then we can optimize away the following zero extend with a 3->2 splitter which
combine already supports via find_split_point.  We can still optimize the high
part of the constant. Since the immediates are sign extended, if the low part
of the immediate has the sign bit set, we would have to invert the high part of
the immediate to get the right result.  At least I think that works; I haven't
double checked it yet.  For or, this only works if the low part doesn't have
the sign bit set.  For and, it only works if the low part does have the sign
bit set.
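
The xor case can be checked with a short Python sketch.  This is an
illustration under stated assumptions, not the proof-of-concept patch: it
assumes 32-bit registers and takes the constant as its zero-extended value,
and the helper names (sext12, split_xor_const, xor_via_split) are
hypothetical.  The low 12 bits go into an xori immediate, which the hardware
sign extends, and the high part is inverted whenever that immediate has its
sign bit set.

```python
MASK32 = 0xFFFFFFFF

def sext12(v):
    """Sign-extend the low 12 bits of v to 32 bits, as xori does."""
    v &= 0xFFF
    return (v | 0xFFFFF000) if v & 0x800 else v

def split_xor_const(c):
    """Return (lo, hi) such that sext12(lo) ^ hi == c (mod 2**32).

    lo is the 12-bit xori immediate.  hi has its low 12 bits clear, and
    its upper 20 bits are inverted when lo has the sign bit set.
    """
    c &= MASK32
    lo = c & 0xFFF
    hi = (c ^ sext12(lo)) & MASK32
    return lo, hi

def xor_via_split(x, c):
    """Compute x ^ c as an xori followed by a xor with the high part."""
    lo, hi = split_xor_const(c)
    x = (x ^ sext12(lo)) & MASK32  # xori
    return (x ^ hi) & MASK32       # lui + xor
```

For 0xa001 the split is (0x001, 0xa000), and for a 16-bit input the result
stays within 16 bits, so no zero extend is required afterwards.  For 0xa801
the low part has its sign bit set and the high part becomes 0xffff5000, the
inverted form mentioned above.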