> > while working on the GCN port I ended up with many redundant register copies
> > of the form
> >  mov reg, exec
> >  do something
> >  mov reg, exec
> >  do something
> >  ...
> > these copies are generated by LRA because exec is a small register class and
> > needs a lot of reloading (it could be improved too, but I do not care
> > because I want to handle exec specially later anyway).
> > 
> > I was however surprised this garbage survives postreload optimizations.  It
> > is easy to fix in regcprop which already does some noop copy elimination,
> > but only of the form mov reg, reg after substituting.
> Right, this ought to be dealt with during postreload CSE, there is roughly
> the same code as yours:
> /* See whether a single set SET is a noop.  */
> static int
> reload_cse_noop_set_p (rtx set)
> {
>   if (cselib_reg_set_mode (SET_DEST (set)) != GET_MODE (SET_DEST (set)))
>     return 0;
>   return rtx_equal_for_cselib_p (SET_DEST (set), SET_SRC (set));
> }
> Any idea about why this doesn't work in your case?

Thanks for pointing that code out.  I looked for noop_set in postreload
passes and found one in regcprop first.

My case is not optimized because my IRA move pattern contains a USE that is
not handled by postreload CSE.  I am testing the attached patch and plan to
commit it as obvious.
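
To make the failure mode concrete, here is a toy model of the PARALLEL scan,
not the real GCC code.  The toy_* names and the struct are hypothetical
stand-ins for rtx, and toy_noop_set_p only approximates
reload_cse_noop_set_p by comparing register numbers.  Requiring at least one
SET is a choice of this sketch.  The accept_use flag stands for the check the
patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; not GCC's real rtx representation.  */
enum toy_code { TOY_SET, TOY_CLOBBER, TOY_USE, TOY_OTHER };

struct toy_part
{
  enum toy_code code;
  int dest;  /* Register written; meaningful only for TOY_SET.  */
  int src;   /* Register read; meaningful only for TOY_SET.  */
};

/* Hypothetical approximation of reload_cse_noop_set_p: a set is a noop
   only when it copies a register to itself.  */
static bool
toy_noop_set_p (const struct toy_part *p)
{
  return p->code == TOY_SET && p->dest == p->src;
}

/* Return true if every part is a noop SET, a CLOBBER, or (when
   ACCEPT_USE is set) a USE, and at least one SET was seen.  */
static bool
toy_parallel_is_noop (const struct toy_part *parts, size_t n,
                      bool accept_use)
{
  assert (parts != NULL && n > 0);
  bool saw_noop_set = false;

  for (size_t i = 0; i < n; i++)
    {
      const struct toy_part *p = &parts[i];
      if (p->code == TOY_SET)
        {
          if (!toy_noop_set_p (p))
            return false;
          saw_noop_set = true;
        }
      else if (p->code != TOY_CLOBBER
               && !(accept_use && p->code == TOY_USE))
        return false;
    }
  return saw_noop_set;
}
```

With accept_use false, a self-copy accompanied by a USE is rejected, which
mirrors why my move pattern was left alone; with accept_use true it is
recognized as a noop.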

regcprop uses single_set, which may be a bit more natural, but it won't be
stronger here because REG_DEAD notes are not yet computed, and the current way
noop moves are discovered runs in parallel with simplification, which is
probably a bit cheaper.

I think elimination at both places makes sense: postreload CSE is run before
splitting and regcprop afterwards.  It seems that at least for x86 we get quite
a few noop moves by splitting.  I will get statistics from x86_64 bootstrap for
the regcprop part of the elimination.

        * postreload.c (reload_cse_simplify): Also accept USE in noop move.

diff --git a/gcc/postreload.c b/gcc/postreload.c
index 61c1ce8..4f3a526 100644
--- a/gcc/postreload.c
+++ b/gcc/postreload.c
@@ -153,7 +153,8 @@ reload_cse_simplify (rtx_insn *insn, rtx testreg)
                  value = SET_DEST (part);
-         else if (GET_CODE (part) != CLOBBER)
+         else if (GET_CODE (part) != CLOBBER
+                  && GET_CODE (part) != USE)

