On Tue, Jun 20, 2023 at 10:15:37AM +0200, Richard Biener wrote:
> On Mon, Jun 19, 2023 at 9:45 PM Jakub Jelinek via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > On Mon, Jun 19, 2023 at 09:10:53PM +0200, André Günther via Gcc wrote:
> > > I noticed that a simple function like
> > > auto relu( float x ) {
> > >     return x > 0.f ? x : 0.f;
> > > }
> > > compiles to different ASM using GCC11 (or lower) and GCC12 (or higher). On
> > > -O3 -mavx2 the former compiles above function to
> >
> > Such reports should go into gcc.gnu.org/bugzilla/, not to the mailing list,
> > if you are convinced that loading the constant from memory is faster.
> > Another possibility is
> >         vxorps xmm1, xmm1, xmm1
> >         vmaxss xmm0, xmm0, xmm1
> >         ret
> > which doesn't need to wait for the memory.
> > This changed with https://gcc.gnu.org/r12-7693
> 
> I guess we previously were able to see that one operand of
> the comparison was not NaN.  Maybe some REG_EQUAL
> note can come to the rescue here?

ce1 pass results in emit_conditional_move with
(gt (reg/v:SF 83 [ x ]) (reg:SF 84)), (reg/v:SF 83 [ x ]), (reg:SF 84)
operands in the GCC 11 case and so is successfully matched by
ix86_expand_fp_movcc as ix86_expand_sse_fp_minmax.
But, in GCC 12+, emit_conditional_move is called with
(gt (reg/v:SF 83 [ x ]) (reg:SF 84)), (reg/v:SF 83 [ x ]), (const_double:SF 0.0 
[0x0.0p+0])
instead (reg:SF 84 in both cases contains the (const_double:SF 0.0 [0x0.0p+0])
value, in the GCC 11 case loaded from memory, in the GCC 12+ case set
directly in a move.  The reason for the difference is exactly that
because cheap SSE constant can be moved directly into a reg, it is done so
instead of reusing a pseudo that contains that value already.
In the latter case ix86_expand_fp_movcc is called even not with the
const_double because the expander doesn't allow immediates, but with
it forced into some other register, so it can't really find out it is
actually a minmax.  Even if it allowed the cheap SSE constants,
it wouldn't know that r84 is also zero (unless the expander checks that
it is a pseudo with a single setter and verifies it or something similar).

        Jakub

Reply via email to