Re: [PATCH v1] [RFC] Improve folding for comparisons with zero in tree-ssa-forwprop.

Manolis Tsamis Wed, 02 Aug 2023 07:08:40 -0700

Hi all,

I'm pinging to discuss again if we want to move this forward for GCC14.


I did some testing again and I haven't been able to find obvious
regressions, including testing the code from PR86270 and PR70359 that
Richard mentioned.
I still believe that zero can be considered a special case even for
hardware that doesn't directly benefit in the comparison.
For example it happens that the testcase from the commit compiles to
one instruction less in x86:

.LFB0:
    movl    (%rdi), %eax
    leal    1(%rax), %edx
    movl    %edx, (%rdi)
    testl    %eax, %eax
    je    .L4
    ret
.L4:
    jmp    g

vs

.LFB0:
    movl    (%rdi), %eax
    addl    $1, %eax
    movl    %eax, (%rdi)
    cmpl    $1, %eax
    je    .L4
    ret
.L4:
    xorl    %eax, %eax
    jmp    g

(The xorl is not emitted  when testl is used. LLVM uses testl but also
does xor eax, eax :) )
Although this is accidental, I believe it also showcases that zero is
a preferential value in various ways.

I'm running benchmarks comparing the effects of this change and I'm
also still looking for testcases that result in problematic
regressions.
Any feedback or other concerns about this are appreciated!

Thanks,
Manolis

On Wed, Apr 26, 2023 at 9:43 AM Richard Biener
<[email protected]> wrote:
>
> On Wed, Apr 26, 2023 at 4:30 AM Jeff Law <[email protected]> wrote:
> >
> >
> >
> > On 4/25/23 01:21, Richard Biener wrote:
> > > On Tue, Apr 25, 2023 at 1:05 AM Jeff Law <[email protected]> wrote
> > >>
> > >>
> > >>
> > >>
> > >> On 4/24/23 02:06, Richard Biener via Gcc-patches wrote:
> > >>> On Fri, Apr 21, 2023 at 11:01 PM Philipp Tomsich
> > >>> <[email protected]> wrote:
> > >>>>
> > >>>> Any guidance on the next steps for this patch?
> > >>>
> > >>> I think we want to perform this transform later, in particular when
> > >>> the test is a loop exit test we do not want to do it as it prevents
> > >>> coalescing of the IV on the backedge at out-of-SSA time.
> > >>>
> > >>> That means doing the transform in folding and/or before inlining
> > >>> (the test could become a loop exit test) would be a no-go.  In fact
> > >>> for SSA coalescing we'd want the reverse transform in some cases, see
> > >>> PRs 86270 and 70359.
> > >>>
> > >>> If we can reliably undo for the loop case I suppose we can do the
> > >>> canonicalization to compare against zero.  In any case please split
> > >>> up the patch (note
> > >> I've also
> > >>> hoped we could eventually get rid of that part of
> > >>> tree-ssa-forwprop.cc
> > >> in favor
> > >>> of match.pd patterns since it uses GENERIC folding :/).
> > >>>
> > >> Do we have enough information to do this at expansion time?  That would
> > >> avoid introducing the target dependencies to drive this in gimple.
> > >
> > > I think so, but there isn't any convenient place to do this I think.  I 
> > > suppose
> > > there's no hope to catch it during RTL opts?
> > Combine would be the most natural place in the RTL pipeline, but it'd be
> > a 2->2 combination which would be rejected.
> >
> > We could possibly do it as a define_insn_and_split, but the gimple->RTL
> > interface seems like a better fit to me.  If TER has done its job, we
> > should see a complex enough tree node to do the right thing.
>
> Of course we'd want to get rid of TER in favor of ISEL
>
> Richard.
>
> > jeff

Re: [PATCH v1] [RFC] Improve folding for comparisons with zero in tree-ssa-forwprop.

Reply via email to