On Wed, Aug 2, 2023 at 4:08 PM Manolis Tsamis <manolis.tsa...@vrull.eu> wrote:
>
> Hi all,
>
> I'm pinging to discuss again if we want to move this forward for GCC14.
>
> I did some testing again and I haven't been able to find obvious
> regressions, including testing the code from PR86270 and PR70359 that
> Richard mentioned.
> I still believe that zero can be considered a special case even for
> hardware that doesn't directly benefit in the comparison.
> For example it happens that the testcase from the commit compiles to
> one instruction less in x86:
>
> .LFB0:
>     movl    (%rdi), %eax
>     leal    1(%rax), %edx
>     movl    %edx, (%rdi)
>     testl    %eax, %eax
>     je    .L4
>     ret
> .L4:
>     jmp    g
>
> vs
>
> .LFB0:
>     movl    (%rdi), %eax
>     addl    $1, %eax
>     movl    %eax, (%rdi)
>     cmpl    $1, %eax
>     je    .L4
>     ret
> .L4:
>     xorl    %eax, %eax
>     jmp    g
>
> (The xorl is not emitted  when testl is used. LLVM uses testl but also
> does xor eax, eax :) )
> Although this is accidental, I believe it also showcases that zero is
> a preferential value in various ways.
>
> I'm running benchmarks comparing the effects of this change and I'm
> also still looking for testcases that result in problematic
> regressions.
> Any feedback or other concerns about this are appreciated!

My comment from Apr 24th still holds, IMO this is something for
instruction selection (aka the ISEL pass) or the out-of-SSA tweaks
we do during RTL expansion (see insert_backedge_copies)

Richard.

> Thanks,
> Manolis
>
> On Wed, Apr 26, 2023 at 9:43 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Wed, Apr 26, 2023 at 4:30 AM Jeff Law <jeffreya...@gmail.com> wrote:
> > >
> > >
> > >
> > > On 4/25/23 01:21, Richard Biener wrote:
> > > > On Tue, Apr 25, 2023 at 1:05 AM Jeff Law <jeffreya...@gmail.com> wrote
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On 4/24/23 02:06, Richard Biener via Gcc-patches wrote:
> > > >>> On Fri, Apr 21, 2023 at 11:01 PM Philipp Tomsich
> > > >>> <philipp.toms...@vrull.eu> wrote:
> > > >>>>
> > > >>>> Any guidance on the next steps for this patch?
> > > >>>
> > > >>> I think we want to perform this transform later, in particular when
> > > >>> the test is a loop exit test we do not want to do it as it prevents
> > > >>> coalescing of the IV on the backedge at out-of-SSA time.
> > > >>>
> > > >>> That means doing the transform in folding and/or before inlining
> > > >>> (the test could become a loop exit test) would be a no-go.  In fact
> > > >>> for SSA coalescing we'd want the reverse transform in some cases, see
> > > >>> PRs 86270 and 70359.
> > > >>>
> > > >>> If we can reliably undo for the loop case I suppose we can do the
> > > >>> canonicalization to compare against zero.  In any case please split
> > > >>> up the patch (note
> > > >> I've also
> > > >>> hoped we could eventually get rid of that part of
> > > >>> tree-ssa-forwprop.cc
> > > >> in favor
> > > >>> of match.pd patterns since it uses GENERIC folding :/).
> > > >>>
> > > >> Do we have enough information to do this at expansion time?  That would
> > > >> avoid introducing the target dependencies to drive this in gimple.
> > > >
> > > > I think so, but there isn't any convenient place to do this I think.  I 
> > > > suppose
> > > > there's no hope to catch it during RTL opts?
> > > Combine would be the most natural place in the RTL pipeline, but it'd be
> > > a 2->2 combination which would be rejected.
> > >
> > > We could possibly do it as a define_insn_and_split, but the gimple->RTL
> > > interface seems like a better fit to me.  If TER has done its job, we
> > > should see a complex enough tree node to do the right thing.
> >
> > Of course we'd want to get rid of TER in favor of ISEL
> >
> > Richard.
> >
> > > jeff

Reply via email to