On Wed, Aug 2, 2023 at 4:08 PM Manolis Tsamis <manolis.tsa...@vrull.eu> wrote: > > Hi all, > > I'm pinging to discuss again if we want to move this forward for GCC14. > > I did some testing again and I haven't been able to find obvious > regressions, including testing the code from PR86270 and PR70359 that > Richard mentioned. > I still believe that zero can be considered a special case even for > hardware that doesn't directly benefit in the comparison. > For example it happens that the testcase from the commit compiles to > one instruction less in x86: > > .LFB0: > movl (%rdi), %eax > leal 1(%rax), %edx > movl %edx, (%rdi) > testl %eax, %eax > je .L4 > ret > .L4: > jmp g > > vs > > .LFB0: > movl (%rdi), %eax > addl $1, %eax > movl %eax, (%rdi) > cmpl $1, %eax > je .L4 > ret > .L4: > xorl %eax, %eax > jmp g > > (The xorl is not emitted when testl is used. LLVM uses testl but also > does xor eax, eax :) ) > Although this is accidental, I believe it also showcases that zero is > a preferential value in various ways. > > I'm running benchmarks comparing the effects of this change and I'm > also still looking for testcases that result in problematic > regressions. > Any feedback or other concerns about this are appreciated!
My comment from Apr 24th still holds, IMO this is something for instruction selection (aka the ISEL pass) or the out-of-SSA tweaks we do during RTL expansion (see insert_backedge_copies) Richard. > Thanks, > Manolis > > On Wed, Apr 26, 2023 at 9:43 AM Richard Biener > <richard.guent...@gmail.com> wrote: > > > > On Wed, Apr 26, 2023 at 4:30 AM Jeff Law <jeffreya...@gmail.com> wrote: > > > > > > > > > > > > On 4/25/23 01:21, Richard Biener wrote: > > > > On Tue, Apr 25, 2023 at 1:05 AM Jeff Law <jeffreya...@gmail.com> wrote > > > >> > > > >> > > > >> > > > >> > > > >> On 4/24/23 02:06, Richard Biener via Gcc-patches wrote: > > > >>> On Fri, Apr 21, 2023 at 11:01 PM Philipp Tomsich > > > >>> <philipp.toms...@vrull.eu> wrote: > > > >>>> > > > >>>> Any guidance on the next steps for this patch? > > > >>> > > > >>> I think we want to perform this transform later, in particular when > > > >>> the test is a loop exit test we do not want to do it as it prevents > > > >>> coalescing of the IV on the backedge at out-of-SSA time. > > > >>> > > > >>> That means doing the transform in folding and/or before inlining > > > >>> (the test could become a loop exit test) would be a no-go. In fact > > > >>> for SSA coalescing we'd want the reverse transform in some cases, see > > > >>> PRs 86270 and 70359. > > > >>> > > > >>> If we can reliably undo for the loop case I suppose we can do the > > > >>> canonicalization to compare against zero. In any case please split > > > >>> up the patch (note > > > >> I've also > > > >>> hoped we could eventually get rid of that part of > > > >>> tree-ssa-forwprop.cc > > > >> in favor > > > >>> of match.pd patterns since it uses GENERIC folding :/). > > > >>> > > > >> Do we have enough information to do this at expansion time? That would > > > >> avoid introducing the target dependencies to drive this in gimple. > > > > > > > > I think so, but there isn't any convenient place to do this I think. I > > > > suppose > > > > there's no hope to catch it during RTL opts? > > > Combine would be the most natural place in the RTL pipeline, but it'd be > > > a 2->2 combination which would be rejected. > > > > > > We could possibly do it as a define_insn_and_split, but the gimple->RTL > > > interface seems like a better fit to me. If TER has done its job, we > > > should see a complex enough tree node to do the right thing. > > > > Of course we'd want to get rid of TER in favor of ISEL > > > > Richard. > > > > > jeff