https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125557

--- Comment #10 from ktkachov at gcc dot gnu.org ---
(In reply to Drea Pinski from comment #9)
> (In reply to Drea Pinski from comment #8)
> > (In reply to ktkachov from comment #7)
> > > (In reply to Drea Pinski from comment #6)
> > > > Note the final ifcvt part needs to have a decent cost model; right now
> > > > your patches don't have one and it is not obvious if it is always
> > > > better especially on targets which don't have a cmov (or cmov like
> > > > instruction; e.g riscv's czero.nez/czero.eqz). We need much more time
> > > > on figuring that part out. Either we should improve ifcvt or have
> > > > something in the last phiopt.
> > > > Having it as part of sink breaks what sink is doing.
> > > 
> > > Yeah, that makes sense. With your latest patches at
> > > https://gcc.gnu.org/pipermail/gcc-patches/2026-June/720191.html we do get
> > > the right factoring on Snappy. RTL ifcvt then does the final ifcvt but
> > > only at -O2. At -O3 -fsplit-paths messes up the basic blocks.
> > 
> > If split paths is messing it up, then poor_ifcvt_pred in
> > gimple-ssa-split-paths.cc should be tweaked more; I added this function in
> > 2024 (r15-3436-gb2b20b277988ab) to stop split pathes from splitting
> > ifcvtable things because it was doing exactly that.
> 
> Or split paths need to be moved later after phiopt. split paths has
> definitely been a trouble pass. It was also originally moved from rtl to
> gimple and when that was done it was moved right after the loop
> optimizations but before ifcvt. It was moved there because the idea was
> there could be some cleanups done after the patch was split. But I am not so
> sure.
> 
> Anyways see PR 68541, PR 112402, PR 120892.
> 
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120892#c3 which talks about
> maybe removing it.
> 
> Maybe you can do some benchmarking of removing it? and if your data backs it
> up; I think proposing to removing it might be a good thing.

So it turns out -fsplit-paths messes up the if-conversion in the reduced
testcase here but doesn't do anything on the full Snappy workload. The
if-conversion doesn't happen there for other reasons. I'll see if I can
extract a new testcase for that. But it looks like we'll still need to teach
something in the pipeline to do the if-conversion, be it RTL ifcvt or some
phiopt pass.
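
For readers following along, here is a minimal source-level sketch of the shapes being discussed. It is not the reduced testcase from this PR; the function names and the loop are hypothetical, and a real compiler may well normalize all three forms to the same code. The first function has an if/else diamond feeding a join block, the second shows the join block duplicated into both arms (roughly what split-paths does), and the third is the select form that if-conversion aims for.

```c
/* Hypothetical illustration only; not the testcase from this PR. */
#include <stddef.h>

/* Diamond: two short arms feeding a shared join block. */
long sum_diamond(const int *v, size_t n, int limit)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        int x;
        if (v[i] > limit)
            x = limit;      /* then-arm */
        else
            x = v[i];       /* else-arm */
        sum += x;           /* join block */
    }
    return sum;
}

/* Join block duplicated into each arm. The arms are now larger,
   which makes turning them into a conditional select less attractive. */
long sum_split(const int *v, size_t n, int limit)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > limit)
            sum += limit;   /* then-arm plus copy of the join */
        else
            sum += v[i];    /* else-arm plus copy of the join */
    }
    return sum;
}

/* If-converted shape: the diamond collapses into a select
   (a cmov/csel-style instruction on targets that have one). */
long sum_select(const int *v, size_t n, int limit)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        int x = v[i] > limit ? limit : v[i];
        sum += x;
    }
    return sum;
}
```

All three compute the same result; the difference is only in control-flow shape, which is what the pass ordering discussion above is about.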

I did gather some data for aarch64 on SPEC2026 with -fsplit-paths and
-fno-split-paths.

For a 1-iteration run at -O3 its effect seems to be in the noise; I am
rerunning for more iterations to confirm. The static data is:
* Successful join block duplications: 61,708
* Declines due to spoiling ifcvt: 37,335
* Declines due to no CSE/DCE/jump-thread opportunity: 21,620
* Declines due to too many stmts: 1,955
* Overall increase in instructions: 0.401%
* Overall increase in .text: 0.342%
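
To put the counters above in proportion, a small sketch that turns them into shares of all candidate join blocks. It assumes the four counters are mutually exclusive and together cover every candidate the pass looked at, which is an assumption and not something verified against the pass. The struct and function names are hypothetical.

```c
/* Hypothetical helper; assumes the four counters partition all candidates. */
#include <stdio.h>

struct split_paths_stats {
    long duplicated;              /* join block was duplicated */
    long declined_ifcvt;          /* declined: would spoil ifcvt */
    long declined_no_opportunity; /* declined: no CSE/DCE/jump-thread gain */
    long declined_too_many_stmts; /* declined: join block too large */
};

/* Sum of all outcomes, i.e. candidates seen under the assumption above. */
long total_candidates(const struct split_paths_stats *s)
{
    return s->duplicated + s->declined_ifcvt
         + s->declined_no_opportunity + s->declined_too_many_stmts;
}

/* Share of one outcome as a percentage of the total. */
double share_percent(long count, long total)
{
    return total > 0 ? 100.0 * (double)count / (double)total : 0.0;
}

/* Print each outcome as a percentage of all candidates. */
void print_shares(const struct split_paths_stats *s)
{
    long total = total_candidates(s);
    printf("candidates:         %ld\n", total);
    printf("duplicated:         %.2f%%\n", share_percent(s->duplicated, total));
    printf("declined (ifcvt):   %.2f%%\n", share_percent(s->declined_ifcvt, total));
    printf("declined (no opp.): %.2f%%\n",
           share_percent(s->declined_no_opportunity, total));
    printf("declined (size):    %.2f%%\n",
           share_percent(s->declined_too_many_stmts, total));
}
```

Calling `print_shares` on `{61708, 37335, 21620, 1955}` gives 122,618 candidates, of which roughly half are duplicated and roughly 30% are declined by the ifcvt check.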

I'll try with LTO as well and get 3-iteration results too.
