> The most obvious places would be right after SSA construction and before RTL 
> expansion.
> Can you provide measurements for those positions?

The algorithm should only remove PHIs that break SSA form minimality. Since
GCC's SSA construction already produces minimal SSA form, the algorithm isn't
expected to remove any PHIs if run right after the construction. I even
measured it and indeed -- no PHIs got removed (except for 502.gcc_r, where the
algorithm managed to remove exactly 1 PHI, which is weird). 

I tried putting the pass before pass_expand. There isn't a lot of PHIs to
remove at that point, but there still are some.

500.perlbench_r
Started with 43111
Ended with 42942
Removed PHI % .39201131961680313700

502.gcc_r
Started with 141392
Ended with 140455
Removed PHI % .66269661649881181400

505.mcf_r
Started with 482
Ended with 478
Removed PHI % .82987551867219917100

523.xalancbmk_r
Started with 136040
Ended with 135629
Removed PHI % .30211702440458688700

531.deepsjeng_r
Started with 2150
Ended with 2148
Removed PHI % .09302325581395348900

541.leela_r
Started with 4664
Ended with 4650
Removed PHI % .30017152658662092700

557.xz_r
Started with 43
Ended with 43
Removed PHI % 0

> Can the pass somehow be used as part of propagations like during value 
> numbering?

I don't think that the pass could be used as a part of different optimizations
since it works on the whole CFG (except for copy propagation as I noted in the
RFC). I'm adding Honza into Cc. He'll have more insight into this.

> Could the new file be called gimple-ssa-sccp.cc or something similar?

Certainly. Though I'm not sure, but wouldn't tree-ssa-sccp.cc be more
appropriate?

I'm thinking about naming the pass 'scc-copy' and the file
'tree-ssa-scc-copy.cc'.

> Removing some PHIs is nice, but it would be also interesting to know what
> are the effects on generated code size and/or performance.
> And also if it has any effects on debug information coverage.

Regarding performance: I ran some benchmarks on a Zen3 machine with -O3 with
and without the new pass. *I got ~2% speedup for 505.mcf_r and 541.leela_r.
Here are the full results. What do you think? Should I run more benchmarks? Or
benchmark multiple times? Or run the benchmarks on different machines?*

500.perlbench_r    
Without SCCP: 244.151807s    
With SCCP: 242.448438s    
-0.7025695913124297%    
    
502.gcc_r    
Without SCCP: 211.029606s    
With SCCP: 211.614523s    
+0.27640683243653763%    
    
505.mcf_r    
Without SCCP: 298.782621s    
With SCCP: 291.671468s    
-2.438069465197046%    
    
523.xalancbmk_r    
Without SCCP: 189.940639s    
With SCCP: 189.876261s    
-0.03390523894928332%    
    
531.deepsjeng_r    
Without SCCP: 250.63648s    
With SCCP: 250.988624s    
+0.1403027732444051%    
    
541.leela_r    
Without SCCP: 346.066278s    
With SCCP: 339.692987s    
-1.8761915152519792%

Regarding size: The pass doesn't seem to significantly reduce or increase the
size of the result binary. The differences were at most ~0.1%.

Regarding debug info coverage: I didn't notice any additional guality testcases
failing after I applied the patch. *Is there any other way how I should check
debug info coverage?*


Filip K

Reply via email to