Hi,

This is an update on our previous discussion.

1. I implemented the following two approaches on top of the latest
upstream gcc:

A. Adding real initialization during gimplification, without maintaining the
uninitialized warnings.

D. Adding calls to .DEFFERED_INIT during gimplification, then expanding
.DEFFERED_INIT into real initialization during expand. The uninitialized
pass is adjusted to handle the new references to “.DEFFERED_INIT”.

Note, in this initial implementation:
        ** I ONLY implemented -ftrivial-auto-var-init=zero; the implementation
           of -ftrivial-auto-var-init=pattern is not done yet. Therefore, the
           performance data is only for -ftrivial-auto-var-init=zero.

        ** I added a temporary option, -fauto-var-init-approach=A|B|C|D, to
           choose implementation A or D for the runtime performance study.
        ** I haven't finished the uninitialized warnings maintenance work for
           D. (That might take more time than I expected.)
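
To make the difference between A and D concrete, here is a rough hand-written
sketch of what zero initialization of an auto variable amounts to at the
source level. The struct, the function name, and the memset are only
illustrations of mine; they are not the code that either implementation
actually generates.

```c
#include <assert.h>
#include <string.h>

struct point { int x; int y; };

/* The user wrote "struct point p;" with no initializer. The memset below
   stands in for the zero-initialization that the compiler would insert
   (hand-written here; not actual compiler output). */
int sum_zero_init(int set_y)
{
    struct point p;
    memset(&p, 0, sizeof p);    /* approach A: real init added up front */
    /* Approach D would place a placeholder call here instead and turn it
       into the same zeroing only at expand time. */

    assert(p.y == 0);           /* holds only because of the zeroing above */
    p.x = 1;
    if (set_y)
        p.y = 2;
    return p.x + p.y;           /* well defined even when set_y == 0 */
}
```

With the zeroing in place, sum_zero_init(0) reads p.y as 0 instead of reading
an indeterminate value. The point of D is that the variable still looks
uninitialized to the warning pass until expand.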

2. I collected runtime data for CPU2017 on an x86 machine with this new gcc for
the following 3 cases:

no: default (-g -O2 -march=native)
A:  default + -ftrivial-auto-var-init=zero -fauto-var-init-approach=A
D:  default + -ftrivial-auto-var-init=zero -fauto-var-init-approach=D

I then computed the slowdown of A and D relative to "no" as follows:

benchmarks          A / no    D / no

500.perlbench_r      1.25%     1.25%
502.gcc_r            0.68%     1.80%
505.mcf_r            0.68%     0.14%
520.omnetpp_r        4.83%     4.68%
523.xalancbmk_r      0.18%     1.96%
525.x264_r           1.55%     2.07%
531.deepsjeng_r     11.57%    11.85%
541.leela_r          0.64%     0.80%
557.xz_r            -0.41%    -0.41%

507.cactuBSSN_r      0.44%     0.44%
508.namd_r           0.34%     0.34%
510.parest_r         0.17%     0.25%
511.povray_r        56.57%    57.27%
519.lbm_r            0.00%     0.00%
521.wrf_r           -0.28%    -0.37%
526.blender_r       16.96%    17.71%
527.cam4_r           0.70%     0.53%
538.imagick_r        2.40%     2.40%
544.nab_r            0.00%    -0.65%

avg                  5.17%     5.37%
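
For reference, a small sketch of the arithmetic behind the table. It assumes
that the slowdown is (runtime with init / runtime of "no" - 1) in percent and
that "avg" is the plain arithmetic mean over the 19 benchmarks. Both are
assumptions about how the numbers were produced, and the function and array
names are hypothetical. The percentages are copied from the table.

```c
#include <assert.h>
#include <stddef.h>

/* Slowdown of a run against the baseline, in percent (assumed definition). */
double slowdown_pct(double t_new, double t_base)
{
    assert(t_base > 0.0);
    return (t_new / t_base - 1.0) * 100.0;
}

/* Arithmetic mean of n values. */
double mean(const double *v, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Per-benchmark slowdowns in table order (9 int, then 10 fp benchmarks). */
static const double slowdown_a[19] = {
    1.25, 0.68, 0.68, 4.83, 0.18, 1.55, 11.57, 0.64, -0.41,
    0.44, 0.34, 0.17, 56.57, 0.00, -0.28, 16.96, 0.70, 2.40, 0.00
};
static const double slowdown_d[19] = {
    1.25, 1.80, 0.14, 4.68, 1.96, 2.07, 11.85, 0.80, -0.41,
    0.44, 0.34, 0.25, 57.27, 0.00, -0.37, 17.71, 0.53, 2.40, -0.65
};
```

Under these assumptions, mean(slowdown_a, 19) and mean(slowdown_d, 19) round
to the 5.17% and 5.37% shown in the "avg" row.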

From the above data, we can see that, in general, the runtime slowdown of
implementations A and D is similar for individual benchmarks.

Several benchmarks show a significant slowdown from the newly added
initialization with both A and D, for example 511.povray_r, 526.blender_r,
and 531.deepsjeng_r. I will study these a little more to find out what kind
of new initializations introduced the slowdown.

From the study so far, I think that approach D should be good enough
for our final implementation.
So, I will try to finish approach D with the following remaining work:

      ** complete the implementation of -ftrivial-auto-var-init=pattern;
      ** complete the uninitialized warnings maintenance work for D.


Let me know if you have any comments or suggestions on my current and future
work.

Thanks a lot for your help.

Qing

> On Dec 9, 2020, at 10:18 AM, Qing Zhao via Gcc-patches 
> <gcc-patches@gcc.gnu.org> wrote:
> 
> The following are the approaches I will implement and compare:
> 
> Our final goal is to keep the uninitialized warning and minimize the run-time 
> performance cost.
> 
> A. Adding real initialization during gimplification, not maintain the 
> uninitialized warnings.
> B. Adding real initialization during gimplification, marking them with 
> “artificial_init”. 
>     Adjusting uninitialized pass, maintaining the annotation, making sure the 
> real init not
>     Deleted from the fake init. 
> C.  Marking the DECL for an uninitialized auto variable as “no_explicit_init” 
> during gimplification,
>      maintain this “no_explicit_init” bit till after 
> pass_late_warn_uninitialized, or till pass_expand, 
>      add real initialization for all DECLs that are marked with 
> “no_explicit_init”.
> D. Adding .DEFFERED_INIT during gimplification, expand the .DEFFERED_INIT 
> during expand to
>     real initialization. Adjusting uninitialized pass with the new refs with 
> “.DEFFERED_INIT”.
> 
> 
> In the above, approach A will be the one that have the minimum run-time cost, 
> will be the base for the performance
> comparison. 
> 
> I will implement approach D then, this one is expected to have the most 
> run-time overhead among the above list, but
> Implementation should be the cleanest among B, C, D. Let’s see how much more 
> performance overhead this approach
> will be. If the data is good, maybe we can avoid the effort to implement B, 
> and C. 
> 
> If the performance of D is not good, I will implement B or C at that time.
> 
> Let me know if you have any comment or suggestions.
> 
> Thanks.
> 
> Qing
