Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

Qing Zhao via Gcc-patches Wed, 13 Jan 2021 07:36:26 -0800

> On Jan 13, 2021, at 9:10 AM, Richard Biener <rguent...@suse.de> wrote:
> 
> On Wed, 13 Jan 2021, Qing Zhao wrote:
> 
>> 
>> 
>>> On Jan 13, 2021, at 1:39 AM, Richard Biener <rguent...@suse.de> wrote:
>>> 
>>> On Tue, 12 Jan 2021, Qing Zhao wrote:
>>> 
>>>> Hi, 
>>>> 
>>>> Just checking in to see whether you have any comments or suggestions on this:
>>>> 
>>>> FYI, I have continued with the Approach D implementation since last week:
>>>> 
>>>> D. Adding calls to .DEFERRED_INIT during gimplification, expanding the 
>>>> .DEFERRED_INIT calls to real initialization during expand, and adjusting 
>>>> the uninitialized pass to handle the new references to 
>>>> “.DEFERRED_INIT”.
>>>> 
>>>> For the remaining work of Approach D:
>>>> 
>>>> ** complete the implementation of -ftrivial-auto-var-init=pattern;
>>>> ** complete the implementation of uninitialized warnings maintenance work 
>>>> for D. 
>>>> 
>>>> I have completed the uninitialized warnings maintenance work for D,
>>>> and finished part of the -ftrivial-auto-var-init=pattern 
>>>> implementation. 
>>>> 
>>>> The following work remains for Approach D:
>>>> 
>>>>  ** -ftrivial-auto-var-init=pattern for VLAs;
>>>>  ** add a new variable attribute:
>>>> __attribute__((uninitialized))
>>>> A variable marked this way is intentionally left uninitialized for performance reasons.
>>>>  ** add complete test cases;
>>>> 
>>>> 
>>>> Please let me know if you have any objection to my current decision to 
>>>> implement approach D. 
>>> 
>>> Did you do any analysis on how stack usage and code size are changed 
>>> with approach D?
>> 
>> I did the code size change comparison (I will provide the data in another 
>> email). According to this data, D works better than A in general (this 
>> actually surprised me).
>> 
>> But I have not measured stack usage. I am not sure how to collect the 
>> stack usage data; do you have any suggestions?
> 
> There is -fstack-usage you could use, then of course watching
> the stack segment at runtime.

I can do this for CPU2017 to collect the stack usage data and report back.

>  I'm mostly concerned about
> stack-limited "processes" such as the linux kernel which I think
> is a primary target of your work.

I don’t have any experience building the Linux kernel. 
Do we have to collect data for the Linux kernel at this time? Is the CPU2017 
data not enough?

Qing
> 
> Richard.
> 
>> 
>>> How does compile-time behave (we could gobble up
>>> lots of .DEFERRED_INIT calls I guess)?
>> I can collect this data too and report it later.
>> 
>> Thanks.
>> 
>> Qing
>>> 
>>> Richard.
>>> 
>>>> Thanks a lot for your help.
>>>> 
>>>> Qing
>>>> 
>>>> 
>>>>> On Jan 5, 2021, at 1:05 PM, Qing Zhao via Gcc-patches 
>>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> This is an update for our previous discussion. 
>>>>> 
>>>>> 1. I implemented the following two different implementations in the 
>>>>> latest upstream gcc:
>>>>> 
>>>>> A. Adding real initialization during gimplification, without maintaining 
>>>>> the uninitialized warnings.
>>>>> 
>>>>> D. Adding calls to .DEFERRED_INIT during gimplification, expanding the 
>>>>> .DEFERRED_INIT calls to real initialization during expand, and adjusting 
>>>>> the uninitialized pass to handle the new references to 
>>>>> “.DEFERRED_INIT”.
>>>>> 
>>>>> Note, in this initial implementation,
>>>>>   ** I ONLY implemented -ftrivial-auto-var-init=zero; the implementation 
>>>>>      of -ftrivial-auto-var-init=pattern is not done yet. Therefore, the 
>>>>>      performance data is only about -ftrivial-auto-var-init=zero. 
>>>>> 
>>>>>   ** I added a temporary option -fauto-var-init-approach=A|B|C|D to 
>>>>>      choose implementation A or D for the runtime performance study.
>>>>>   ** I didn’t finish the uninitialized warnings maintenance work for D. 
>>>>>      (That might take more time than I expected.) 
>>>>> 
>>>>> 2. I collected runtime data for CPU2017 on an x86 machine with this new 
>>>>> gcc for the following 3 cases:
>>>>> 
>>>>> no: default. (-g -O2 -march=native )
>>>>> A:  default +  -ftrivial-auto-var-init=zero -fauto-var-init-approach=A 
>>>>> D:  default +  -ftrivial-auto-var-init=zero -fauto-var-init-approach=D 
>>>>> 
>>>>> I then computed the slowdown for both A and D as follows:
>>>>> 
>>>>> benchmarks          A / no   D / no
>>>>> 
>>>>> 500.perlbench_r      1.25%    1.25%
>>>>> 502.gcc_r            0.68%    1.80%
>>>>> 505.mcf_r            0.68%    0.14%
>>>>> 520.omnetpp_r        4.83%    4.68%
>>>>> 523.xalancbmk_r      0.18%    1.96%
>>>>> 525.x264_r           1.55%    2.07%
>>>>> 531.deepsjeng_r     11.57%   11.85%
>>>>> 541.leela_r          0.64%    0.80%
>>>>> 557.xz_r            -0.41%   -0.41%
>>>>> 
>>>>> 507.cactuBSSN_r      0.44%    0.44%
>>>>> 508.namd_r           0.34%    0.34%
>>>>> 510.parest_r         0.17%    0.25%
>>>>> 511.povray_r        56.57%   57.27%
>>>>> 519.lbm_r            0.00%    0.00%
>>>>> 521.wrf_r           -0.28%   -0.37%
>>>>> 526.blender_r       16.96%   17.71%
>>>>> 527.cam4_r           0.70%    0.53%
>>>>> 538.imagick_r        2.40%    2.40%
>>>>> 544.nab_r            0.00%   -0.65%
>>>>> 
>>>>> avg                  5.17%    5.37%
>>>>> 
>>>>> From the above data, we can see that, in general, the runtime slowdowns 
>>>>> of implementations A and D are similar for individual benchmarks.
>>>>> 
>>>>> Several benchmarks show a significant slowdown with the newly added 
>>>>> initialization for both A and D, for example 511.povray_r, 
>>>>> 526.blender_r, and 531.deepsjeng_r. I will study further which kinds of 
>>>>> new initialization introduce such slowdowns. 
>>>>> 
>>>>> From the study so far, I think that approach D should be good enough 
>>>>> for our final implementation. 
>>>>> So, I will try to finish approach D with the following remaining work:
>>>>> 
>>>>>    ** complete the implementation of -ftrivial-auto-var-init=pattern;
>>>>>    ** complete the implementation of uninitialized warnings maintenance 
>>>>> work for D. 
>>>>> 
>>>>> 
>>>>> Let me know if you have any comments and suggestions on my current and 
>>>>> future work.
>>>>> 
>>>>> Thanks a lot for your help.
>>>>> 
>>>>> Qing
>>>>> 
>>>>>> On Dec 9, 2020, at 10:18 AM, Qing Zhao via Gcc-patches 
>>>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>>> 
>>>>>> The following are the approaches I will implement and compare:
>>>>>> 
>>>>>> Our final goal is to keep the uninitialized warning and minimize the 
>>>>>> run-time performance cost.
>>>>>> 
>>>>>> A. Adding real initialization during gimplification, without 
>>>>>> maintaining the uninitialized warnings.
>>>>>> B. Adding real initialization during gimplification, marking them with 
>>>>>> “artificial_init”. 
>>>>>>  Adjusting the uninitialized pass, maintaining the annotation, and 
>>>>>>  making sure the real init is not deleted from the fake init. 
>>>>>> C. Marking the DECL for an uninitialized auto variable as 
>>>>>> “no_explicit_init” during gimplification,
>>>>>>   maintaining this “no_explicit_init” bit until after 
>>>>>> pass_late_warn_uninitialized, or until pass_expand, 
>>>>>>   then adding real initialization for all DECLs that are marked with 
>>>>>> “no_explicit_init”.
>>>>>> D. Adding .DEFERRED_INIT during gimplification, expanding the 
>>>>>> .DEFERRED_INIT to real initialization during expand, and adjusting 
>>>>>>  the uninitialized pass to handle the new references to 
>>>>>> “.DEFERRED_INIT”.
>>>>>> 
>>>>>> 
>>>>>> Of the above, approach A will be the one with the minimum run-time 
>>>>>> cost, and will be the baseline for the performance comparison. 
>>>>>> 
>>>>>> I will then implement approach D. It is expected to have the most 
>>>>>> run-time overhead among the approaches listed above, but its 
>>>>>> implementation should be the cleanest among B, C, and D. Let’s see how 
>>>>>> much more performance overhead this approach incurs. If the data is 
>>>>>> good, maybe we can avoid the effort to implement B and C. 
>>>>>> 
>>>>>> If the performance of D is not good, I will implement B or C at that 
>>>>>> time.
>>>>>> 
>>>>>> Let me know if you have any comments or suggestions.
>>>>>> 
>>>>>> Thanks.
>>>>>> 
>>>>>> Qing
>>>>> 
>>>> 
>>>> 
>>> 
>>> -- 
>>> Richard Biener <rguent...@suse.de>
>>> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
>>> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
>> 
>> 
> 
> -- 
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)