PING^3

On 10/23/18 11:02 AM, Martin Liška wrote:
> PING^2
> 
> On 10/9/18 10:29 AM, Martin Liška wrote:
>> PING^1
>>
>> On 9/26/18 11:33 AM, Martin Liška wrote:
>>> On 9/25/18 5:53 PM, Jakub Jelinek wrote:
>>>> On Tue, Sep 25, 2018 at 05:26:44PM +0200, Martin Liška wrote:
>>>>> The only missing piece is how to implement asan_emit_redzone_payload more 
>>>>> smart.
>>>>> It means doing memory stores with 8,4,2,1 sizes in order to reduce # of 
>>>>> insns.
>>>>> Do we have somewhere a similar code?
>>>>
>>>> Yeah, that is a very important optimization.  I wasn't using DImode because
>>>> at least on x86_64 64-bit constants are quite expensive and on several 
>>>> other
>>>> targets even more so, so SImode was a compromise to get size of the 
>>>> prologue
>>>> under control and not very slow.  What I think we want is figure out ranges
>>>
>>> Ah, some time ago, I remember you mentioned the 64-bit constants are 
>>> expensive
>>> (even on x86_64). Btw. it's what clang used for the red zone 
>>> instrumentation.
>>>
>>>> of shadow bytes we want to initialize and the values we want to store 
>>>> there,
>>>> perhaps take also into account strict alignment vs. non-strict alignment,
>>>> and perform kind of store merging for it.  Given that 2 shadow bytes would
>>>> be only used for the very small variables (<=4 bytes in size, so <= 0.5
>>>> bytes of shadow), we'd just need a way to remember the 2 shadow bytes 
>>>> across
>>>> handling adjacent vars and store it together.
>>>
>>> Agree, it's implemented in next version of patch.
>>>
>>>>
>>>> I think we want to introduce some define for minimum red zone size and use
>>>> it instead of the granularity (granularity is 8 bytes, but minimum red zone
>>>> size if we count into it also the very small variable size is 16 bytes).
>>>>
>>>>> --- a/gcc/asan.h
>>>>> +++ b/gcc/asan.h
>>>>> @@ -102,6 +102,26 @@ asan_red_zone_size (unsigned int size)
>>>>>    return c ? 2 * ASAN_RED_ZONE_SIZE - c : ASAN_RED_ZONE_SIZE;
>>>>>  }
>>>>>  
>>>>> +/* Return how much a stack variable occupy on a stack
>>>>> +   including a space for redzone.  */
>>>>> +
>>>>> +static inline unsigned int
>>>>> +asan_var_and_redzone_size (unsigned int size)
>>>>
>>>> The argument needs to be UHWI, otherwise you do a wrong thing for
>>>> say 4GB + 4 bytes long variable.  Ditto the result.
>>>>
>>>>> +{
>>>>> +  if (size <= 4)
>>>>> +    return 16;
>>>>> +  else if (size <= 16)
>>>>> +    return 32;
>>>>> +  else if (size <= 128)
>>>>> +    return 32 + size;
>>>>> +  else if (size <= 512)
>>>>> +    return 64 + size;
>>>>> +  else if (size <= 4096)
>>>>> +    return 128 + size;
>>>>> +  else
>>>>> +    return 256 + size;
>>>>
>>>> I'd prefer size + const instead of const + size operand order.
>>>>
>>>>> @@ -1125,13 +1125,13 @@ expand_stack_vars (bool (*pred) (size_t), struct 
>>>>> stack_vars_data *data)
>>>>>         && stack_vars[i].size.is_constant ())
>>>>>       {
>>>>>         prev_offset = align_base (prev_offset,
>>>>> -                                 MAX (alignb, ASAN_RED_ZONE_SIZE),
>>>>> +                                 MAX (alignb, ASAN_SHADOW_GRANULARITY),
>>>>
>>>> Use that ASAN_MIN_RED_ZONE_SIZE (16) here.
>>>>
>>>>>                                   !FRAME_GROWS_DOWNWARD);
>>>>>         tree repr_decl = NULL_TREE;
>>>>> +       poly_uint64 size =  asan_var_and_redzone_size 
>>>>> (stack_vars[i].size.to_constant ());
>>>>
>>>> Too long line.  Two spaces instead of one.  Why poly_uint64?
>>>> Plus, perhaps if data->asan_vec is empty (i.e. when assigning the topmost
>>>> automatic variable in a frame), we should ensure that size is at least
>>>> 2 * ASAN_RED_ZONE_SIZE (or just 1 * ASAN_RED_ZONE_SIZE). 
>>>>
>>>>>         offset
>>>>> -         = alloc_stack_frame_space (stack_vars[i].size
>>>>> -                                    + ASAN_RED_ZONE_SIZE,
>>>>> -                                    MAX (alignb, ASAN_RED_ZONE_SIZE));
>>>>> +         = alloc_stack_frame_space (size,
>>>>> +                                    MAX (alignb, 
>>>>> ASAN_SHADOW_GRANULARITY));
>>>>
>>>> Again, too long line and we want 16 instead of 8 here too.
>>>>>  
>>>>>         data->asan_vec.safe_push (prev_offset);
>>>>>         /* Allocating a constant amount of space from a constant
>>>>> @@ -2254,7 +2254,7 @@ expand_used_vars (void)
>>>>>                    & ~(data.asan_alignb - HOST_WIDE_INT_1)) - sz;
>>>>>     /* Allocating a constant amount of space from a constant
>>>>>        starting offset must give a constant result.  */
>>>>> -   offset = (alloc_stack_frame_space (redzonesz, ASAN_RED_ZONE_SIZE)
>>>>> +   offset = (alloc_stack_frame_space (redzonesz, ASAN_SHADOW_GRANULARITY)
>>>>
>>>> and here too.
>>>>
>>>>    Jakub
>>>>
>>>
>>> The rest is also implemented as requested. I'm testing Linux kernel now, 
>>> will send
>>> stats to the PR created for it.
>>>
>>> Patch survives testing on x86_64-linux-gnu.
>>>
>>> Martin
>>>
>>
> 

Reply via email to