> Am 26.10.2023 um 16:58 schrieb Qing Zhao <qing.z...@oracle.com>:
> 
> 
> 
>> On Oct 26, 2023, at 4:56 AM, Richard Biener <richard.guent...@gmail.com> 
>> wrote:
>> 
>>> On Thu, Oct 26, 2023 at 7:22 AM Jakub Jelinek <ja...@redhat.com> wrote:
>>> 
>>> On Wed, Oct 25, 2023 at 07:03:43PM +0000, Qing Zhao wrote:
>>>> For the code generation impact:
>>>> 
>>>> turning the original  x.buf
>>>> to a builtin function call
>>>> __builtin_with_access_and_size(x,buf, x.L,-1)
>>>> 
>>>> might inhibit some optimizations from happening before the builtin is
>>>> evaluated into object size info (phase  .objsz1).  I guess there might be
>>>> some performance impact.
>>>> 
>>>> However, if we mark this builtin as PURE, NOTRROW, etc, then the negative
>>>> performance impact will be reduced to minimum?
>>> 
>>> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
>>> be able to figure out the dynamic sizes in case of normal (non-early)
>>> inlining - caller takes address of a counted_by array, passes it down
>>> to callee which is only inlined late and uses __bdos, or callee takes 
>>> address
>>> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.
>>> 
>>> And while the builtin (or if it is an internal detail rather than user
>>> accessible builtin an internal function) could be even const/nothrow/leaf if
>>> the arguments contain the loads from the structure 2 fields, I'm afraid it
>>> will still have huge code generation impact, prevent tons of pre-IPA
>>> optimizations.  And it will need some work to handle it properly during
>>> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
>>> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
>>> but also the count load from memory.
>> 
>> I think we want to track the value, not the "memory" in the builtin call,
>> so GIMPLE would be
>> 
>> _1 = x.L;
>> .. = __builtin_with_access_and_size (&x.buf, _1, -1);
> 
> Before adding the __builtin_with_access_and_size, the code is:
> 
> &x.buf
> 
> After inserting the built-in, it becomes:
> 
> _1 = x.L;
> __builtin_with_access_and_size (&x.buf, _1, -1).
> 
> 
> So, the # of total instructions, the # of LOADs, and the # of calls will all 
> be increased.
> There will be impact to the inlining decision definitely.

Note we have to make sure, if x is a pointer and we want to instrument &x->buf 
that we
Can dereference x.  Possibly doing

_1 = x ? x->Len : -1;

I’m not sure the C standard makes accessing x->Len unconditionally not 
undefined behavior when &x->buf is computed.  Definitely it’s a violation of 
the abstract machine of Len is volatile qualified (but we can reject such 
counted_by or instantiations as volatile qualified types).

Richard 

> 
>> 
>> also please make sure to use an internal function for
>> __builtin_with_access_and_size,
>> I don't think we want to expose this to users - it's an implementation 
>> detail.
> 
> Okay, will define it as an internal function (add it to internal-fn.def). -:)
> 
> Qing
>> 
>> Richard.
>> 
>>> 
>>>       Jakub
>>> 
> 

Reply via email to