> On Oct 31, 2023, at 1:35 PM, Siddhesh Poyarekar <siddh...@gotplt.org> wrote:
> 
> On 2023-10-31 12:26, Qing Zhao wrote:
>> Hi,
>> I wrote a summary based on our extensive discussion, hopefully this can be 
>> served as an informal proposal.
>> Please take a look at it and let me know any comment or suggestion.
>> There are some (???) in the section 3.2 and 3.6, those are my questions 
>> seeking for help.  -:)
>> Thanks again for all the help.
>> Qing.
>> ========================================================
>> Represent the missing dependence for the "counted_by" attribute and its 
>> consumers
>> Qing Zhao
>> 10/30/2023
>> ==============================================
>> The whole discussion is at:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633783.html
>> 1. The problem
>> There is a data dependency between the size assignment and the implicit use 
>> of the size information in the __builtin_dynamic_object_size that is missing 
>> in the IL (line 11 and line 13 in the below example). Such information 
>> missing will result incorrect code reordering and other code transformations.
>>   1 struct A
>>   2 {
>>   3  size_t size;
>>   4  char buf[] __attribute__((counted_by(size)));
>>   5 };
>>   6
>>   7 size_t
>>   8 foo (size_t sz)
>>   9 {
>>  10  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>  11  obj->size = sz;
>>  12  obj->buf[0] = 2;
>>  13  return __builtin_dynamic_object_size (obj->buf, 1);
>>  14 }
>>   Please see a more complicate example in the Appendex 1.
>> We need to represent such data dependency correctly in the IL.
>> 2. The solution:
>> 2.1 Summary
>> * Add a new internal function "ACCESS_WITH_SIZE" to carry the size 
>> information for every FAM field access;
>> * In C FE, Replace every FAM field access whose TYPE has the "counted_by" 
>> attribute with the new internal function "ACCESS_WITH_SIZE";
>> * In every consumer of the size information, for example, BDOS or array 
>> bound sanitizer, query the size information or ACCESS_MODE information from 
>> the new internal function;
>> * When the size information and the "ACCESS_MODE" information are not used 
>> anymore, possibly at the 2nd object size phase, replace the internal 
>> function with the actual FAM field access;
>> * Some adjustment to inlining heuristic and some SSA passes to mitigate the 
>> impact to the optimizer and code generation.
>> 2.2 The new internal function
>>   .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)
>> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>> which returns the "PTR" same as the 1st argument;
>> 1st argument "PTR": Pointer to the object;
>> 2nd argument "SIZE": The size of the pointed object,
>>   if the pointee of the "PTR" has a
>>     * real type, it's the number of the elements of the type;
>>     * void type, it's the number of bytes;
>> 3rd argument "ACCESS_MODE":
>>   -1: Unknown access semantics
>>    0: none
>>    1: read_only
>>    2: write_only
>>    3: read_write
>> NOTEs,
>>   A. This new internal function is intended for a more general use from all 
>> the 3 attributes, "access", "alloc_size", and the new "counted_by", to 
>> encode the "size" and "access_mode" information to the corresponding 
>> pointer. (in order to resolve PR96503, etc. 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503)
>>   B. For "counted_by" and "alloc_size" attributes, the 3rd argument will be 
>> -1.
>>   C. In this wrieup, we focus on the implementation details for the 
>> "counted_by" attribute. However, this function should be ready to be used by 
>> "access" and "alloc_size" without issue.
>> 2.3 A new semantic requirement in the user documentation of "counted_by"
>> For the following structure including a FAM with a counted_by attribute:
>>   struct A
>>   {
>>    size_t size;
>>    char buf[] __attribute__((counted_by(size)));
>>   };
>> for any object with such type:
>>   struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>> The setting to the size field should be done before the first reference to 
>> the FAM field.
> 
> A more flexible specification could be stating that validation for a 
> reference to the FAM field will use the latest value assigned to the size 
> field before that reference.  That will allow for situations like:
> 
>  o->size = val1;
>  deref (o->buf);
>  o->size = val2;
> 
> making it clear that deref will see val1 and not val2.

Good point! Yes, with the current design, this is reasonable. 
Will update the proposal with this.
> 
>> Such requirement to the user will guarantee that the first reference to the 
>> FAM knows the size of the FAM.
>> We need to add this additional requirement to the user document.
>> 2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE
>> In C FE:
>> for every reference to a FAM, for example, "obj->buf" in the small example,
>>   check whether the corresponding FIELD_DECL has a "counted_by" attribute?
>>   if YES, replace the reference to "obj->buf" with a call to
>>       .ACCESS_WITH_SIZE (obj->buf, obj->size, -1);
>> 2.5 Query the size info
>> There are multiple consumers of the size info (and ACCESS_MODE info):
>>   * __builtin_dynamic_object_size;
>>   * array bound sanitizer;
>> in these consumers, get the size info from the 2nd argument of the call to
>> ACCESS_WITH_SIZE (PTR, SIZE, -1)
>> 2.6 Eliminate the internal function when not useful anymore
>> After the last consumer of the size information in the ACCESS_WITH_SIZE, We 
>> should replace the internal call with its first argument.
>> Do it in the 2nd object size phase.
>> 2.7 Adjustment to inlining heuristic and other IPA analysis
>> the FE changes:
>> obj->buf
>> to
>> _1 = obj->buf;
>> _2 = obj->size;
>> .ACCESS_WITH_SIZE (_1, _2, -1)
>> there are major two changes:
>>   A. the # of LOADs, the # of Insns of the routine will be increased,
>>   B. additional calls to internal routines.
>> As a result:
>>   Inlining decision and some IPA analysis might be changed due to these 
>> changes.
>> We need to adjust inlining heurisic and IPA analysis properly.
>> 3. The patch sets:
>> We will provide the following patches:
>>   3.1 Provide counted_by attribute to flexible array member field;
>>       which includes:
>>       * "counted_by" attribute documentation;
>>       * C FE handling of the new attribute;
>>         syntax checking, error reporting;
>>       * testing cases;
>>   3.2 Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.
>>       which includes:
>>       * The definition of the new internal function .ACCESS_WITH_SIZE in 
>> internal-fn.def.
>>       * C FE converts every access to a FAM with "counted_by" attribute to a 
>> call to the internal function .ACCESS_WITH_SIZE. (where in the FE, which 
>> routine should look at???)
>>       * Convert every call to .ACCESS_WITH_SIZE to its first argument. where 
>> should we do this? in the end of the 2nd object size phase or in the expand 
>> phase? (I prefer to do this in the end of the 2nd object size phase, 
>> comments???)
> 
> I'd say be like __bos here, i.e. replace instances in the objsz2 pass and 
> also as a fallback, implement expand_builtin_access_with_size (like 
> expand_builtin_object_size) to replace all remaining instances of the builtin 
> with the first arg.

Okay. This is reasonable too.
> 
>>       * adjust inlining heuristic for the additional call and loads (???).
>>       * Other phases need to be adjusted (???).
> 
> Do we need this?  I thought expressing the dependency with the builtin ought 
> to be sufficient.

Yes, I think we need this. Not for correctness, but for mitigating the 
performance impact due to the possible inlining decision change or other IPA 
analysis impacted by the new function calls.
Please see Jakub’s comment at:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634343.html

> 
>>       * verify a call to .ACCESS_WITH_SIZE in tree-cfg.cc.
>>       * provide utility routines to get the size or ACCESS_MODE from the 
>> call to .ACCESS_WITH_SIZE.
>>   3.3 Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only)
>>       which includes:
>>       * use the size info of the .ACCESS_WITH_SIZE for sub-object.
>>       * testing cases.
>>   3.4 Use the .ACCESS_WITH_SIZE in bound sanitizer
>>       which includes:
>>       * use the size info of the .ACCESS_WITH_SIZE for bound sanitizer.
>>       * testing cases.
>>   3.5 Improve __bdos to use the counted_by info in whole-object size for the 
>> structure with FAM.
>>   3.6 Support for dynamic array of FMA (???)
>>       https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634568.html
> 
> I'm not entirely convinced that it's an issue.  __bdos will just look at 
> __counted_by__ and use its most recent value (using the above user 
> specification) and with that it's just a matter of application preference.  
> If they want dynamic array behaviour, they could just annotate the array 
> count member with __counted_by__ instead of the array capacity member.

This makes sense.  Yes, will update the proposal with this.

Thanks.

Qing
> 
> Thanks,
> Sid
> 
> 
>>   3.7 Emit warnings when the user breaks the requirments for the new 
>> counted_by attribute
>>       compilation time: -Wcounted-by
>>       run time: -fsanitizer=counted-by
>>              * The setting to the size field should be done before the first 
>> reference to the FAM field.
>>       * the array has at least # of elements specified by the size field all 
>> the time during the program.
>>  4. Future work
>>      4.1 Use the new .ACCESS_WITH_SIZE for other relative attributes, for 
>> example, "access" and "alloc_size", to resolve the known issues for them:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503
>>   4.2 Add "counted_by" attribute to general pointers as proposed by the 
>> -fbounds-safety proposal to LLVM from Apple 
>> (https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854).
>> the .ACCESS_WITH_SIZE approach should be adopted naturally.
>> Appendex 1. A more complicated example:
>>   1 #include <malloc.h>
>>   2 struct A
>>   3 {
>>   4  size_t size;
>>   5  char buf[] __attribute__((counted_by(size)));
>>   6 };
>>   7
>>   8 static size_t
>>   9 get_size_from (void *ptr)
>>  10 {
>>  11  return __builtin_dynamic_object_size (ptr, 1);
>>  12 }
>>  13
>>  14 void
>>  15 foo (size_t sz)
>>  16 {
>>  17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>  18  obj->size = sz;
>>  19  obj->buf[0] = 2;
>>  20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>>  21  return;
>>  22 }
>>  23
>>  24 int main ()
>>  25 {
>>  26  foo (20);
>>  27  return 0;
>>  28 }
>> In this example, the instantiation of the object at line 17 is not in the 
>> same routine as the _BDOS call at line 11.
>> Appendex 2. Other approaches that have been discussed:
>> A.  Add an additional argument, the size parameter, to the call to __bdos,
>>       A.1, during FE;
>>       A.2, during gimplification phase;
>> B.  Encode the implicit USE in the type of size, to make the size as 
>> “volatile”;
>> C.  Encode the implicit USE in the type of buf, then update the optimization 
>> passes to use this implicit USE encoded in the type of buf.
>> ** Approach A (both A.1 and A.2) will not work if the object instantiation 
>> is not in the same routine as the __bdos call as shown in the above example;
>> ** Approach B will have a bigger performance impact due to the "volatile" 
>> will inhibit a lot of optimizations;
>> ** Approach C will involve a lot of changes in GCC, and also not very 
>> necessary since the ONLY implicit use of the size parameter is in the __bdos 
>> call when __bdos can see the real object.

Reply via email to