Hi, 

I wrote a summary based on our extensive discussion, hopefully this can be 
served as an informal proposal. 

Please take a look at it and let me know any comment or suggestion.

There are some (???) in the section 3.2 and 3.6, those are my questions seeking 
for help.  -:)

Thanks again for all the help.

Qing.

========================================================
Represent the missing dependence for the "counted_by" attribute and its 
consumers 

Qing Zhao

10/30/2023
==============================================

The whole discussion is at:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633783.html

1. The problem

There is a data dependency between the size assignment and the implicit use of 
the size information in the __builtin_dynamic_object_size that is missing in 
the IL (line 11 and line 13 in the below example). Such information missing 
will result incorrect code reordering and other code transformations. 

  1 struct A
  2 {
  3  size_t size;
  4  char buf[] __attribute__((counted_by(size)));
  5 };
  6 
  7 size_t 
  8 foo (size_t sz)
  9 {
 10  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
 11  obj->size = sz;
 12  obj->buf[0] = 2;
 13  return __builtin_dynamic_object_size (obj->buf, 1);
 14 }
  
Please see a more complicate example in the Appendex 1.

We need to represent such data dependency correctly in the IL. 

2. The solution:

2.1 Summary

* Add a new internal function "ACCESS_WITH_SIZE" to carry the size information 
for every FAM field access;
* In C FE, Replace every FAM field access whose TYPE has the "counted_by" 
attribute with the new internal function "ACCESS_WITH_SIZE";
* In every consumer of the size information, for example, BDOS or array bound 
sanitizer, query the size information or ACCESS_MODE information from the new 
internal function;
* When the size information and the "ACCESS_MODE" information are not used 
anymore, possibly at the 2nd object size phase, replace the internal function 
with the actual FAM field access; 
* Some adjustment to inlining heuristic and some SSA passes to mitigate the 
impact to the optimizer and code generation. 

2.2 The new internal function 

  .ACCESS_WITH_SIZE (PTR, SIZE, ACCESS_MODE)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)

which returns the "PTR" same as the 1st argument;

1st argument "PTR": Pointer to the object;
2nd argument "SIZE": The size of the pointed object, 
  if the pointee of the "PTR" has a
    * real type, it's the number of the elements of the type;
    * void type, it's the number of bytes; 
3rd argument "ACCESS_MODE": 
  -1: Unknown access semantics
   0: none
   1: read_only
   2: write_only
   3: read_write

NOTEs, 
  A. This new internal function is intended for a more general use from all the 
3 attributes, "access", "alloc_size", and the new "counted_by", to encode the 
"size" and "access_mode" information to the corresponding pointer. (in order to 
resolve PR96503, etc. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503)
  B. For "counted_by" and "alloc_size" attributes, the 3rd argument will be -1. 
  
  C. In this wrieup, we focus on the implementation details for the 
"counted_by" attribute. However, this function should be ready to be used by 
"access" and "alloc_size" without issue. 

2.3 A new semantic requirement in the user documentation of "counted_by"

For the following structure including a FAM with a counted_by attribute:

  struct A
  {
   size_t size;
   char buf[] __attribute__((counted_by(size)));
  };

for any object with such type:

  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));

The setting to the size field should be done before the first reference to the 
FAM field.

Such requirement to the user will guarantee that the first reference to the FAM 
knows the size of the FAM.  

We need to add this additional requirement to the user document.

2.4 Replace FAM field accesses with the new function ACCESS_WITH_SIZE

In C FE:

for every reference to a FAM, for example, "obj->buf" in the small example,
  check whether the corresponding FIELD_DECL has a "counted_by" attribute?
  if YES, replace the reference to "obj->buf" with a call to
      .ACCESS_WITH_SIZE (obj->buf, obj->size, -1); 

2.5 Query the size info 

There are multiple consumers of the size info (and ACCESS_MODE info):

  * __builtin_dynamic_object_size;
  * array bound sanitizer;

in these consumers, get the size info from the 2nd argument of the call to
ACCESS_WITH_SIZE (PTR, SIZE, -1)

2.6 Eliminate the internal function when not useful anymore

After the last consumer of the size information in the ACCESS_WITH_SIZE, We 
should replace the internal call with its first argument.

Do it in the 2nd object size phase. 

2.7 Adjustment to inlining heuristic and other IPA analysis

the FE changes:

obj->buf

to

_1 = obj->buf;
_2 = obj->size;
.ACCESS_WITH_SIZE (_1, _2, -1)

there are major two changes:

  A. the # of LOADs, the # of Insns of the routine will be increased,
  B. additional calls to internal routines.

As a result:
  Inlining decision and some IPA analysis might be changed due to these 
changes. 

We need to adjust inlining heurisic and IPA analysis properly.

3. The patch sets:

We will provide the following patches:

  3.1 Provide counted_by attribute to flexible array member field;
      which includes:
      * "counted_by" attribute documentation;  
      * C FE handling of the new attribute; 
        syntax checking, error reporting;
      * testing cases;

  3.2 Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.  
      which includes:
      * The definition of the new internal function .ACCESS_WITH_SIZE in 
internal-fn.def. 
      * C FE converts every access to a FAM with "counted_by" attribute to a 
call to the internal function .ACCESS_WITH_SIZE. (where in the FE, which 
routine should look at???)
      * Convert every call to .ACCESS_WITH_SIZE to its first argument. where 
should we do this? in the end of the 2nd object size phase or in the expand 
phase? (I prefer to do this in the end of the 2nd object size phase, 
comments???)
      * adjust inlining heuristic for the additional call and loads (???).
      * Other phases need to be adjusted (???).
      * verify a call to .ACCESS_WITH_SIZE in tree-cfg.cc.
      * provide utility routines to get the size or ACCESS_MODE from the call 
to .ACCESS_WITH_SIZE. 

  3.3 Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only)
      which includes: 
      * use the size info of the .ACCESS_WITH_SIZE for sub-object.
      * testing cases.

  3.4 Use the .ACCESS_WITH_SIZE in bound sanitizer
      which includes:
      * use the size info of the .ACCESS_WITH_SIZE for bound sanitizer.
      * testing cases.

  3.5 Improve __bdos to use the counted_by info in whole-object size for the 
structure with FAM. 
  3.6 Support for dynamic array of FMA (???)  
      https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634568.html

  3.7 Emit warnings when the user breaks the requirments for the new counted_by 
attribute
      compilation time: -Wcounted-by
      run time: -fsanitizer=counted-by
      
      * The setting to the size field should be done before the first reference 
to the FAM field. 
      * the array has at least # of elements specified by the size field all 
the time during the program.  
 

4. Future work
  
  4.1 Use the new .ACCESS_WITH_SIZE for other relative attributes, for example, 
"access" and "alloc_size", to resolve the known issues for them:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503

  4.2 Add "counted_by" attribute to general pointers as proposed by the 
-fbounds-safety proposal to LLVM from Apple 
(https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854).
  

the .ACCESS_WITH_SIZE approach should be adopted naturally. 


Appendex 1. A more complicated example: 

  1 #include <malloc.h>
  2 struct A
  3 {
  4  size_t size;
  5  char buf[] __attribute__((counted_by(size)));
  6 };
  7 
  8 static size_t
  9 get_size_from (void *ptr)
 10 {
 11  return __builtin_dynamic_object_size (ptr, 1);
 12 }
 13 
 14 void
 15 foo (size_t sz)
 16 {
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
 21  return;
 22 }
 23 
 24 int main ()
 25 {
 26  foo (20);
 27  return 0;
 28 }

In this example, the instantiation of the object at line 17 is not in the same 
routine as the _BDOS call at line 11. 

Appendex 2. Other approaches that have been discussed:

A.  Add an additional argument, the size parameter, to the call to __bdos, 
      A.1, during FE;
      A.2, during gimplification phase;
B.  Encode the implicit USE in the type of size, to make the size as “volatile”;
C.  Encode the implicit USE in the type of buf, then update the optimization 
passes to use this implicit USE encoded in the type of buf.

** Approach A (both A.1 and A.2) will not work if the object instantiation is 
not in the same routine as the __bdos call as shown in the above example;
** Approach B will have a bigger performance impact due to the "volatile" will 
inhibit a lot of optimizations; 
** Approach C will involve a lot of changes in GCC, and also not very necessary 
since the ONLY implicit use of the size parameter is in the __bdos call when 
__bdos can see the real object.

Reply via email to