Hi,

This is the 9th version of the patch set to extend "counted_by" attribute
 to pointer fields of structures, which fixes PR120929:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120929

The 7th version of the patch has been committed into trunk, but triggered
PR120929. I reverted the patches from trunk due to PR120929.  

In order to fix PR120929, we agreed on the following solution:

for a pointer field with counted_by attribute:

struct S {
  int n;
  int *p __attribute__((counted_by(n)));
} *f;

when generating call to .ACCESS_WITH_SIZE for f->p, instead of generating
 *.ACCESS_WITH_SIZE (&f->p, &f->n,...)

in the 7th patch,

We should generate
 .ACCESS_WITH_SIZE (f->p, &f->n,...)

i.e., 
the return type and the type of the first argument of the call is the 
   original pointer type in this version,
   instead of the pointer to the original pointer type in the 7th version;

However, this code generation might bring undefined behavior into the
applicaiton if the call to .ACCESS_WITH_SIZE is generated for a pointer
field reference when this refernece is written to.

For example:

f->p = malloc (size);

***** the IL for the above is:

  tmp1 = f->p;
  tmp2 = &f->n;
  tmp3 = .ACCESS_WITH_SIZE (tmp1, tmp2, ...);
  tmp4 = malloc (size);
  tmp3 = tmp4;

In the above, in order to generate a call to .ACCESS_WITH_SIZE for the pointer
reference f->p,  the new GIMPLE tmp1 = f->p is necessary to pass the value of
the pointer f->p to the call to .ACCESS_WITH_SIZE. However, this new GIMPLE is
the one that brings UB into the application since the value of f->p is not
initialized yet when it is assigned to "tmp1".

the above IL will be expanded to the following when .ACCESS_WITH_SIZE is 
expanded
to its first argument:

  tmp1 = f->p;
  tmp2 = &f->n;
  tmp3 = tmp1;
  tmp4 = malloc (size);
  tmp3 = tmp4;

the final optimized IL will be:

  tmp3 = f->p;
  tmp3 = malloc (size);;

As a result, the f->p will NOT be set correctly to the pointer
returned by malloc (size).

Due to this potential issue, We will need to selectively generate the call to
.ACCESS_WITH_SIZE for f->p according to whether it's a read or a write.

We will only generate call to .ACCESS_WITH_SIZE for f->p when it's a read in
C FE.

In addition to the above major changes, other minor changes include:

A. add the testing case pr120929.c, and some more testing cases.
B. delete the change in gcc/tree-object-size.cc, no any change is
   needed now in middle-end;
C. adjust the patch for using the counted_by attribute in array bound checker
   per the new format of .ACCESS_WITH_SIZE.
D. add a new 4th patch: 
    Generate a call to a .ACCESS_WITH_SIZE for a FAM with counted_by attribute
    only when it's read from.
 
In patch #1, the changes in c-attribs.cc, c-decl.cc and doc/extend.texi are
exactly the same as the 7th version, which has been reviewed and approved by
Joseph. The other changes in c-tree.h, c-parser.cc, c-typeck.cc are new changes
for the new code generation for call to .ACCESS_WITH_SIZE for pointers. Also,
I added two more new testing cases, pointer-counted-by-8.c and
pointer-counted-by-9.c.

In patch #2, compared to 7th version, the change in gcc/tree-object-size.cc
is completely deleted. no need change middle end to use the counted_by
of pointers in tree-object-size.cc. except the new testing case pr120929.c,
all the other testing cases are kept the same as tghe 7th version.

In Patch #3, compared to the 7th version, the following is the diff:

=============================
>From ff3dd7ebc0e41dff13ce47aefdcef609aac541fa Mon Sep 17 00:00:00 2001
From: Qing Zhao <qing.z...@oracle.com>
Date: Fri, 11 Jul 2025 14:54:21 +0000
Subject: [PATCH] fix ubsan per the new design

---
 gcc/c-family/c-gimplify.cc |  3 +--
 gcc/c-family/c-ubsan.cc    | 27 +++++++++++----------------
 2 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
index e905059708f7..131eca8297f8 100644
--- a/gcc/c-family/c-gimplify.cc
+++ b/gcc/c-family/c-gimplify.cc
@@ -74,8 +74,7 @@ static bool
 is_address_with_access_with_size (tree tp)
 {
   if (TREE_CODE (tp) == POINTER_PLUS_EXPR
-      && (TREE_CODE (TREE_OPERAND (tp, 0)) == INDIRECT_REF)
-      && (is_access_with_size_p (TREE_OPERAND (TREE_OPERAND (tp, 0), 0))))
+      && is_access_with_size_p (TREE_OPERAND (tp, 0)))
        return true;
   return false;
 }
diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 8d44b4ba39b4..97bb5459fbfc 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -550,7 +550,7 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
 
 
 /* Instrument array bounds for the pointer array address which is
-   an INDIRECT_REF to the call to .ACCESS_WITH_SIZE.  We create special
+   a call to .ACCESS_WITH_SIZE.  We create special
    builtin, that gets expanded in the sanopt pass, and make an array
    dimention of it.  POINTER_ADDR is the pointer array's base address.
    *INDEX is an index to the array.
@@ -563,8 +563,7 @@ ubsan_instrument_bounds_pointer_address (location_t loc, 
tree pointer_addr,
                                         tree *index,
                                         bool ignore_off_by_one)
 {
-  gcc_assert (TREE_CODE (pointer_addr) == INDIRECT_REF);
-  tree call = TREE_OPERAND (pointer_addr, 0);
+  tree call = pointer_addr;
   if (!is_access_with_size_p (call))
     return NULL_TREE;
   tree bound = get_bound_from_access_with_size (call);
@@ -593,7 +592,7 @@ ubsan_instrument_bounds_pointer_address (location_t loc, 
tree pointer_addr,
   tree itype = build_range_type (sizetype, size_zero_node, NULL_TREE);
   /* The array's element type can be get from the return type of the call to
      .ACCESS_WITH_SIZE.  */
-  tree element_type = TREE_TYPE (TREE_TYPE (TREE_TYPE (call)));
+  tree element_type = TREE_TYPE (TREE_TYPE (call));
   tree array_type = build_array_type (element_type, itype);
   /* Create a "(T *) 0" tree node to describe the array type.  */
   tree zero_with_type = build_int_cst (build_pointer_type (array_type), 0);
@@ -702,7 +701,7 @@ get_index_from_offset (tree offset, tree *index_p,
 }
 
 /* For an pointer + offset computation expression, such as,
-   *.ACCESS_WITH_SIZE (p->c, &p->b, 1, 0, -1, 0B)
+   .ACCESS_WITH_SIZE (p->c, &p->b, 1, 0, -1, 0B)
     + (sizetype) ((long unsigned int) index * 4
    Return the index of this pointer array reference,
    set the parent tree of INDEX to *INDEX_P.
@@ -714,15 +713,13 @@ get_index_from_pointer_addr_expr (tree pointer, tree 
*index_p, int *index_n)
 {
   *index_p = NULL_TREE;
   *index_n = -1;
-  if (TREE_CODE (TREE_OPERAND (pointer, 0)) != INDIRECT_REF)
-    return NULL_TREE;
-  tree call = TREE_OPERAND (TREE_OPERAND (pointer, 0), 0);
+  tree call = TREE_OPERAND (pointer, 0);
   if (!is_access_with_size_p (call))
     return NULL_TREE;
 
   /* Get the pointee type of the call to .ACCESS_WITH_SIZE.
      This should be the element type of the pointer array.  */
-  tree pointee_type = TREE_TYPE (TREE_TYPE (TREE_TYPE (call)));
+  tree pointee_type = TREE_TYPE (TREE_TYPE (call));
   tree pointee_size = TYPE_SIZE_UNIT (pointee_type);
 
   tree index_exp = TREE_OPERAND (pointer, 1);
@@ -758,11 +755,11 @@ get_index_from_pointer_addr_expr (tree pointer, tree 
*index_p, int *index_n)
 /* Return TRUE when the EXPR is a pointer array address that could be
    instrumented.
    We only instrument an address computation similar as the following:
-      *.ACCESS_WITH_SIZE (p->c, &p->b, 1, 0, -1, 0B)
+      .ACCESS_WITH_SIZE (p->c, &p->b, 1, 0, -1, 0B)
        + (sizetype) ((long unsigned int) index * 4)
    if the EXPR is instrumentable, return TRUE and
    set the index to *INDEX.
-   set the *.ACCESS_WITH_SIZE to *BASE.
+   set the .ACCESS_WITH_SIZE to *BASE.
    set the parent tree of INDEX to *INDEX_P.
    set the operand position of the INDEX in the parent tree to INDEX_N.  */
 
@@ -772,17 +769,15 @@ is_instrumentable_pointer_array_address (tree expr, tree 
*base,
                                         int *index_n)
 {
   /* For a poiner array address as:
-       (*.ACCESS_WITH_SIZE (p->c, &p->b, 1, 0, -1, 0B)
+       .ACCESS_WITH_SIZE (p->c, &p->b, 1, 0, -1, 0B)
          + (sizetype) ((long unsigned int) index * 4)
-     op0 is the call to *.ACCESS_WITH_SIZE;
+     op0 is the call to .ACCESS_WITH_SIZE;
      op1 is the index.  */
   if (TREE_CODE (expr) != POINTER_PLUS_EXPR)
     return false;
 
   tree op0 = TREE_OPERAND (expr, 0);
-  if (TREE_CODE (op0) != INDIRECT_REF)
-    return false;
-  if (!is_access_with_size_p (TREE_OPERAND (op0, 0)))
+  if (!is_access_with_size_p (op0))
     return false;
   tree op1 = get_index_from_pointer_addr_expr (expr, index_p, index_n);
   if (op1 != NULL_TREE)
-- 
2.43.5

The whole patch set has been bootstrapped and regression tested on both aarch64 
and x86.

Okay for trunk?
 
Thanks a lot.

Qing

Reply via email to