> On Apr 7, 2025, at 13:43, Siddhesh Poyarekar <siddh...@gotplt.org> wrote: > > On 2025-04-07 11:56, Qing Zhao wrote: >> when the type is a structure with flexible array member. > > Not just when the structure has a flexible array member, also when the FAM > size is recorded in a __counted_by__, isn't it?
Yes, you are right. in the code, I included all these checks. > >> In tree-object-size.cc, if the size is UNKNOWN after evaluating use-def >> chain, We can evaluate the SIZE of the pointee TYPE ONLY when this TYPE >> is a structure type with flexible array member, since a structure with >> FAM can not be an element of an array, so, the pointer must point to a >> single object with this structure with FAM. >> This is only available for C now. >> bootstrapped and regression tested on both x86 and aarch64. >> Okay for stage1? >> thanks. >> Qing >> gcc/c/ChangeLog: >> * c-lang.cc (LANG_HOOKS_BUILD_COUNTED_BY_REF): >> Define to below function. >> * c-tree.h (c_build_counted_by_ref): New extern function. >> * c-typeck.cc (build_counted_by_ref): Rename to ... >> (c_build_counted_by_ref): ...this. >> (handle_counted_by_for_component_ref): Call the renamed function. >> gcc/ChangeLog: >> * langhooks-def.h (LANG_HOOKS_BUILD_COUNTED_BY_REF): >> New language hook. >> * langhooks.h (struct lang_hooks_for_types): Add >> build_counted_by_ref. >> * tree-object-size.cc (struct object_size_info): Add a new field >> insert_after. >> (gimplify_size_expressions): Insert sequence after or before >> depending on the new field insert_after. >> (compute_builtin_object_size): Init the new field to false; >> (record_with_fam_object_size): New function. >> (collect_object_sizes_for): Call record_with_fam_object_size. >> gcc/testsuite/ChangeLog: >> * gcc.dg/flex-array-counted-by-3.c: Update test; >> * gcc.dg/flex-array-counted-by-4.c: Likewise. >> * gcc.dg/flex-array-counted-by-5.c: Likewise. >> --- >> gcc/c/c-lang.cc | 3 + >> gcc/c/c-tree.h | 1 + >> gcc/c/c-typeck.cc | 6 +- >> gcc/langhooks-def.h | 4 +- >> gcc/langhooks.h | 5 + >> .../gcc.dg/flex-array-counted-by-3.c | 5 + >> .../gcc.dg/flex-array-counted-by-4.c | 34 ++++-- >> .../gcc.dg/flex-array-counted-by-5.c | 4 + >> gcc/tree-object-size.cc | 106 +++++++++++++++++- >> 9 files changed, 152 insertions(+), 16 deletions(-) >> diff --git a/gcc/c/c-lang.cc b/gcc/c/c-lang.cc >> index c69077b2a93..e9ec9e6e64a 100644 >> --- a/gcc/c/c-lang.cc >> +++ b/gcc/c/c-lang.cc >> @@ -51,6 +51,9 @@ enum c_language_kind c_language = clk_c; >> #undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE >> #define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE c_get_sarif_source_language >> +#undef LANG_HOOKS_BUILD_COUNTED_BY_REF >> +#define LANG_HOOKS_BUILD_COUNTED_BY_REF c_build_counted_by_ref >> + >> /* Each front end provides its own lang hook initializer. */ >> struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; >> diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h >> index 743ec5cbae6..66bec5d92fa 100644 >> --- a/gcc/c/c-tree.h >> +++ b/gcc/c/c-tree.h >> @@ -776,6 +776,7 @@ extern struct c_switch *c_switch_stack; >> extern bool null_pointer_constant_p (const_tree); >> +extern tree c_build_counted_by_ref (tree, tree, tree *); >> inline bool >> c_type_variably_modified_p (tree t) >> diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc >> index 19e79b554dc..6efc3fb3e5d 100644 >> --- a/gcc/c/c-typeck.cc >> +++ b/gcc/c/c-typeck.cc >> @@ -2936,8 +2936,8 @@ should_suggest_deref_p (tree datum_type) >> &(p->k) >> */ >> -static tree >> -build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type) >> +tree >> +c_build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type) >> { >> tree type = TREE_TYPE (datum); >> if (!c_flexible_array_member_type_p (TREE_TYPE (subdatum))) >> @@ -3031,7 +3031,7 @@ handle_counted_by_for_component_ref (location_t loc, >> tree ref) >> tree datum = TREE_OPERAND (ref, 0); >> tree subdatum = TREE_OPERAND (ref, 1); >> tree counted_by_type = NULL_TREE; >> - tree counted_by_ref = build_counted_by_ref (datum, subdatum, >> + tree counted_by_ref = c_build_counted_by_ref (datum, subdatum, >> &counted_by_type); >> if (counted_by_ref) >> ref = build_access_with_size_for_counted_by (loc, ref, >> diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h >> index 6b34d324ab5..7909ea8a92c 100644 >> --- a/gcc/langhooks-def.h >> +++ b/gcc/langhooks-def.h >> @@ -221,6 +221,7 @@ extern tree lhd_unit_size_without_reusable_padding >> (tree); >> #define LANG_HOOKS_TYPE_DWARF_ATTRIBUTE lhd_type_dwarf_attribute >> #define LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING >> lhd_unit_size_without_reusable_padding >> #define LANG_HOOKS_CLASSTYPE_AS_BASE hook_tree_const_tree_null >> +#define LANG_HOOKS_BUILD_COUNTED_BY_REF NULL >> #define LANG_HOOKS_FOR_TYPES_INITIALIZER { \ >> LANG_HOOKS_MAKE_TYPE, \ >> @@ -248,7 +249,8 @@ extern tree lhd_unit_size_without_reusable_padding >> (tree); >> LANG_HOOKS_GET_FIXED_POINT_TYPE_INFO, \ >> LANG_HOOKS_TYPE_DWARF_ATTRIBUTE, \ >> LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING, \ >> - LANG_HOOKS_CLASSTYPE_AS_BASE \ >> + LANG_HOOKS_CLASSTYPE_AS_BASE, \ >> + LANG_HOOKS_BUILD_COUNTED_BY_REF \ >> } >> /* Declaration hooks. */ >> diff --git a/gcc/langhooks.h b/gcc/langhooks.h >> index dc8d878c7fb..797c8bcc0c0 100644 >> --- a/gcc/langhooks.h >> +++ b/gcc/langhooks.h >> @@ -190,6 +190,11 @@ struct lang_hooks_for_types >> i.e., type without virtual base classes or tail padding. Returns >> NULL_TREE otherwise. */ >> tree (*classtype_as_base) (const_tree); >> + >> + /* Build a REF to the object that represents the counted_by per the >> attribute >> + counted_by attached to the field. Otherwise return NULL_TREE. Set the >> + TYPE of the counted_by field to the last parameter. */ >> + tree (*build_counted_by_ref) (tree, tree, tree *); >> }; >> /* Language hooks related to decls and the symbol table. */ >> diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c >> b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c >> index 78f50230e89..88ad50bea6b 100644 >> --- a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c >> +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c >> @@ -53,6 +53,11 @@ void __attribute__((__noinline__)) test () >> array_annotated->b * sizeof (int)); >> EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1), >> array_nested_annotated->b * sizeof (int)); >> + EXPECT(__builtin_dynamic_object_size(array_annotated, 1), >> + sizeof (struct annotated) + array_annotated->b * sizeof (int)); >> + EXPECT(__builtin_dynamic_object_size(array_nested_annotated, 1), >> + sizeof (struct nested_annotated) >> + + array_nested_annotated->b * sizeof (int)); >> } >> int main(int argc, char *argv[]) >> diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c >> b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c >> index 20103d58ef5..6ac2be4471d 100644 >> --- a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c >> +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c >> @@ -23,7 +23,15 @@ struct annotated { >> So, __builtin_object_size can not directly use the type of the pointee >> to decide the size of the object the pointer points to. >> - There are only two reliable ways: >> + However, if the pointer points to a strucure with FAM, and the FAM is >> + annotated with counted_by attribute, we can get the size of the pointee >> + object by the size of the structure type and the counted_by attribute >> + since structure with FAM cannot be an element of an array, the pointer >> + that points to such type must point to a single object with the type, >> + therefore, we can decide the size of the object from the size of the >> + type for the pointee. >> + >> + There are other two reliable ways: >> A. observed allocations (call to the allocation functions in the >> routine) >> B. observed accesses (read or write access to the location of the >> pointer points to) >> @@ -155,11 +163,13 @@ int main () >> EXPECT(__builtin_dynamic_object_size(p->array, 2), p->foo * sizeof(char)); >> EXPECT(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(char)); >> /* When checking the pointer p, we have no observed allocation nor >> observed >> - access, therefore, we cannot determine the size info here. */ >> - EXPECT(__builtin_dynamic_object_size(p, 0), -1); >> - EXPECT(__builtin_dynamic_object_size(p, 1), -1); >> - EXPECT(__builtin_dynamic_object_size(p, 2), 0); >> - EXPECT(__builtin_dynamic_object_size(p, 3), 0); >> + access, however, since the pointer points to a strucure with FAM, and >> the >> + FAM is annotated with counted_by attribute, we can get the size of the >> + pointee object by the size of the TYPE and the counted_by attribute. */ >> + EXPECT(__builtin_dynamic_object_size(p, 0), 19); >> + EXPECT(__builtin_dynamic_object_size(p, 1), 19); >> + EXPECT(__builtin_dynamic_object_size(p, 2), 19); >> + EXPECT(__builtin_dynamic_object_size(p, 3), 19); >> /* When checking the access p->array, we only have info on the >> counted-by >> value. */ >> @@ -168,11 +178,13 @@ int main () >> EXPECT(__builtin_dynamic_object_size(q->array, 2), q->foo * sizeof(char)); >> EXPECT(__builtin_dynamic_object_size(q->array, 3), q->foo * sizeof(char)); >> /* When checking the pointer p, we have no observed allocation nor >> observed >> - access, therefore, we cannot determine the size info here. */ >> - EXPECT(__builtin_dynamic_object_size(q, 0), -1); >> - EXPECT(__builtin_dynamic_object_size(q, 1), -1); >> - EXPECT(__builtin_dynamic_object_size(q, 2), 0); >> - EXPECT(__builtin_dynamic_object_size(q, 3), 0); >> + access, however, since the pointer points to a strucure with FAM, and >> the >> + FAM is annotated with counted_by attribute, we can get the size of the >> + pointee object by the size of the TYPE and the counted_by attribute. */ >> + EXPECT(__builtin_dynamic_object_size(q, 0), 29); >> + EXPECT(__builtin_dynamic_object_size(q, 1), 29); >> + EXPECT(__builtin_dynamic_object_size(q, 2), 29); >> + EXPECT(__builtin_dynamic_object_size(q, 3), 29); >> DONE (); >> } >> diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-5.c >> b/gcc/testsuite/gcc.dg/flex-array-counted-by-5.c >> index 68f9b0f7c8d..44bfce9976f 100644 >> --- a/gcc/testsuite/gcc.dg/flex-array-counted-by-5.c >> +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-5.c >> @@ -38,6 +38,10 @@ void __attribute__((__noinline__)) test () >> { >> EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1), 0); >> EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1), 0); >> + EXPECT(__builtin_dynamic_object_size(array_annotated, 1), >> + sizeof (struct annotated)); >> + EXPECT(__builtin_dynamic_object_size(array_nested_annotated, 1), >> + sizeof (struct nested_annotated)); >> } >> int main(int argc, char *argv[]) >> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc >> index f348673ae75..5ff2da18f50 100644 >> --- a/gcc/tree-object-size.cc >> +++ b/gcc/tree-object-size.cc >> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3. If not see >> #include "fold-const.h" >> #include "tree-object-size.h" >> #include "gimple-iterator.h" >> +#include "langhooks.h" >> #include "gimple-fold.h" >> #include "tree-cfg.h" >> #include "tree-dfa.h" >> @@ -45,6 +46,7 @@ struct object_size_info >> int object_size_type; >> unsigned char pass; >> bool changed; >> + bool insert_after; >> bitmap visited, reexamine; >> unsigned int *depths; >> unsigned int *stack, *tos; >> @@ -1174,7 +1176,10 @@ gimplify_size_expressions (object_size_info *osi) >> gsi = gsi_for_stmt (stmt); >> force_gimple_operand (size_expr, &seq, true, NULL); >> - gsi_insert_seq_before (&gsi, seq, GSI_CONTINUE_LINKING); >> + if (osi->insert_after) >> + gsi_insert_seq_after (&gsi, seq, GSI_CONTINUE_LINKING); >> + else >> + gsi_insert_seq_before (&gsi, seq, GSI_CONTINUE_LINKING); >> } >> } >> @@ -1279,6 +1284,7 @@ compute_builtin_object_size (tree ptr, int >> object_size_type, >> expression needs to be gimplified. */ >> osi.pass = 0; >> osi.changed = false; >> + osi.insert_after = false; >> collect_object_sizes_for (&osi, ptr); >> if (object_size_type & OST_DYNAMIC) >> @@ -1768,6 +1774,89 @@ phi_dynamic_object_size (struct object_size_info >> *osi, tree var) >> object_sizes_set (osi, varno, sizes, wholesizes); >> } >> +/* Compute an object size expression for VAR, whose pointee TYPE is a >> + structure with flexible array member. */ >> + >> +static void >> +record_with_fam_object_size (struct object_size_info *osi, tree var) >> +{ >> + int object_size_type = osi->object_size_type; >> + tree size = size_unknown (object_size_type); >> + >> + tree pointee_type = TREE_TYPE (TREE_TYPE (var)); >> + tree counted_by_ref = NULL_TREE; >> + tree counted_by_type = NULL_TREE; >> + tree last = NULL_TREE; >> + >> + if (TREE_CODE (pointee_type) == RECORD_TYPE >> + && flexible_array_type_p (pointee_type)) >> + { >> + for (tree x = TYPE_FIELDS (pointee_type); x != NULL_TREE; >> + x = DECL_CHAIN (x)) >> + if (TREE_CODE (x) == FIELD_DECL >> + && TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE >> + && TYPE_SIZE (TREE_TYPE (x)) == NULL_TREE >> + && TYPE_DOMAIN (TREE_TYPE (x)) != NULL_TREE >> + && TYPE_MAX_VALUE (TYPE_DOMAIN (TREE_TYPE (x))) == NULL_TREE >> + && DECL_CHAIN (x) == NULL_TREE) >> + last = x; >> + /* Check whether the last FAM field has a counted_by attribute. >> + If so, build a counted_by reference. */ >> + if (last >> + && lookup_attribute ("counted_by", DECL_ATTRIBUTES (last)) >> + && lang_hooks.types.build_counted_by_ref) >> + { >> + tree datum = build1 (INDIRECT_REF, pointee_type, var); >> + counted_by_ref >> + = lang_hooks.types.build_counted_by_ref (datum, last, >> + &counted_by_type); >> + } >> + } >> + >> + /* If the counted_by reference is available, the size of the whole >> structure >> + can be computed. */ >> + if (counted_by_ref) >> + { >> + tree element_type = TREE_TYPE (TREE_TYPE (last)); >> + tree element_size = TYPE_SIZE_UNIT (element_type); >> + size = fold_build2 (MEM_REF, counted_by_type, counted_by_ref, >> + build_int_cst (ptr_type_node, 0)); >> + /* If counted_by is a negative value, treat it as zero. */ >> + if (!TYPE_UNSIGNED (counted_by_type)) >> + { >> + tree cond_expr = fold_build2 (LT_EXPR, boolean_type_node, >> + unshare_expr (size), >> + build_zero_cst (counted_by_type)); >> + size = fold_build3 (COND_EXPR, integer_type_node, cond_expr, >> + build_zero_cst (counted_by_type), size); >> + } >> + >> + /* The total size of the whole object is computed as: >> + MAX (sizeof (pointee_type), >> + offsetof (pointee_type, last) + counted_by * element_size). */ >> + size = size_binop (MULT_EXPR, >> + fold_convert (sizetype, size), >> + fold_convert (sizetype, element_size)); >> + size = size_binop (PLUS_EXPR, >> + byte_position (last), >> + size); >> + size = size_binop (MAX_EXPR, >> + TYPE_SIZE_UNIT (pointee_type), >> + size); >> + if (!todo) >> + todo = TODO_update_ssa_only_virtuals; >> + >> + } >> + /* Initialize to 0 for maximum size and M1U for minimum size so that >> + it gets immediately overridden. */ >> + object_sizes_initialize (osi, SSA_NAME_VERSION (var), >> + size_initval (object_size_type), >> + size_initval (object_size_type)); >> + >> + osi->insert_after = true; >> + object_sizes_set (osi, SSA_NAME_VERSION (var), size, size); >> +} >> + >> /* Compute object sizes for VAR. >> For ADDR_EXPR an object size is the number of remaining bytes >> to the end of the object (where what is considered an object depends on >> @@ -1922,6 +2011,21 @@ collect_object_sizes_for (struct object_size_info >> *osi, tree var) >> gcc_unreachable (); >> } >> + /* If the size is UNKNOWN after evaluating use-def chain, We can evaluate >> + * the SIZE of the pointee TYPE ONLY when this TYPE is a structure type >> + * with flexible array member, Since a structure with FAM can not be an >> + * element of an array, so, PTR must point to an single object with this >> + * strucure with FAM. */ > > The comment is imprecise because it's not just FAM, it's FAM's size being > recorded in a __counted_by__. Right. Will update the comments. > > FWIW, there's also a case for MINIMUM object size even when __counted_by__ is > not available, when the pointer can be proven as non-NULL. I don't know > however if that's a very interesting use case. Oh, yeah, that’s right. I can include this case as well. > >> + if (object_sizes_unknown_p (object_size_type, varno) >> + && object_size_type & OST_DYNAMIC >> + && POINTER_TYPE_P (TREE_TYPE (var))) >> + { >> + const_tree pointee_type = TREE_TYPE (TREE_TYPE (var)); >> + if (flexible_array_type_p (pointee_type)) >> + record_with_fam_object_size (osi, var); >> + } > > Is there a reason to do this at the very end like this and not in the > GIMPLE_ASSIGN case in the switch block? Something like this: > > tree rhs = gimple_assign_rhs1 (stmt); > tree counted_by_ref = NULL_TREE; > if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR > || (gimple_assign_rhs_code (stmt) == ADDR_EXPR > && TREE_CODE (TREE_OPERAND (rhs, 0)) == MEM_REF)) > reexamine = plus_stmt_object_size (osi, var, stmt); > else if (gimple_assign_rhs_code (stmt) == COND_EXPR) > reexamine = cond_expr_object_size (osi, var, stmt); > else if (gimple_assign_single_p (stmt) > || gimple_assign_unary_nop_p (stmt)) > { > if (TREE_CODE (rhs) == SSA_NAME > && POINTER_TYPE_P (TREE_TYPE (rhs))) > reexamine = merge_object_sizes (osi, var, rhs); > else > expr_object_size (osi, var, rhs); > } > + else if ((counted_by_ref = fam_struct_with_counted_by (rhs))) > + record_fam_object_size (osi, var, counted_by_ref); > else > unknown_object_size (osi, var); > > where you can then fold in all your gating conditions, including getting the > counted_by ref into the fam_struct_with_counted_by and then limit the > record_fam_object_size to just evaluating the type size + counted_by size. > > This may even help avoid the insertion order hack you have to do, i.e. the > gsi_insert_seq_before vs gsi_insert_seq_after. This is a good suggestion. I will try this to see any issue there. My initial thought is to give the counted_by information the lowest priority if there are other information (for example, malloc) available. Do you see any issue here? > > Also, it seems like simply making build_counted_by_ref available may be > unnecessary and maybe you could explore associating the counted_by > component_ref with the parent type somehow. Alternatively, how about > building an .ACCESS_WITH_SIZE for types with FAM that have a counted_by? > That would allow the current access_with_size_object_size() to work out of > the box. I am not sure about this though. Our initial design is to change every component_ref (which is a reference to the FAM) in the data flow of the routine that has a counted_by attribute to a call to .ACCESS_WITH_SIZE. Then put this call to .ACCESS_WITH_SIZE into the data flow of the routine. Now, if we try to associate counted_by information to the parent type, how can we add such information To the data flow of the routine if there is no explicit reference to the array itself? Thanks. Qing > >> + >> /* Dynamic sizes use placeholder temps to return an answer, so it is >> always >> safe to set COMPUTED for them. */ >> if ((object_size_type & OST_DYNAMIC) > > Thanks, > Sid