> On Jan 29, 2024, at 10:50 AM, Martin Uecker <uec...@tugraz.at> wrote: > > Am Montag, dem 29.01.2024 um 15:09 +0000 schrieb Qing Zhao: >> Thank you! >> >> Joseph and Richard, could you also comment on this? >> >>> On Jan 28, 2024, at 5:09 AM, Martin Uecker <uec...@tugraz.at> wrote: >>> >>> Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao: >>>> >>>>> On Jan 26, 2024, at 3:04 AM, Martin Uecker <uec...@tugraz.at> wrote: >>>>> >>>>> >>>>> I haven't looked at the patch, but it sounds you give the result >>>>> the wrong type. Then patching up all use cases instead of the >>>>> type seems wrong. >>>> >>>> Yes, this is for resolving a very early gimplification issue as I reported >>>> last Nov: >>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html >>>> >>>> Since no-one responded at that time, I fixed the issue by replacing the >>>> ARRAY_REF >>>> With a pointer indirection: >>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html >>>> >>>> The reason for such change is: return a flexible array member TYPE is not >>>> allowed >>>> by C language (our gimplification follows this rule), so, we have to >>>> return a pointer TYPE instead. >>>> >>>> ******The new internal function >>>> >>>> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE, >>>> ACCESS_MODE, INDEX) >>>> >>>> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL) >>>> >>>> which returns the "REF_TO_OBJ" same as the 1st argument; >>>> >>>> Both the return type and the type of the first argument of this function >>>> have been converted from >>>> the incomplete array type to the corresponding pointer type. >>>> >>>> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the >>>> original INDEX of the ARRAY_REF was lost >>>> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX >>>> for bound sanitizer instrumentation, I added >>>> The 6th argument “INDEX”. >>>> >>>> What’s your comment and suggestion on this solution? >>> >>> I am not entirely sure but changing types in the FE seems >>> problematic because this breaks language semantics. And >>> then adding special code everywhere to treat it specially >>> in the FE does not seem a good way forward. >>> >>> If I understand correctly, returning an incomplete array >>> type is not allowed and then fails during gimplification. >> >> Yes, this is the problem in gimplification. >> >>> So I would suggest to make it return a pointer to the >>> incomplete array (and not the element type) >> >> >> for the following: >> >> struct annotated { >> unsigned int size; >> int array[] __attribute__((counted_by (size))); >> }; >> >> struct annotated * p = …. >> p->array[9] = 0; >> >> The IL for the above array reference p->array[9] is: >> >> 1. If the return type is the original incomplete array type, >> >> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0; >> >> (this triggered the gimplification failure since the return type cannot be a >> complete type). >> >> 2. When the return type is changed to a pointer to the element type of the >> incomplete array, (the current patch) >> Then the original array reference naturally becomes an indirect reference >> through the pointer >> >> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0; >> >> Since the original array reference becomes an indirect reference through the >> pointer to the element array, the INDEX info >> is mixed into the OFFSET of the indirect reference and lost, so, I added the >> 6th argument to the routine .ACCESS_WITH_SIZE >> to record the INDEX. >> >> 3. With your suggestion, the return type is changed to a pointer to the >> incomplete array, >> I just tried this to change the result type : >> >> >> --- a/gcc/c/c-typeck.cc >> +++ b/gcc/c/c-typeck.cc >> @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, >> tree ref, >> tree counted_by_type) >> { >> gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref))); >> - tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref))); >> + tree result_type = build_pointer_type (TREE_TYPE (ref)); >> >> Then, I got the following FE errors: >> >> test.c:10:11: error: invalid use of flexible array member >> 10 | p->array[9] = 0; >> >> The reason for the error is: when the original array_ref becomes an >> indirect_ref through the pointer to the incomplete array, >> During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT >> (type) is invalid since the type is an incomplete array. >> As a result, the OFFSET cannot computed for the indirect_ref. >> >> Looks like even more issues with this approach. > > Yes, but only because the following is missing: > >> >> >>> but then wrap >>> it with an indirection when inserting this code in the FE >>> so that the full replacement has the correct type again >>> (of the incomplete array). >> >> I don’t quite understand the above, could you please explain this in more >> details? (If possible, could you please use the above small example?) >> thanks. > > You would need to add an indirection: > > (*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)))[9] = 0; > > if .ACCESS_WITH_SIZE has type T (*)[], i.e. pointer to incomplete > array of type T, then (*(.ACCESS_WITH_SIZE (...))) has type T[], i.e. > incomplete array of type. > > And you shouldn't even consider array derefencing part at all, > but replace the p->array when the component ref is constructed.
Thanks, I see now. I just updated the routine “build_access_with_size_for_counted_by” as following: --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref, tree counted_by_type) { gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref))); - tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref))); + tree result_type = build_pointer_type (TREE_TYPE (ref)); unsigned int counted_by_precision = TYPE_PRECISION (counted_by_type); tree call @@ -2632,6 +2632,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref, counted_by_precision), build_int_cst (integer_type_node, -1), build_int_cst (integer_type_node, -1)); + call = build1 (INDIRECT_REF, TREE_TYPE (ref), call); SET_EXPR_LOCATION (call, loc); return call; } This works for the small testing case, the generated IR is: (*.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, -1))[9] = 0; I will do more testings and adjust my patch on this change. Thanks a lot for your help. Qing > > Martin > > > >> >>> >>> >>> Alternatively, one could allow this during gimplification >>> or add some conversion. >> >> Allowing this in gimplification might trigger some other issues. I guess >> that adding conversion >> in the end of the FE or in the beginning of gimplification might be better. >> >> i.e, in FE, still keep the original incomplete array type as the return >> type for the routine .ACCESS_WITH_SIZE, i.e >> >> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0; >> >> But add a conversion from the above array_ref to an indirect_ref in the end >> of FE or in the beginning of gimplification: >> >> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1) + 36) = 0; >> >> With this approach, during FE, the original ARRAY TYPE and the INDEX can be >> kept, the array bound sanitizer instrumentation >> Will be much easier than my current approach. >> >> Is this approach reasonable? >> >> If so, where is better to add this conversion, in the end of FE or in the >> beginning of gimplification? >> >> Thanks. >> >> Qing >> >> >>> >>> Martin >>> >>> >>>> >>>> Thanks. >>>> >>>> Qing >>>> >>>> >>>>> >>>>> Martin >>>>> >>>>> >>>>> Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao: >>>>>> Thanks a lot for the testing. >>>>>> >>>>>> Yes, I can repeat the issue with the following small example: >>>>>> >>>>>> #include <stdlib.h> >>>>>> #include <stddef.h> >>>>>> #include <stdio.h> >>>>>> >>>>>> #define MAX(a, b) ((a) > (b) ? (a) : (b)) >>>>>> >>>>>> struct untracked { >>>>>> int size; >>>>>> int array[] __attribute__((counted_by (size))); >>>>>> } *a; >>>>>> struct untracked * alloc_buf (int index) >>>>>> { >>>>>> struct untracked *p; >>>>>> p = (struct untracked *) malloc (MAX (sizeof (struct untracked), >>>>>> (offsetof (struct untracked, >>>>>> array[0]) >>>>>> + (index) * sizeof (int)))); >>>>>> p->size = index; >>>>>> return p; >>>>>> } >>>>>> >>>>>> int main() >>>>>> { >>>>>> a = alloc_buf(10); >>>>>> printf ("same_type is %d\n", >>>>>> (__builtin_types_compatible_p(typeof (a->array), typeof >>>>>> (&(a->array)[0])))); >>>>>> return 0; >>>>>> } >>>>>> >>>>>> >>>>>> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c >>>>>> same_type is 1 >>>>>> >>>>>> Looks like that the “typeof” operator need to be handled specially in C >>>>>> FE >>>>>> for the new internal function .ACCESS_WITH_SIZE. >>>>>> >>>>>> (I have specially handle the operator “offsetof” in C FE already). >>>>>> >>>>>> Will fix this issue. >>>>>> >>>>>> Thanks. >>>>>> >>>>>> Qing >>>>>> >>>>>>> On Jan 24, 2024, at 7:51 PM, Kees Cook <keesc...@chromium.org> wrote: >>>>>>> >>>>>>> On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote: >>>>>>>> This is the 4th version of the patch. >>>>>>> >>>>>>> Thanks very much for this! >>>>>>> >>>>>>> I tripped over an unexpected behavioral change that the Linux kernel >>>>>>> depends on: >>>>>>> >>>>>>> __builtin_types_compatible_p() no longer treats an array marked with >>>>>>> counted_by as different from that array's decayed pointer. Specifically, >>>>>>> the kernel uses these macros: >>>>>>> >>>>>>> >>>>>>> /* >>>>>>> * Force a compilation error if condition is true, but also produce a >>>>>>> * result (of value 0 and type int), so the expression can be used >>>>>>> * e.g. in a structure initializer (or where-ever else comma expressions >>>>>>> * aren't permitted). >>>>>>> */ >>>>>>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) >>>>>>> >>>>>>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), >>>>>>> typeof(b)) >>>>>>> >>>>>>> /* &a[0] degrades to a pointer: a different type from an array */ >>>>>>> #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), >>>>>>> &(a)[0])) >>>>>>> >>>>>>> >>>>>>> This gets used in various places to make sure we're dealing with an >>>>>>> array for a macro: >>>>>>> >>>>>>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + >>>>>>> __must_be_array(arr)) >>>>>>> >>>>>>> >>>>>>> So this builds: >>>>>>> >>>>>>> struct untracked { >>>>>>> int size; >>>>>>> int array[]; >>>>>>> } *a; >>>>>>> >>>>>>> __must_be_array(a->array) >>>>>>> => 0 (as expected) >>>>>>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0])) >>>>>>> => 0 (as expected, array vs decayed array pointer) >>>>>>> >>>>>>> >>>>>>> But if counted_by is added, we get a build failure: >>>>>>> >>>>>>> struct tracked { >>>>>>> int size; >>>>>>> int array[] __counted_by(size); >>>>>>> } *b; >>>>>>> >>>>>>> __must_be_array(b->array) >>>>>>> => build failure (not expected) >>>>>>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0])) >>>>>>> => 1 (not expected, both pointers?) >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Kees Cook >> >> > > -- > Univ.-Prof. Dr. rer. nat. Martin Uecker > Graz University of Technology > Institute of Biomedical Imaging