Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

Qing Zhao Thu, 26 Oct 2023 09:41:29 -0700


> On Oct 26, 2023, at 5:20 AM, Martin Uecker <[email protected]> wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker <[email protected]> wrote:
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>>>> 
>>>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker <[email protected]>:
>>>>> 
>>>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>>>>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>>>>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>>>>>>> 
>>>>>>>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker <[email protected]>:
>>>>>>>>> 
>>>>>>>>> Am Dienstag, dem 24.10.2023 um 20:30 +0000 schrieb Qing Zhao:
>>>>>>>>>> Hi, Sid,
>>>>>>>>>> 
>>>>>>>>>> Really appreciate for your example and detailed explanation. Very 
>>>>>>>>>> helpful.
>>>>>>>>>> I think that this example is an excellent example to show (almost) 
>>>>>>>>>> all the issues we need to consider.
>>>>>>>>>> 
>>>>>>>>>> I slightly modified this example to make it to be compilable and 
>>>>>>>>>> run-able, as following:
>>>>>>>>>> (but I still cannot make the incorrect reordering or DSE happening, 
>>>>>>>>>> anyway, the potential reordering possibility is there…)
>>>>>>>>>> 
>>>>>>>>>> 1 #include <malloc.h>
>>>>>>>>>> 2 struct A
>>>>>>>>>> 3 {
>>>>>>>>>> 4  size_t size;
>>>>>>>>>> 5  char buf[] __attribute__((counted_by(size)));
>>>>>>>>>> 6 };
>>>>>>>>>> 7
>>>>>>>>>> 8 static size_t
>>>>>>>>>> 9 get_size_from (void *ptr)
>>>>>>>>>> 10 {
>>>>>>>>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>>>>>>>>> 12 }
>>>>>>>>>> 13
>>>>>>>>>> 14 void
>>>>>>>>>> 15 foo (size_t sz)
>>>>>>>>>> 16 {
>>>>>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>>>>>>> sizeof(char));
>>>>>>>>>> 18  obj->size = sz;
>>>>>>>>>> 19  obj->buf[0] = 2;
>>>>>>>>>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>>>>>>>>>> 21  return;
>>>>>>>>>> 22 }
>>>>>>>>>> 23
>>>>>>>>>> 24 int main ()
>>>>>>>>>> 25 {
>>>>>>>>>> 26  foo (20);
>>>>>>>>>> 27  return 0;
>>>>>>>>>> 28 }
>>>>>>>>>> 
>>>>>> 
>>>>>> <snip>
>>>>>> 
>>>>>>>> When it’s set I suppose.  Turn
>>>>>>>> 
>>>>>>>> X.l = n;
>>>>>>>> 
>>>>>>>> Into
>>>>>>>> 
>>>>>>>> X.l = __builtin_with_size (x.buf, n);
>>>>>>> 
>>>>>>> It would turn
>>>>>>> 
>>>>>>> some_variable = (&) x.buf
>>>>>>> 
>>>>>>> into
>>>>>>> 
>>>>>>> some_variable = __builtin_with_size ( (&) x.buf. x.len)
>>>>>>> 
>>>>>>> 
>>>>>>> So the later access to x.buf and not the initialization
>>>>>>> of a member of the struct (which is too early).
>>>>>>> 
>>>>>> 
>>>>>> Hmm, so with Qing's example above, are you suggesting the transformation
>>>>>> be to foo like so:
>>>>>> 
>>>>>> 14 void
>>>>>> 15 foo (size_t sz)
>>>>>> 16 {
>>>>>> 16.5  void * _1;
>>>>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>>>>> sizeof(char));
>>>>>> 18  obj->size = sz;
>>>>>> 19  obj->buf[0] = 2;
>>>>>> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
>>>>>> 20  __builtin_printf (“%d\n", get_size_from (_1));
>>>>>> 21  return;
>>>>>> 22 }
>>>>>> 
>>>>>> If yes then this could indeed work.  I think I got thrown off by the
>>>>>> reference to __bdos.
>>>>> 
>>>>> Yes. I think it is important not to evaluate the size at the
>>>>> access to buf and not the allocation, because the point is to
>>>>> recover it from the size member even when the compiler can't
>>>>> see the original allocation.
>>>> 
>>>> But if the access is through a pointer without the attribute visible
>>>> even the Frontend cannot recover?
>>> 
>>> Yes, if the access is using a struct-with-FAM without the attribute
>>> the FE would not be insert the builtin.  BDOS could potentially
>>> still see the original allocation but if it doesn't, then there is
>>> no information.
>>> 
>>>> We’d need to force type correctness and give up on indirecting
>>>> through an int * when it can refer to two diffenent container types.
>>>> The best we can do I think is mark allocation sites and hope for
>>>> some basic code hygiene (not clobbering size or array pointer
>>>> through pointers without the appropriately attributed type)
>>> 
>>> I am do not fully understand what you are referring to.
>> 
>> struct A { int n; int data[n]; };
>> struct B { long n; int data[n]; };
>> 
>> int *p = flag ? a->data : b->data;
>> 
>> access *p;
>> 
>> Since we need to allow interoperability of pointers (a->data is
>> convertible to a non-fat pointer of type int *) this leaves us with
>> ambiguity we need to conservatively handle to avoid false positives.
> 
> For BDOS, I would expect this to work exactly like:
> 
> char aa[n1];
> char bb[n2];
> char *p = flag ? aa : bb;
> 
> (or similar code with malloc). In fact it does:
> 
> https://godbolt.org/z/bK68YKqhe
> (cheating a bit and also the sub-object version of
> BDOS does not seem to work)
> 
>> 
>> We _might_ want to diagnose decay of a->data to int *, but IIRC
>> there's no way (or proposal) to allow declaring a corresponding
>> fat pointer, so it's not a good designed feature.
> 
> As a language feature, I fully agree.  I see the
> counted_by attribute has a makeshift solution.


The “counted_by” attribute is necessary at this moment since
 it will be much easier to be adopted by the existing source code,
 for example, the Linux Kernel. 

Though I agree that embedding the bound information into TYPE 
system  should be the ultimate goal. 

> 
> But we can already do:
> 
> auto p = flag ? &aa : &bb;
> 
> and this already works perfectly:
> 
> https://godbolt.org/z/rvb6xWWPj
> 
> We can also name the variably-modifed type: 
> 
> char (*p)[flag ? n1 : n2] = flag ? &aa : &bb;
> https://godbolt.org/z/13cTT1vGP
> 
> The problem with this version is that consistency
> is not checked. (I have patch for adding run-time
> checks).
> 
> And then the next step would be to allow
> 
> char (*p)[:] = flag ? &aa : &bb;
> 
> or similar.  Dennis Ritchie proposed this himself
> a long time ago.
> 
> So far this seems straightfoward.
> 
> If we then want to allow such wide pointers as
> function arguments or in structs, we would need
> to define an ABI. But the ABI could just be
> 
> struct { char (*p)[.s]; size_t s; };
> 
> Maybe we could try to make the following
> ABI compatible:
> 
> int foo(int p[s], size_t s);
> int foo(int p[:]);
> 
> 
>> Having __builtin_with_size at allocation would possibly make
>> the BOS use-def walk discover both objects.
> 
> Yes. But I do not think this there is any fundamental
> difference to discovering allocation functions.
> 
>>  I think you can't
>> insert __builtin_with_size at the access to *p, but in practice
>> that would be very much needed.
> 
> Usually the access to *p would follow directly the
> access x.buf, so BDOS should find it.
> 
> But yes, to get full bounds safety, the pointer type 
> has to change to a variably-modified type (which would work
> today) or a fat pointer type.

By variable-modified type, you mean the VLA?

There is one major difference between VLA and (FAM or Pointer array):

For VLA, the compiler is responsible for allocating the memory for it, 
the size assignment and the memory allocation are both done by the
 compiler at the same time and tied together. 

But for FAM and pointer arrays, right now, users allocate the memory for them
In the source code, so, when we add the “counted_by” attribute, we need to
specify the additional requirement for the order of size assignment and memory
allocation into the source code, and specify this requirement in the user 
documentation.

Later, if we try to make the bound information of FAM/pointer array into TYPE 
system, similar as the current VLA, should we also need to move the memory 
allocation 
of the FAM/pointer arrays into compiler (similar as VLA too)? 
> The later can be built on
> vm-types easily because all the FE semantics already
> exists.

Except the memory allocation part…

Do I miss anything here?

Qing
> 
> Martin
> 
>> 
>> Richard.
>> 
>>> But yes,
>>> for full bounds safety we would need the language feature.
>>> In C people should start to variably-modified types
>>> more.  I think we can build perfect bounds safety on top of
>>> them in a very good way with only FE changes.
>>> 
>>> All these attributes are just a best effort.  But for a while,
>>> this will be necessary.
>>> 
>>> Martin
>>> 
>>>> 
>>>>> Evaluating at this point requires that the size is correctly set
>>>>> before the access to the FAM and the user has to make sure
>>>>> this is the case. But to me this requirement would make sense.
>>>>> 
>>>>> Semantically, it could aöso make sense to evaluate the size at a
>>>>> later time.  But then the reordering becomes problematic again.
>>>>> 
>>>>> Also I think this would make this feature generally more useful.
>>>>> For example, it could work also for others pointers in the struct
>>>>> and not just for FAMs.  In this case, the struct may already be
>>>>> freed when  BDOS is called, so it might also not possible to
>>>>> access the size member at a later time.
>>>>> 
>>>>> Martin
>>>>> 
>>>>> 
>>>>>> 
>>>>> 
>>> 
>

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

Reply via email to