Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Qing Zhao
Okay, thanks for the explanation.
We will keep this in mind.

Qing

> On Oct 27, 2023, at 1:19 PM, Kees Cook  wrote:
> 
> On Fri, Oct 27, 2023 at 03:10:22PM +, Qing Zhao wrote:
>> Since  the dynamic array support is quite important to the kernel (is this 
>> true, Kees? ),
>> We might need to include such support into our design in the beginning. 
> 
> tl;dr: We don't need "dynamic array support" in the 1st version of 
> __counted_by
> 
> I'm not sure it's as strong as "quite important", but it is a code
> pattern that exists. The vast majority of FAM usage is run-time fixed,
> in the sense that the allocation matches the usage. Only sometimes do we
> over-allocate and then slowly fill it up like I've shown.
> 
> So really my thoughts on this are to bring light to the usage pattern
> in the hopes that we don't make it an impossible thing to do. And if
> it's a limitation of the initial version of __counted_by, the kernel can
> still use it: it will just need to use __counted_by strictly for
> allocation sizes, not "usage" size:
> 
> struct foo {
>   int allocated;
>   int used;
>   int array[] __counted_by(allocated); // would nice to use "used"
> };
> 
>   struct foo *p;
> 
>   p = alloc(sizeof(*p) + sizeof(*p->array) * max_items);
>   p->allocated = max_items;
>   p->used = 0;
> 
>   while (data_available())
>   p->array[++p->used] = next_datum();
> 
> With this, we'll still catch p->array accesses beyond "allocated",
> but other code in the kernel won't catch "invalid data" accesses for
> p->array beyond "used". (i.e. we still have memory corruption protection,
> just not logic error protection.)
> 
> We can deal with aliasing in the future if we want to expand to catching
> logic errors.
> 
> I should not that we don't get logic error protection from things like
> ARM's Memory Tagging Extension either -- it only tracks allocation size
> (and is very expensive to change as the "used" part of an allocation
> grows), so this isn't an unreasonable condition for __counted_by to
> require as well.
> 
> -- 
> Kees Cook



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Kees Cook
On Fri, Oct 27, 2023 at 03:10:22PM +, Qing Zhao wrote:
> Since  the dynamic array support is quite important to the kernel (is this 
> true, Kees? ),
> We might need to include such support into our design in the beginning. 

tl;dr: We don't need "dynamic array support" in the 1st version of __counted_by

I'm not sure it's as strong as "quite important", but it is a code
pattern that exists. The vast majority of FAM usage is run-time fixed,
in the sense that the allocation matches the usage. Only sometimes do we
over-allocate and then slowly fill it up like I've shown.

So really my thoughts on this are to bring light to the usage pattern
in the hopes that we don't make it an impossible thing to do. And if
it's a limitation of the initial version of __counted_by, the kernel can
still use it: it will just need to use __counted_by strictly for
allocation sizes, not "usage" size:

struct foo {
int allocated;
int used;
int array[] __counted_by(allocated); // would nice to use "used"
};

struct foo *p;

p = alloc(sizeof(*p) + sizeof(*p->array) * max_items);
p->allocated = max_items;
p->used = 0;

while (data_available())
p->array[++p->used] = next_datum();

With this, we'll still catch p->array accesses beyond "allocated",
but other code in the kernel won't catch "invalid data" accesses for
p->array beyond "used". (i.e. we still have memory corruption protection,
just not logic error protection.)

We can deal with aliasing in the future if we want to expand to catching
logic errors.

I should not that we don't get logic error protection from things like
ARM's Memory Tagging Extension either -- it only tracks allocation size
(and is very expensive to change as the "used" part of an allocation
grows), so this isn't an unreasonable condition for __counted_by to
require as well.

-- 
Kees Cook


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Qing Zhao
About where we should insert the new __builtin_with_access_and_size:

> On Oct 26, 2023, at 2:54 PM, Qing Zhao  wrote:
> 
> 
> 
>> On Oct 26, 2023, at 10:05 AM, Richard Biener  
>> wrote:
>> 
>> 
>> 
>>> Am 26.10.2023 um 12:14 schrieb Martin Uecker :
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 11:20 +0200 schrieb Martin Uecker:
> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
>> 
>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>>> 
 Am 25.10.2023 um 12:47 schrieb Martin Uecker :
 
 Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>> On 2023-10-25 04:16, Martin Uecker wrote:
>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>>> 
 Am 24.10.2023 um 22:38 schrieb Martin Uecker :
 
 Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> Hi, Sid,
> 
> Really appreciate for your example and detailed explanation. Very 
> helpful.
> I think that this example is an excellent example to show 
> (almost) all the issues we need to consider.
> 
> I slightly modified this example to make it to be compilable and 
> run-able, as following:
> (but I still cannot make the incorrect reordering or DSE 
> happening, anyway, the potential reordering possibility is there…)
> 
> 1 #include 
> 2 struct A
> 3 {
> 4  size_t size;
> 5  char buf[] __attribute__((counted_by(size)));
> 6 };
> 7
> 8 static size_t
> 9 get_size_from (void *ptr)
> 10 {
> 11  return __builtin_dynamic_object_size (ptr, 1);
> 12 }
> 13
> 14 void
> 15 foo (size_t sz)
> 16 {
> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> sizeof(char));
> 18  obj->size = sz;
> 19  obj->buf[0] = 2;
> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> 21  return;
> 22 }
> 23
> 24 int main ()
> 25 {
> 26  foo (20);
> 27  return 0;
> 28 }
> 
> 
> 
> 
>>> When it’s set I suppose.  Turn
>>> 
>>> X.l = n;
>>> 
>>> Into
>>> 
>>> X.l = __builtin_with_size (x.buf, n);
>> 
>> It would turn
>> 
>> some_variable = (&) x.buf
>> 
>> into
>> 
>> some_variable = __builtin_with_size ( (&) x.buf. x.len)
>> 
>> 
>> So the later access to x.buf and not the initialization
>> of a member of the struct (which is too early).
>> 
> 
> Hmm, so with Qing's example above, are you suggesting the 
> transformation
> be to foo like so:
> 
> 14 void
> 15 foo (size_t sz)
> 16 {
> 16.5  void * _1;
> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> sizeof(char));
> 18  obj->size = sz;
> 19  obj->buf[0] = 2;
> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
> 20  __builtin_printf (“%d\n", get_size_from (_1));
> 21  return;
> 22 }
> 
> If yes then this could indeed work.  I think I got thrown off by the
> reference to __bdos.
 
 Yes. I think it is important not to evaluate the size at the
 access to buf and not the allocation, because the point is to
 recover it from the size member even when the compiler can't
 see the original allocation.
>>> 
>>> But if the access is through a pointer without the attribute visible
>>> even the Frontend cannot recover?
>> 
>> Yes, if the access is using a struct-with-FAM without the attribute
>> the FE would not be insert the builtin.  BDOS could potentially
>> still see the original allocation but if it doesn't, then there is
>> no information.
>> 
>>> We’d need to force type correctness and give up on indirecting
>>> through an int * when it can refer to two diffenent container types.
>>> The best we can do I think is mark allocation sites and hope for
>>> some basic code hygiene (not clobbering size or array pointer
>>> through pointers without the appropriately attributed type)
>> 
>> I am do not fully understand what you are referring to.
> 
> struct A { int n; int data[n]; };
> struct B { long n; int data[n]; };
> 
> int *p = flag ? a->data : b->data;
> 
> access *p;
> 
> Since we need to allow 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Qing Zhao


> On Oct 27, 2023, at 10:53 AM, Martin Uecker  wrote:
> 
> Am Freitag, dem 27.10.2023 um 14:32 + schrieb Qing Zhao:
>> 
>>> On Oct 27, 2023, at 3:21 AM, Martin Uecker  wrote:
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 19:57 + schrieb Qing Zhao:
 I guess that what Kees wanted, ""fill the array without knowing the actual 
 final size" code pattern”, as following:
 
>>  struct foo *f;
>>  char *p;
>>  int i;
>> 
>>  f = alloc(maximum_possible);
>>  f->count = 0;
>>  p = f->buf;
>> 
>>  for (i; data_is_available() && i < maximum_possible; i++) {
>>  f->count ++;
>>  p[i] = next_data_item();
>>  }
 
 actually is a dynamic array, or more accurately, Bounded-size dynamic 
 array: ( but not a dynamic allocated array as we discussed so far)
 
 https://en.wikipedia.org/wiki/Dynamic_array
 
 This dynamic array, also is called growable array, or resizable array, 
 whose size can 
 be changed during the lifetime. 
 
 For VLA or FAM, I believe that they are both dynamic allocated array, i.e, 
 even though the size is not know at the compilation time, but the size
 will be fixed after the array is allocated. 
 
 I am not sure whether C has support to such Dynamic array? Or whether it’s 
 easy to provide dynamic array support in C?
>>> 
>>> It is possible to support dynamic arrays in C even with
>>> good checking, but not safely using the pattern above
>>> where you derive a pointer which you later use independently.
>>> 
>>> While we could track the connection to the original struct,
>>> the necessary synchronization between the counter and the
>>> access to the buffer is difficult.  I do not see how this
>>> could be supported with reasonable effort and cost.
>>> 
>>> 
>>> But with this restriction in mind, we can do a lot in C.
>>> For example, see my experimental (!) container library
>>> which has vector type.
>>> https://github.com/uecker/noplate/blob/main/test.c
>>> You can get an array view for the vector (which then
>>> also can decay to a pointer), so it interoperates nicely
>>> with C but you can get good bounds checking.
>>> 
>>> 
>>> But once you derive a pointer and pass it on, it gets
>>> difficult.  But if you want safety, you just have to 
>>> to simply avoid this in code. 
>> 
>> So, for the following modified code: (without the additional pointer “p”)
>> 
>> struct foo
>> {
>> size_t count;
>> char buf[] __attribute__((counted_by(count)));
>> };
>> 
>> struct foo *f;
>> int i;  
>> 
>> f = alloc(maximum_possible);
>> f->count = 0;
>> 
>> for (i; data_is_available() && i < maximum_possible; i++) {
>>  f->count ++;  
>>  f->buf[i] = next_data_item();
>> }   
>> 
>> The support for dynamic array should be possible? 
> 
> With the design we discussed this should work because
> __builtin_with_access (or whatever) it reads:
> 
> f = alloc(maximum_possible);
> f->count = 0;
> 
> for (i; data_is_available() && i < maximum_possible; i++) {
>  f->count ++;  
>  __builtin_with_access(f->buf, f->count)[i] = next_data_item();
> }   
> 

Yes, with the data flow, f->count should get the latest value of f->count. 
>> 
>> 
>>> 
>>> What we could potentially do is add restrictions so 
>>> that the access to buf always has to go via x->buf 
>>> or you get at least a warning.
>> 
>> Are the following two restrictions to the user enough:
>> 
>> 1. The access to buf should always go via x->buf, 
>>no assignment to another independent pointer 
>>and access buf through this new pointer.
> 
> Yes, maybe. One could also try to be smarter.
> 
> For example, one warn only when >buf is
> assigned to another pointer and one of the
> following conditions is fulfilled:
> 
> - the pointer escapes from the local context 
> 
> - there is a store to f->counter in the
> local context that does not dominate >buf.
> 
> Then Kees' example would work too in most cases.

I guess that we might need to come up with the list of concrete restrictions to 
the user, 
and list these restrictions in the user documentation.

Since  the dynamic array support is quite important to the kernel (is this 
true, Kees? ),
We might need to include such support into our design in the beginning. 

> 
> But I would probably wait until we have some
> initial experience with this feature.

You mean after we have an initial implementation of the “builtin_with_size”?
Yes, at this moment, I think that the “builtin_with_size” approach is the best 
one.
Just some details need more thinking before the real implementation.  -:)

Qing
> 
> Martin
> 
>> 2.  User need to keep the synchronization between
>>  the counter and the access to the buffer all the time.



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Martin Uecker
Am Freitag, dem 27.10.2023 um 14:32 + schrieb Qing Zhao:
> 
> > On Oct 27, 2023, at 3:21 AM, Martin Uecker  wrote:
> > 
> > Am Donnerstag, dem 26.10.2023 um 19:57 + schrieb Qing Zhao:
> > > I guess that what Kees wanted, ""fill the array without knowing the 
> > > actual final size" code pattern”, as following:
> > > 
> > > > >   struct foo *f;
> > > > >   char *p;
> > > > >   int i;
> > > > > 
> > > > >   f = alloc(maximum_possible);
> > > > >   f->count = 0;
> > > > >   p = f->buf;
> > > > > 
> > > > >   for (i; data_is_available() && i < maximum_possible; i++) {
> > > > >   f->count ++;
> > > > >   p[i] = next_data_item();
> > > > >   }
> > > 
> > > actually is a dynamic array, or more accurately, Bounded-size dynamic 
> > > array: ( but not a dynamic allocated array as we discussed so far)
> > > 
> > > https://en.wikipedia.org/wiki/Dynamic_array
> > > 
> > > This dynamic array, also is called growable array, or resizable array, 
> > > whose size can 
> > > be changed during the lifetime. 
> > > 
> > > For VLA or FAM, I believe that they are both dynamic allocated array, 
> > > i.e, even though the size is not know at the compilation time, but the 
> > > size
> > > will be fixed after the array is allocated. 
> > > 
> > > I am not sure whether C has support to such Dynamic array? Or whether 
> > > it’s easy to provide dynamic array support in C?
> > 
> > It is possible to support dynamic arrays in C even with
> > good checking, but not safely using the pattern above
> > where you derive a pointer which you later use independently.
> > 
> > While we could track the connection to the original struct,
> > the necessary synchronization between the counter and the
> > access to the buffer is difficult.  I do not see how this
> > could be supported with reasonable effort and cost.
> > 
> > 
> > But with this restriction in mind, we can do a lot in C.
> > For example, see my experimental (!) container library
> > which has vector type.
> > https://github.com/uecker/noplate/blob/main/test.c
> > You can get an array view for the vector (which then
> > also can decay to a pointer), so it interoperates nicely
> > with C but you can get good bounds checking.
> > 
> > 
> > But once you derive a pointer and pass it on, it gets
> > difficult.  But if you want safety, you just have to 
> > to simply avoid this in code. 
> 
> So, for the following modified code: (without the additional pointer “p”)
> 
> struct foo
> {
>  size_t count;
>  char buf[] __attribute__((counted_by(count)));
> };
> 
> struct foo *f;
> int i;  
> 
> f = alloc(maximum_possible);
> f->count = 0;
> 
> for (i; data_is_available() && i < maximum_possible; i++) {
>   f->count ++;  
>   f->buf[i] = next_data_item();
> }   
> 
> The support for dynamic array should be possible? 

With the design we discussed this should work because
__builtin_with_access (or whatever) it reads:

f = alloc(maximum_possible);
f->count = 0;

for (i; data_is_available() && i < maximum_possible; i++) {
  f->count ++;  
  __builtin_with_access(f->buf, f->count)[i] = next_data_item();
}   

> 
> 
> > 
> > What we could potentially do is add restrictions so 
> > that the access to buf always has to go via x->buf 
> > or you get at least a warning.
> 
> Are the following two restrictions to the user enough:
> 
> 1. The access to buf should always go via x->buf, 
> no assignment to another independent pointer 
> and access buf through this new pointer.

Yes, maybe. One could also try to be smarter.

For example, one warn only when >buf is
assigned to another pointer and one of the
following conditions is fulfilled:

- the pointer escapes from the local context 

- there is a store to f->counter in the
local context that does not dominate >buf.

Then Kees' example would work too in most cases.

But I would probably wait until we have some
initial experience with this feature.

Martin

> 2.  User need to keep the synchronization between
>   the counter and the access to the buffer all the time.



> 
> 


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Qing Zhao


> On Oct 27, 2023, at 3:21 AM, Martin Uecker  wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 19:57 + schrieb Qing Zhao:
>> I guess that what Kees wanted, ""fill the array without knowing the actual 
>> final size" code pattern”, as following:
>> 
struct foo *f;
char *p;
int i;
 
f = alloc(maximum_possible);
f->count = 0;
p = f->buf;
 
for (i; data_is_available() && i < maximum_possible; i++) {
f->count ++;
p[i] = next_data_item();
}
>> 
>> actually is a dynamic array, or more accurately, Bounded-size dynamic array: 
>> ( but not a dynamic allocated array as we discussed so far)
>> 
>> https://en.wikipedia.org/wiki/Dynamic_array
>> 
>> This dynamic array, also is called growable array, or resizable array, whose 
>> size can 
>> be changed during the lifetime. 
>> 
>> For VLA or FAM, I believe that they are both dynamic allocated array, i.e, 
>> even though the size is not know at the compilation time, but the size
>> will be fixed after the array is allocated. 
>> 
>> I am not sure whether C has support to such Dynamic array? Or whether it’s 
>> easy to provide dynamic array support in C?
> 
> It is possible to support dynamic arrays in C even with
> good checking, but not safely using the pattern above
> where you derive a pointer which you later use independently.
> 
> While we could track the connection to the original struct,
> the necessary synchronization between the counter and the
> access to the buffer is difficult.  I do not see how this
> could be supported with reasonable effort and cost.
> 
> 
> But with this restriction in mind, we can do a lot in C.
> For example, see my experimental (!) container library
> which has vector type.
> https://github.com/uecker/noplate/blob/main/test.c
> You can get an array view for the vector (which then
> also can decay to a pointer), so it interoperates nicely
> with C but you can get good bounds checking.
> 
> 
> But once you derive a pointer and pass it on, it gets
> difficult.  But if you want safety, you just have to 
> to simply avoid this in code. 

So, for the following modified code: (without the additional pointer “p”)

struct foo
{
 size_t count;
 char buf[] __attribute__((counted_by(count)));
};

struct foo *f;
int i;  

f = alloc(maximum_possible);
f->count = 0;

for (i; data_is_available() && i < maximum_possible; i++) {
  f->count ++;  
  f->buf[i] = next_data_item();
}   

The support for dynamic array should be possible? 


> 
> What we could potentially do is add restrictions so 
> that the access to buf always has to go via x->buf 
> or you get at least a warning.

Are the following two restrictions to the user enough:

1. The access to buf should always go via x->buf, 
no assignment to another independent pointer 
and access buf through this new pointer.
2.  User need to keep the synchronization between
  the counter and the access to the buffer all the time.


Qing
> 
> Martin
> 
> 
> 
> 
>> 
>> Qing
>> 
>> 
>>> On Oct 26, 2023, at 12:45 PM, Martin Uecker  wrote:
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 09:13 -0700 schrieb Kees Cook:
 On Thu, Oct 26, 2023 at 10:15:10AM +0200, Martin Uecker wrote:
> but not this:
> 
>>> 
>>> x->count = 11;
> char *p = >buf;
> x->count = 1;
> p[10] = 1; // !
 
 This seems fine to me -- it's how I'd expect it to work: "10" is beyond
 "1".
>>> 
>>> Note that the store would be allowed.
>>> 
 
> (because the pointer is passed around the
> store to the counter)
> 
> and also here the second store is then irrelevant
> for the access:
> 
> x->count = 10;
> char* p = >buf;
> ...
> x->count = 1; // somewhere else
> 
> p[9] = 1; // ok, because count matter when buf was accesssed.
 
 This is less great, but I can understand why it happens. "p" loses the
 association with "x". It'd be nice if "p" had to way to retain that it
 was just an alias for x->buf, so future p access would check count.
>>> 
>>> The problem is not to discover that p is an alias to x->buf, 
>>> but that it seems difficult to make sure that stores to 
>>> x->count are not reordered relative to the final access to
>>> p[i] you want to check, so that you then get the right value.
>>> 
 
 But this appears to be an existing limitation in other areas where an
 assignment will cause the loss of object association. (I've run into
 this before.) It's just more surprising in the above example because in
 the past the loss of association would cause __bdos() to revert back to
 "SIZE_MAX" results ("I don't know the size") rather than an "outdated"
 size, which may get us into unexpected places...
 
> IMHO this makes sense also from the user side and
> are the desirable semantics we discussed before.
> 
> But can you take a look at this?
> 
> 
> This should simulate it 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-27 Thread Martin Uecker
Am Donnerstag, dem 26.10.2023 um 19:57 + schrieb Qing Zhao:
> I guess that what Kees wanted, ""fill the array without knowing the actual 
> final size" code pattern”, as following:
> 
> > >   struct foo *f;
> > >   char *p;
> > >   int i;
> > > 
> > >   f = alloc(maximum_possible);
> > >   f->count = 0;
> > >   p = f->buf;
> > > 
> > >   for (i; data_is_available() && i < maximum_possible; i++) {
> > >   f->count ++;
> > >   p[i] = next_data_item();
> > >   }
> 
> actually is a dynamic array, or more accurately, Bounded-size dynamic array: 
> ( but not a dynamic allocated array as we discussed so far)
> 
> https://en.wikipedia.org/wiki/Dynamic_array
> 
> This dynamic array, also is called growable array, or resizable array, whose 
> size can 
> be changed during the lifetime. 
> 
> For VLA or FAM, I believe that they are both dynamic allocated array, i.e, 
> even though the size is not know at the compilation time, but the size
> will be fixed after the array is allocated. 
> 
> I am not sure whether C has support to such Dynamic array? Or whether it’s 
> easy to provide dynamic array support in C?

It is possible to support dynamic arrays in C even with
good checking, but not safely using the pattern above
where you derive a pointer which you later use independently.

While we could track the connection to the original struct,
the necessary synchronization between the counter and the
access to the buffer is difficult.  I do not see how this
could be supported with reasonable effort and cost.
 

But with this restriction in mind, we can do a lot in C.
For example, see my experimental (!) container library
which has vector type.
https://github.com/uecker/noplate/blob/main/test.c
You can get an array view for the vector (which then
also can decay to a pointer), so it interoperates nicely
with C but you can get good bounds checking.


But once you derive a pointer and pass it on, it gets
difficult.  But if you want safety, you just have to 
to simply avoid this in code. 

What we could potentially do is add restrictions so 
that the access to buf always has to go via x->buf 
or you get at least a warning.

Martin




> 
> Qing
> 
> 
> > On Oct 26, 2023, at 12:45 PM, Martin Uecker  wrote:
> > 
> > Am Donnerstag, dem 26.10.2023 um 09:13 -0700 schrieb Kees Cook:
> > > On Thu, Oct 26, 2023 at 10:15:10AM +0200, Martin Uecker wrote:
> > > > but not this:
> > > > 
> > 
> > x->count = 11;
> > > > char *p = >buf;
> > > > x->count = 1;
> > > > p[10] = 1; // !
> > > 
> > > This seems fine to me -- it's how I'd expect it to work: "10" is beyond
> > > "1".
> > 
> > Note that the store would be allowed.
> > 
> > > 
> > > > (because the pointer is passed around the
> > > > store to the counter)
> > > > 
> > > > and also here the second store is then irrelevant
> > > > for the access:
> > > > 
> > > > x->count = 10;
> > > > char* p = >buf;
> > > > ...
> > > > x->count = 1; // somewhere else
> > > > 
> > > > p[9] = 1; // ok, because count matter when buf was accesssed.
> > > 
> > > This is less great, but I can understand why it happens. "p" loses the
> > > association with "x". It'd be nice if "p" had to way to retain that it
> > > was just an alias for x->buf, so future p access would check count.
> > 
> > The problem is not to discover that p is an alias to x->buf, 
> > but that it seems difficult to make sure that stores to 
> > x->count are not reordered relative to the final access to
> > p[i] you want to check, so that you then get the right value.
> > 
> > > 
> > > But this appears to be an existing limitation in other areas where an
> > > assignment will cause the loss of object association. (I've run into
> > > this before.) It's just more surprising in the above example because in
> > > the past the loss of association would cause __bdos() to revert back to
> > > "SIZE_MAX" results ("I don't know the size") rather than an "outdated"
> > > size, which may get us into unexpected places...
> > > 
> > > > IMHO this makes sense also from the user side and
> > > > are the desirable semantics we discussed before.
> > > > 
> > > > But can you take a look at this?
> > > > 
> > > > 
> > > > This should simulate it fairly well:
> > > > https://godbolt.org/z/xq89aM7Gr
> > > > 
> > > > (the call to the noinline function would go away,
> > > > but not necessarily its impact on optimization)
> > > 
> > > Yeah, this example should be a very rare situation: a leaf function is
> > > changing the characteristics of the struct but returning a buffer within
> > > it to the caller. The more likely glitch would be from:
> > > 
> > > int main()
> > > {
> > >   struct foo *f = foo_alloc(7);
> > >   char *p = FAM_ACCESS(f, size, buf);
> > > 
> > >   printf("%ld\n", __builtin_dynamic_object_size(p, 0));
> > >   test1(f); // or just "f->count = 10;" no function call needed
> > >   printf("%ld\n", __builtin_dynamic_object_size(p, 0));
> > > 
> > >   return 0;
> > > }
> > > 
> > > which reports:
> > > 7
> > > 7
> > > 
> 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao
I guess that what Kees wanted, ""fill the array without knowing the actual 
final size" code pattern”, as following:

>>  struct foo *f;
>>  char *p;
>>  int i;
>> 
>>  f = alloc(maximum_possible);
>>  f->count = 0;
>>  p = f->buf;
>> 
>>  for (i; data_is_available() && i < maximum_possible; i++) {
>>  f->count ++;
>>  p[i] = next_data_item();
>>  }

actually is a dynamic array, or more accurately, Bounded-size dynamic array: ( 
but not a dynamic allocated array as we discussed so far)

https://en.wikipedia.org/wiki/Dynamic_array

This dynamic array, also is called growable array, or resizable array, whose 
size can 
be changed during the lifetime. 

For VLA or FAM, I believe that they are both dynamic allocated array, i.e, even 
though the size is not know at the compilation time, but the size
will be fixed after the array is allocated. 

I am not sure whether C has support to such Dynamic array? Or whether it’s easy 
to provide dynamic array support in C?

Qing


> On Oct 26, 2023, at 12:45 PM, Martin Uecker  wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 09:13 -0700 schrieb Kees Cook:
>> On Thu, Oct 26, 2023 at 10:15:10AM +0200, Martin Uecker wrote:
>>> but not this:
>>> 
> 
> x->count = 11;
>>> char *p = >buf;
>>> x->count = 1;
>>> p[10] = 1; // !
>> 
>> This seems fine to me -- it's how I'd expect it to work: "10" is beyond
>> "1".
> 
> Note that the store would be allowed.
> 
>> 
>>> (because the pointer is passed around the
>>> store to the counter)
>>> 
>>> and also here the second store is then irrelevant
>>> for the access:
>>> 
>>> x->count = 10;
>>> char* p = >buf;
>>> ...
>>> x->count = 1; // somewhere else
>>> 
>>> p[9] = 1; // ok, because count matter when buf was accesssed.
>> 
>> This is less great, but I can understand why it happens. "p" loses the
>> association with "x". It'd be nice if "p" had to way to retain that it
>> was just an alias for x->buf, so future p access would check count.
> 
> The problem is not to discover that p is an alias to x->buf, 
> but that it seems difficult to make sure that stores to 
> x->count are not reordered relative to the final access to
> p[i] you want to check, so that you then get the right value.
> 
>> 
>> But this appears to be an existing limitation in other areas where an
>> assignment will cause the loss of object association. (I've run into
>> this before.) It's just more surprising in the above example because in
>> the past the loss of association would cause __bdos() to revert back to
>> "SIZE_MAX" results ("I don't know the size") rather than an "outdated"
>> size, which may get us into unexpected places...
>> 
>>> IMHO this makes sense also from the user side and
>>> are the desirable semantics we discussed before.
>>> 
>>> But can you take a look at this?
>>> 
>>> 
>>> This should simulate it fairly well:
>>> https://godbolt.org/z/xq89aM7Gr
>>> 
>>> (the call to the noinline function would go away,
>>> but not necessarily its impact on optimization)
>> 
>> Yeah, this example should be a very rare situation: a leaf function is
>> changing the characteristics of the struct but returning a buffer within
>> it to the caller. The more likely glitch would be from:
>> 
>> int main()
>> {
>>  struct foo *f = foo_alloc(7);
>>  char *p = FAM_ACCESS(f, size, buf);
>> 
>>  printf("%ld\n", __builtin_dynamic_object_size(p, 0));
>>  test1(f); // or just "f->count = 10;" no function call needed
>>  printf("%ld\n", __builtin_dynamic_object_size(p, 0));
>> 
>>  return 0;
>> }
>> 
>> which reports:
>> 7
>> 7
>> 
>> instead of:
>> 7
>> 10
>> 
>> This kind of "get an alias" situation is pretty common in the kernel
>> as a way to have a convenient "handle" to the array. In the case of a
>> "fill the array without knowing the actual final size" code pattern,
>> things would immediately break:
>> 
>>  struct foo *f;
>>  char *p;
>>  int i;
>> 
>>  f = alloc(maximum_possible);
>>  f->count = 0;
>>  p = f->buf;
>> 
>>  for (i; data_is_available() && i < maximum_possible; i++) {
>>  f->count ++;
>>  p[i] = next_data_item();
>>  }
>> 
>> Now perhaps the problem here is that "count" cannot be used for a count
>> of "logically valid members in the array" but must always be a count of
>> "allocated member space in the array", which I guess is tolerable, but
>> isn't ideal -- I'd like to catch logic bugs in addition to allocation
>> bugs, but the latter is certainly much more important to catch.
> 
> Maybe we could have a warning when f->buf is not directly
> accessed.
> 
> Martin
> 
>> 
> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 1:05 PM, Martin Uecker  wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 16:41 + schrieb Qing Zhao:
>> 
>>> On Oct 26, 2023, at 5:20 AM, Martin Uecker  wrote:
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
 On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
> 
> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>> 
>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
> On 2023-10-25 04:16, Martin Uecker wrote:
> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>> 
>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>> 
>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
 Hi, Sid,
 
 Really appreciate for your example and detailed explanation. Very 
 helpful.
 I think that this example is an excellent example to show (almost) 
 all the issues we need to consider.
 
 I slightly modified this example to make it to be compilable and 
 run-able, as following:
 (but I still cannot make the incorrect reordering or DSE 
 happening, anyway, the potential reordering possibility is there…)
 
 1 #include 
 2 struct A
 3 {
 4  size_t size;
 5  char buf[] __attribute__((counted_by(size)));
 6 };
 7
 8 static size_t
 9 get_size_from (void *ptr)
 10 {
 11  return __builtin_dynamic_object_size (ptr, 1);
 12 }
 13
 14 void
 15 foo (size_t sz)
 16 {
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
 sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
 21  return;
 22 }
 23
 24 int main ()
 25 {
 26  foo (20);
 27  return 0;
 28 }
 
 
 
 
>> When it’s set I suppose.  Turn
>> 
>> X.l = n;
>> 
>> Into
>> 
>> X.l = __builtin_with_size (x.buf, n);
> 
> It would turn
> 
> some_variable = (&) x.buf
> 
> into
> 
> some_variable = __builtin_with_size ( (&) x.buf. x.len)
> 
> 
> So the later access to x.buf and not the initialization
> of a member of the struct (which is too early).
> 
 
 Hmm, so with Qing's example above, are you suggesting the 
 transformation
 be to foo like so:
 
 14 void
 15 foo (size_t sz)
 16 {
 16.5  void * _1;
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
 sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
 20  __builtin_printf (“%d\n", get_size_from (_1));
 21  return;
 22 }
 
 If yes then this could indeed work.  I think I got thrown off by the
 reference to __bdos.
>>> 
>>> Yes. I think it is important not to evaluate the size at the
>>> access to buf and not the allocation, because the point is to
>>> recover it from the size member even when the compiler can't
>>> see the original allocation.
>> 
>> But if the access is through a pointer without the attribute visible
>> even the Frontend cannot recover?
> 
> Yes, if the access is using a struct-with-FAM without the attribute
> the FE would not be insert the builtin.  BDOS could potentially
> still see the original allocation but if it doesn't, then there is
> no information.
> 
>> We’d need to force type correctness and give up on indirecting
>> through an int * when it can refer to two diffenent container types.
>> The best we can do I think is mark allocation sites and hope for
>> some basic code hygiene (not clobbering size or array pointer
>> through pointers without the appropriately attributed type)
> 
> I am do not fully understand what you are referring to.
 
 struct A { int n; int data[n]; };
 struct B { long n; int data[n]; };
 
 int *p = flag ? a->data : b->data;
 
 access *p;
 
 Since we need to allow interoperability of pointers (a->data is
 convertible to a non-fat pointer of type int *) this leaves us with
 ambiguity we need to conservatively handle to avoid false positives.
>>> 
>>> For BDOS, I would expect this to work exactly like:
>>> 
>>> char aa[n1];
>>> char 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 10:05 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 26.10.2023 um 12:14 schrieb Martin Uecker :
>> 
>> Am Donnerstag, dem 26.10.2023 um 11:20 +0200 schrieb Martin Uecker:
 Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
 On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
> 
> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>> 
>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
> On 2023-10-25 04:16, Martin Uecker wrote:
> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>> 
>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>> 
>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
 Hi, Sid,
 
 Really appreciate for your example and detailed explanation. Very 
 helpful.
 I think that this example is an excellent example to show (almost) 
 all the issues we need to consider.
 
 I slightly modified this example to make it to be compilable and 
 run-able, as following:
 (but I still cannot make the incorrect reordering or DSE 
 happening, anyway, the potential reordering possibility is there…)
 
 1 #include 
 2 struct A
 3 {
 4  size_t size;
 5  char buf[] __attribute__((counted_by(size)));
 6 };
 7
 8 static size_t
 9 get_size_from (void *ptr)
 10 {
 11  return __builtin_dynamic_object_size (ptr, 1);
 12 }
 13
 14 void
 15 foo (size_t sz)
 16 {
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
 sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
 21  return;
 22 }
 23
 24 int main ()
 25 {
 26  foo (20);
 27  return 0;
 28 }
 
 
 
 
>> When it’s set I suppose.  Turn
>> 
>> X.l = n;
>> 
>> Into
>> 
>> X.l = __builtin_with_size (x.buf, n);
> 
> It would turn
> 
> some_variable = (&) x.buf
> 
> into
> 
> some_variable = __builtin_with_size ( (&) x.buf. x.len)
> 
> 
> So the later access to x.buf and not the initialization
> of a member of the struct (which is too early).
> 
 
 Hmm, so with Qing's example above, are you suggesting the 
 transformation
 be to foo like so:
 
 14 void
 15 foo (size_t sz)
 16 {
 16.5  void * _1;
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
 sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
 20  __builtin_printf (“%d\n", get_size_from (_1));
 21  return;
 22 }
 
 If yes then this could indeed work.  I think I got thrown off by the
 reference to __bdos.
>>> 
>>> Yes. I think it is important not to evaluate the size at the
>>> access to buf and not the allocation, because the point is to
>>> recover it from the size member even when the compiler can't
>>> see the original allocation.
>> 
>> But if the access is through a pointer without the attribute visible
>> even the Frontend cannot recover?
> 
> Yes, if the access is using a struct-with-FAM without the attribute
> the FE would not be insert the builtin.  BDOS could potentially
> still see the original allocation but if it doesn't, then there is
> no information.
> 
>> We’d need to force type correctness and give up on indirecting
>> through an int * when it can refer to two diffenent container types.
>> The best we can do I think is mark allocation sites and hope for
>> some basic code hygiene (not clobbering size or array pointer
>> through pointers without the appropriately attributed type)
> 
> I am do not fully understand what you are referring to.
 
 struct A { int n; int data[n]; };
 struct B { long n; int data[n]; };
 
 int *p = flag ? a->data : b->data;
 
 access *p;
 
 Since we need to allow interoperability of pointers (a->data is
 convertible to a non-fat pointer of type int *) this leaves us with
 ambiguity we need to conservatively handle to avoid false positives.
>>> 
>>> For BDOS, I would expect this to work exactly like:
>>> 
>>> char aa[n1];
>>> 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Richard Biener



> Am 26.10.2023 um 19:05 schrieb Martin Uecker :
> 
> Am Donnerstag, dem 26.10.2023 um 16:41 + schrieb Qing Zhao:
>> 
 On Oct 26, 2023, at 5:20 AM, Martin Uecker  wrote:
>>> 
>>> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
 On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
> 
> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
>> 
>>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
> On 2023-10-25 04:16, Martin Uecker wrote:
> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>> 
>>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>> 
>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
 Hi, Sid,
 
 Really appreciate for your example and detailed explanation. Very 
 helpful.
 I think that this example is an excellent example to show (almost) 
 all the issues we need to consider.
 
 I slightly modified this example to make it to be compilable and 
 run-able, as following:
 (but I still cannot make the incorrect reordering or DSE 
 happening, anyway, the potential reordering possibility is there…)
 
 1 #include 
 2 struct A
 3 {
 4  size_t size;
 5  char buf[] __attribute__((counted_by(size)));
 6 };
 7
 8 static size_t
 9 get_size_from (void *ptr)
 10 {
 11  return __builtin_dynamic_object_size (ptr, 1);
 12 }
 13
 14 void
 15 foo (size_t sz)
 16 {
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
 sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
 21  return;
 22 }
 23
 24 int main ()
 25 {
 26  foo (20);
 27  return 0;
 28 }
 
 
 
 
>> When it’s set I suppose.  Turn
>> 
>> X.l = n;
>> 
>> Into
>> 
>> X.l = __builtin_with_size (x.buf, n);
> 
> It would turn
> 
> some_variable = (&) x.buf
> 
> into
> 
> some_variable = __builtin_with_size ( (&) x.buf. x.len)
> 
> 
> So the later access to x.buf and not the initialization
> of a member of the struct (which is too early).
> 
 
 Hmm, so with Qing's example above, are you suggesting the 
 transformation
 be to foo like so:
 
 14 void
 15 foo (size_t sz)
 16 {
 16.5  void * _1;
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
 sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
 20  __builtin_printf (“%d\n", get_size_from (_1));
 21  return;
 22 }
 
 If yes then this could indeed work.  I think I got thrown off by the
 reference to __bdos.
>>> 
>>> Yes. I think it is important not to evaluate the size at the
>>> access to buf and not the allocation, because the point is to
>>> recover it from the size member even when the compiler can't
>>> see the original allocation.
>> 
>> But if the access is through a pointer without the attribute visible
>> even the Frontend cannot recover?
> 
> Yes, if the access is using a struct-with-FAM without the attribute
> the FE would not be insert the builtin.  BDOS could potentially
> still see the original allocation but if it doesn't, then there is
> no information.
> 
>> We’d need to force type correctness and give up on indirecting
>> through an int * when it can refer to two diffenent container types.
>> The best we can do I think is mark allocation sites and hope for
>> some basic code hygiene (not clobbering size or array pointer
>> through pointers without the appropriately attributed type)
> 
> I am do not fully understand what you are referring to.
 
 struct A { int n; int data[n]; };
 struct B { long n; int data[n]; };
 
 int *p = flag ? a->data : b->data;
 
 access *p;
 
 Since we need to allow interoperability of pointers (a->data is
 convertible to a non-fat pointer of type int *) this leaves us with
 ambiguity we need to conservatively handle to avoid false positives.
>>> 
>>> For BDOS, I would expect this to work exactly like:
>>> 
>>> char aa[n1];
>>> char 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Martin Uecker
Am Donnerstag, dem 26.10.2023 um 16:41 + schrieb Qing Zhao:
> 
> > On Oct 26, 2023, at 5:20 AM, Martin Uecker  wrote:
> > 
> > Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
> > > On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
> > > > 
> > > > Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
> > > > > 
> > > > > > Am 25.10.2023 um 12:47 schrieb Martin Uecker :
> > > > > > 
> > > > > > Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh 
> > > > > > Poyarekar:
> > > > > > > > On 2023-10-25 04:16, Martin Uecker wrote:
> > > > > > > > Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard 
> > > > > > > > Biener:
> > > > > > > > > 
> > > > > > > > > > Am 24.10.2023 um 22:38 schrieb Martin Uecker 
> > > > > > > > > > :
> > > > > > > > > > 
> > > > > > > > > > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing 
> > > > > > > > > > Zhao:
> > > > > > > > > > > Hi, Sid,
> > > > > > > > > > > 
> > > > > > > > > > > Really appreciate for your example and detailed 
> > > > > > > > > > > explanation. Very helpful.
> > > > > > > > > > > I think that this example is an excellent example to show 
> > > > > > > > > > > (almost) all the issues we need to consider.
> > > > > > > > > > > 
> > > > > > > > > > > I slightly modified this example to make it to be 
> > > > > > > > > > > compilable and run-able, as following:
> > > > > > > > > > > (but I still cannot make the incorrect reordering or DSE 
> > > > > > > > > > > happening, anyway, the potential reordering possibility 
> > > > > > > > > > > is there…)
> > > > > > > > > > > 
> > > > > > > > > > > 1 #include 
> > > > > > > > > > > 2 struct A
> > > > > > > > > > > 3 {
> > > > > > > > > > > 4  size_t size;
> > > > > > > > > > > 5  char buf[] __attribute__((counted_by(size)));
> > > > > > > > > > > 6 };
> > > > > > > > > > > 7
> > > > > > > > > > > 8 static size_t
> > > > > > > > > > > 9 get_size_from (void *ptr)
> > > > > > > > > > > 10 {
> > > > > > > > > > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > > > > > > > > > 12 }
> > > > > > > > > > > 13
> > > > > > > > > > > 14 void
> > > > > > > > > > > 15 foo (size_t sz)
> > > > > > > > > > > 16 {
> > > > > > > > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + 
> > > > > > > > > > > sz * sizeof(char));
> > > > > > > > > > > 18  obj->size = sz;
> > > > > > > > > > > 19  obj->buf[0] = 2;
> > > > > > > > > > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > > > > > > > > > 21  return;
> > > > > > > > > > > 22 }
> > > > > > > > > > > 23
> > > > > > > > > > > 24 int main ()
> > > > > > > > > > > 25 {
> > > > > > > > > > > 26  foo (20);
> > > > > > > > > > > 27  return 0;
> > > > > > > > > > > 28 }
> > > > > > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > > > When it’s set I suppose.  Turn
> > > > > > > > > 
> > > > > > > > > X.l = n;
> > > > > > > > > 
> > > > > > > > > Into
> > > > > > > > > 
> > > > > > > > > X.l = __builtin_with_size (x.buf, n);
> > > > > > > > 
> > > > > > > > It would turn
> > > > > > > > 
> > > > > > > > some_variable = (&) x.buf
> > > > > > > > 
> > > > > > > > into
> > > > > > > > 
> > > > > > > > some_variable = __builtin_with_size ( (&) x.buf. x.len)
> > > > > > > > 
> > > > > > > > 
> > > > > > > > So the later access to x.buf and not the initialization
> > > > > > > > of a member of the struct (which is too early).
> > > > > > > > 
> > > > > > > 
> > > > > > > Hmm, so with Qing's example above, are you suggesting the 
> > > > > > > transformation
> > > > > > > be to foo like so:
> > > > > > > 
> > > > > > > 14 void
> > > > > > > 15 foo (size_t sz)
> > > > > > > 16 {
> > > > > > > 16.5  void * _1;
> > > > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > > > sizeof(char));
> > > > > > > 18  obj->size = sz;
> > > > > > > 19  obj->buf[0] = 2;
> > > > > > > 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
> > > > > > > 20  __builtin_printf (“%d\n", get_size_from (_1));
> > > > > > > 21  return;
> > > > > > > 22 }
> > > > > > > 
> > > > > > > If yes then this could indeed work.  I think I got thrown off by 
> > > > > > > the
> > > > > > > reference to __bdos.
> > > > > > 
> > > > > > Yes. I think it is important not to evaluate the size at the
> > > > > > access to buf and not the allocation, because the point is to
> > > > > > recover it from the size member even when the compiler can't
> > > > > > see the original allocation.
> > > > > 
> > > > > But if the access is through a pointer without the attribute visible
> > > > > even the Frontend cannot recover?
> > > > 
> > > > Yes, if the access is using a struct-with-FAM without the attribute
> > > > the FE would not be insert the builtin.  BDOS could potentially
> > > > still see the original allocation but if it doesn't, then there is
> > > > no information.
> > > > 
> > > > > We’d need to force type correctness and give up on indirecting
> > > > > through an int * when it can refer to 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Martin Uecker
Am Donnerstag, dem 26.10.2023 um 09:13 -0700 schrieb Kees Cook:
> On Thu, Oct 26, 2023 at 10:15:10AM +0200, Martin Uecker wrote:
> > but not this:
> > 

x->count = 11;
> > char *p = >buf;
> > x->count = 1;
> > p[10] = 1; // !
> 
> This seems fine to me -- it's how I'd expect it to work: "10" is beyond
> "1".

Note that the store would be allowed.

> 
> > (because the pointer is passed around the
> > store to the counter)
> > 
> > and also here the second store is then irrelevant
> > for the access:
> > 
> > x->count = 10;
> > char* p = >buf;
> > ...
> > x->count = 1; // somewhere else
> > 
> > p[9] = 1; // ok, because count matter when buf was accesssed.
> 
> This is less great, but I can understand why it happens. "p" loses the
> association with "x". It'd be nice if "p" had to way to retain that it
> was just an alias for x->buf, so future p access would check count.

The problem is not to discover that p is an alias to x->buf, 
but that it seems difficult to make sure that stores to 
x->count are not reordered relative to the final access to
p[i] you want to check, so that you then get the right value.

> 
> But this appears to be an existing limitation in other areas where an
> assignment will cause the loss of object association. (I've run into
> this before.) It's just more surprising in the above example because in
> the past the loss of association would cause __bdos() to revert back to
> "SIZE_MAX" results ("I don't know the size") rather than an "outdated"
> size, which may get us into unexpected places...
> 
> > IMHO this makes sense also from the user side and
> > are the desirable semantics we discussed before.
> > 
> > But can you take a look at this?
> > 
> > 
> > This should simulate it fairly well:
> > https://godbolt.org/z/xq89aM7Gr
> > 
> > (the call to the noinline function would go away,
> > but not necessarily its impact on optimization)
> 
> Yeah, this example should be a very rare situation: a leaf function is
> changing the characteristics of the struct but returning a buffer within
> it to the caller. The more likely glitch would be from:
> 
> int main()
> {
>   struct foo *f = foo_alloc(7);
>   char *p = FAM_ACCESS(f, size, buf);
> 
>   printf("%ld\n", __builtin_dynamic_object_size(p, 0));
>   test1(f); // or just "f->count = 10;" no function call needed
>   printf("%ld\n", __builtin_dynamic_object_size(p, 0));
> 
>   return 0;
> }
> 
> which reports:
> 7
> 7
> 
> instead of:
> 7
> 10
> 
> This kind of "get an alias" situation is pretty common in the kernel
> as a way to have a convenient "handle" to the array. In the case of a
> "fill the array without knowing the actual final size" code pattern,
> things would immediately break:
> 
>   struct foo *f;
>   char *p;
>   int i;
> 
>   f = alloc(maximum_possible);
>   f->count = 0;
>   p = f->buf;
> 
>   for (i; data_is_available() && i < maximum_possible; i++) {
>   f->count ++;
>   p[i] = next_data_item();
>   }
> 
> Now perhaps the problem here is that "count" cannot be used for a count
> of "logically valid members in the array" but must always be a count of
> "allocated member space in the array", which I guess is tolerable, but
> isn't ideal -- I'd like to catch logic bugs in addition to allocation
> bugs, but the latter is certainly much more important to catch.

Maybe we could have a warning when f->buf is not directly
accessed.

Martin

> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 5:20 AM, Martin Uecker  wrote:
> 
> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
 
> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
> 
> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
 
> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> 
> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>> Hi, Sid,
>> 
>> Really appreciate for your example and detailed explanation. Very 
>> helpful.
>> I think that this example is an excellent example to show (almost) 
>> all the issues we need to consider.
>> 
>> I slightly modified this example to make it to be compilable and 
>> run-able, as following:
>> (but I still cannot make the incorrect reordering or DSE happening, 
>> anyway, the potential reordering possibility is there…)
>> 
>> 1 #include 
>> 2 struct A
>> 3 {
>> 4  size_t size;
>> 5  char buf[] __attribute__((counted_by(size)));
>> 6 };
>> 7
>> 8 static size_t
>> 9 get_size_from (void *ptr)
>> 10 {
>> 11  return __builtin_dynamic_object_size (ptr, 1);
>> 12 }
>> 13
>> 14 void
>> 15 foo (size_t sz)
>> 16 {
>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>> sizeof(char));
>> 18  obj->size = sz;
>> 19  obj->buf[0] = 2;
>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>> 21  return;
>> 22 }
>> 23
>> 24 int main ()
>> 25 {
>> 26  foo (20);
>> 27  return 0;
>> 28 }
>> 
>> 
>> 
>> 
 When it’s set I suppose.  Turn
 
 X.l = n;
 
 Into
 
 X.l = __builtin_with_size (x.buf, n);
>>> 
>>> It would turn
>>> 
>>> some_variable = (&) x.buf
>>> 
>>> into
>>> 
>>> some_variable = __builtin_with_size ( (&) x.buf. x.len)
>>> 
>>> 
>>> So the later access to x.buf and not the initialization
>>> of a member of the struct (which is too early).
>>> 
>> 
>> Hmm, so with Qing's example above, are you suggesting the transformation
>> be to foo like so:
>> 
>> 14 void
>> 15 foo (size_t sz)
>> 16 {
>> 16.5  void * _1;
>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>> sizeof(char));
>> 18  obj->size = sz;
>> 19  obj->buf[0] = 2;
>> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
>> 20  __builtin_printf (“%d\n", get_size_from (_1));
>> 21  return;
>> 22 }
>> 
>> If yes then this could indeed work.  I think I got thrown off by the
>> reference to __bdos.
> 
> Yes. I think it is important not to evaluate the size at the
> access to buf and not the allocation, because the point is to
> recover it from the size member even when the compiler can't
> see the original allocation.
 
 But if the access is through a pointer without the attribute visible
 even the Frontend cannot recover?
>>> 
>>> Yes, if the access is using a struct-with-FAM without the attribute
>>> the FE would not be insert the builtin.  BDOS could potentially
>>> still see the original allocation but if it doesn't, then there is
>>> no information.
>>> 
 We’d need to force type correctness and give up on indirecting
 through an int * when it can refer to two diffenent container types.
 The best we can do I think is mark allocation sites and hope for
 some basic code hygiene (not clobbering size or array pointer
 through pointers without the appropriately attributed type)
>>> 
>>> I am do not fully understand what you are referring to.
>> 
>> struct A { int n; int data[n]; };
>> struct B { long n; int data[n]; };
>> 
>> int *p = flag ? a->data : b->data;
>> 
>> access *p;
>> 
>> Since we need to allow interoperability of pointers (a->data is
>> convertible to a non-fat pointer of type int *) this leaves us with
>> ambiguity we need to conservatively handle to avoid false positives.
> 
> For BDOS, I would expect this to work exactly like:
> 
> char aa[n1];
> char bb[n2];
> char *p = flag ? aa : bb;
> 
> (or similar code with malloc). In fact it does:
> 
> https://godbolt.org/z/bK68YKqhe
> (cheating a bit and also the sub-object version of
> BDOS does not seem to work)
> 
>> 
>> We _might_ want to diagnose decay of a->data to int *, but IIRC
>> there's no way (or proposal) to allow declaring a corresponding
>> fat pointer, so it's not a good designed feature.
> 
> As a 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Martin Uecker
Am Donnerstag, dem 26.10.2023 um 17:48 +0200 schrieb Richard Biener:
> 
> > Am 26.10.2023 um 16:58 schrieb Qing Zhao :
> > 
> > 
> > 
> > > On Oct 26, 2023, at 4:56 AM, Richard Biener  
> > > wrote:
> > > 
> > > > On Thu, Oct 26, 2023 at 7:22 AM Jakub Jelinek  wrote:
> > > > 
> > > > On Wed, Oct 25, 2023 at 07:03:43PM +, Qing Zhao wrote:
> > > > > For the code generation impact:
> > > > > 
> > > > > turning the original  x.buf
> > > > > to a builtin function call
> > > > > __builtin_with_access_and_size(x,buf, x.L,-1)
> > > > > 
> > > > > might inhibit some optimizations from happening before the builtin is
> > > > > evaluated into object size info (phase  .objsz1).  I guess there 
> > > > > might be
> > > > > some performance impact.
> > > > > 
> > > > > However, if we mark this builtin as PURE, NOTRROW, etc, then the 
> > > > > negative
> > > > > performance impact will be reduced to minimum?
> > > > 
> > > > You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
> > > > be able to figure out the dynamic sizes in case of normal (non-early)
> > > > inlining - caller takes address of a counted_by array, passes it down
> > > > to callee which is only inlined late and uses __bdos, or callee takes 
> > > > address
> > > > and returns it and caller uses __bdos, etc. - so it would need to be 
> > > > objsz2.
> > > > 
> > > > And while the builtin (or if it is an internal detail rather than user
> > > > accessible builtin an internal function) could be even 
> > > > const/nothrow/leaf if
> > > > the arguments contain the loads from the structure 2 fields, I'm afraid 
> > > > it
> > > > will still have huge code generation impact, prevent tons of pre-IPA
> > > > optimizations.  And it will need some work to handle it properly during
> > > > inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't 
> > > > gimple
> > > > values, so it wouldn't be just the builtin/internal-fn call to be 
> > > > ignored,
> > > > but also the count load from memory.
> > > 
> > > I think we want to track the value, not the "memory" in the builtin call,
> > > so GIMPLE would be
> > > 
> > > _1 = x.L;
> > > .. = __builtin_with_access_and_size (, _1, -1);
> > 
> > Before adding the __builtin_with_access_and_size, the code is:
> > 
> > 
> > 
> > After inserting the built-in, it becomes:
> > 
> > _1 = x.L;
> > __builtin_with_access_and_size (, _1, -1).
> > 
> > 
> > So, the # of total instructions, the # of LOADs, and the # of calls will 
> > all be increased.
> > There will be impact to the inlining decision definitely.
> 
> Note we have to make sure, if x is a pointer and we want to instrument 
> >buf that we
> Can dereference x.  Possibly doing
> 
> _1 = x ? x->Len : -1;
> 
> I’m not sure the C standard makes accessing x->Len unconditionally not 
> undefined behavior when >buf is computed.  Definitely it’s a violation of 
> the abstract machine of Len is volatile qualified (but we can reject such 
> counted_by or instantiations as volatile qualified types).

I believe it is implicit UB to do >buf if there is
no object *x because the wording assumes the existence
of an object.  In that case accessing x->L should
be fine too.  

In practice the access may trap  for other reasons 
(mprotect etc.),  but I guess this is acceptable,
but should probably be documented...

We might need the x?  to not run into trouble with
those offsetof  implementations written using null
pointer.  Although in this case maybe one could
hope that the load will get optimized anyway ...

Martin

> 
> Richard 
> 
> > 
> > > 
> > > also please make sure to use an internal function for
> > > __builtin_with_access_and_size,
> > > I don't think we want to expose this to users - it's an implementation 
> > > detail.
> > 
> > Okay, will define it as an internal function (add it to internal-fn.def). 
> > -:)
> > 
> > Qing
> > > 
> > > Richard.
> > > 
> > > > 
> > > >   Jakub
> > > > 
> > 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Kees Cook
On Thu, Oct 26, 2023 at 10:15:10AM +0200, Martin Uecker wrote:
> but not this:
> 
> char *p = >buf;
> x->count = 1;
> p[10] = 1; // !

This seems fine to me -- it's how I'd expect it to work: "10" is beyond
"1".

> (because the pointer is passed around the
> store to the counter)
> 
> and also here the second store is then irrelevant
> for the access:
> 
> x->count = 10;
> char* p = >buf;
> ...
> x->count = 1; // somewhere else
> 
> p[9] = 1; // ok, because count matter when buf was accesssed.

This is less great, but I can understand why it happens. "p" loses the
association with "x". It'd be nice if "p" had to way to retain that it
was just an alias for x->buf, so future p access would check count.

But this appears to be an existing limitation in other areas where an
assignment will cause the loss of object association. (I've run into
this before.) It's just more surprising in the above example because in
the past the loss of association would cause __bdos() to revert back to
"SIZE_MAX" results ("I don't know the size") rather than an "outdated"
size, which may get us into unexpected places...

> IMHO this makes sense also from the user side and
> are the desirable semantics we discussed before.
> 
> But can you take a look at this?
> 
> 
> This should simulate it fairly well:
> https://godbolt.org/z/xq89aM7Gr
> 
> (the call to the noinline function would go away,
> but not necessarily its impact on optimization)

Yeah, this example should be a very rare situation: a leaf function is
changing the characteristics of the struct but returning a buffer within
it to the caller. The more likely glitch would be from:

int main()
{
struct foo *f = foo_alloc(7);
char *p = FAM_ACCESS(f, size, buf);

printf("%ld\n", __builtin_dynamic_object_size(p, 0));
test1(f); // or just "f->count = 10;" no function call needed
printf("%ld\n", __builtin_dynamic_object_size(p, 0));

return 0;
}

which reports:
7
7

instead of:
7
10

This kind of "get an alias" situation is pretty common in the kernel
as a way to have a convenient "handle" to the array. In the case of a
"fill the array without knowing the actual final size" code pattern,
things would immediately break:

struct foo *f;
char *p;
int i;

f = alloc(maximum_possible);
f->count = 0;
p = f->buf;

for (i; data_is_available() && i < maximum_possible; i++) {
f->count ++;
p[i] = next_data_item();
}

Now perhaps the problem here is that "count" cannot be used for a count
of "logically valid members in the array" but must always be a count of
"allocated member space in the array", which I guess is tolerable, but
isn't ideal -- I'd like to catch logic bugs in addition to allocation
bugs, but the latter is certainly much more important to catch.

-- 
Kees Cook


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Richard Biener



> Am 26.10.2023 um 16:58 schrieb Qing Zhao :
> 
> 
> 
>> On Oct 26, 2023, at 4:56 AM, Richard Biener  
>> wrote:
>> 
>>> On Thu, Oct 26, 2023 at 7:22 AM Jakub Jelinek  wrote:
>>> 
>>> On Wed, Oct 25, 2023 at 07:03:43PM +, Qing Zhao wrote:
 For the code generation impact:
 
 turning the original  x.buf
 to a builtin function call
 __builtin_with_access_and_size(x,buf, x.L,-1)
 
 might inhibit some optimizations from happening before the builtin is
 evaluated into object size info (phase  .objsz1).  I guess there might be
 some performance impact.
 
 However, if we mark this builtin as PURE, NOTRROW, etc, then the negative
 performance impact will be reduced to minimum?
>>> 
>>> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
>>> be able to figure out the dynamic sizes in case of normal (non-early)
>>> inlining - caller takes address of a counted_by array, passes it down
>>> to callee which is only inlined late and uses __bdos, or callee takes 
>>> address
>>> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.
>>> 
>>> And while the builtin (or if it is an internal detail rather than user
>>> accessible builtin an internal function) could be even const/nothrow/leaf if
>>> the arguments contain the loads from the structure 2 fields, I'm afraid it
>>> will still have huge code generation impact, prevent tons of pre-IPA
>>> optimizations.  And it will need some work to handle it properly during
>>> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
>>> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
>>> but also the count load from memory.
>> 
>> I think we want to track the value, not the "memory" in the builtin call,
>> so GIMPLE would be
>> 
>> _1 = x.L;
>> .. = __builtin_with_access_and_size (, _1, -1);
> 
> Before adding the __builtin_with_access_and_size, the code is:
> 
> 
> 
> After inserting the built-in, it becomes:
> 
> _1 = x.L;
> __builtin_with_access_and_size (, _1, -1).
> 
> 
> So, the # of total instructions, the # of LOADs, and the # of calls will all 
> be increased.
> There will be impact to the inlining decision definitely.

Note we have to make sure, if x is a pointer and we want to instrument >buf 
that we
Can dereference x.  Possibly doing

_1 = x ? x->Len : -1;

I’m not sure the C standard makes accessing x->Len unconditionally not 
undefined behavior when >buf is computed.  Definitely it’s a violation of 
the abstract machine of Len is volatile qualified (but we can reject such 
counted_by or instantiations as volatile qualified types).

Richard 

> 
>> 
>> also please make sure to use an internal function for
>> __builtin_with_access_and_size,
>> I don't think we want to expose this to users - it's an implementation 
>> detail.
> 
> Okay, will define it as an internal function (add it to internal-fn.def). -:)
> 
> Qing
>> 
>> Richard.
>> 
>>> 
>>>   Jakub
>>> 
> 


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 4:56 AM, Richard Biener  
> wrote:
> 
> On Thu, Oct 26, 2023 at 7:22 AM Jakub Jelinek  wrote:
>> 
>> On Wed, Oct 25, 2023 at 07:03:43PM +, Qing Zhao wrote:
>>> For the code generation impact:
>>> 
>>> turning the original  x.buf
>>> to a builtin function call
>>> __builtin_with_access_and_size(x,buf, x.L,-1)
>>> 
>>> might inhibit some optimizations from happening before the builtin is
>>> evaluated into object size info (phase  .objsz1).  I guess there might be
>>> some performance impact.
>>> 
>>> However, if we mark this builtin as PURE, NOTRROW, etc, then the negative
>>> performance impact will be reduced to minimum?
>> 
>> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
>> be able to figure out the dynamic sizes in case of normal (non-early)
>> inlining - caller takes address of a counted_by array, passes it down
>> to callee which is only inlined late and uses __bdos, or callee takes address
>> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.
>> 
>> And while the builtin (or if it is an internal detail rather than user
>> accessible builtin an internal function) could be even const/nothrow/leaf if
>> the arguments contain the loads from the structure 2 fields, I'm afraid it
>> will still have huge code generation impact, prevent tons of pre-IPA
>> optimizations.  And it will need some work to handle it properly during
>> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
>> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
>> but also the count load from memory.
> 
> I think we want to track the value, not the "memory" in the builtin call,
> so GIMPLE would be
> 
> _1 = x.L;
> .. = __builtin_with_access_and_size (, _1, -1);

Before adding the __builtin_with_access_and_size, the code is:



After inserting the built-in, it becomes:

_1 = x.L;
__builtin_with_access_and_size (, _1, -1).


So, the # of total instructions, the # of LOADs, and the # of calls will all be 
increased.
There will be impact to the inlining decision definitely.

> 
> also please make sure to use an internal function for
> __builtin_with_access_and_size,
> I don't think we want to expose this to users - it's an implementation detail.

Okay, will define it as an internal function (add it to internal-fn.def). -:)

Qing
> 
> Richard.
> 
>> 
>>Jakub
>> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Qing Zhao


> On Oct 26, 2023, at 1:21 AM, Jakub Jelinek  wrote:
> 
> On Wed, Oct 25, 2023 at 07:03:43PM +, Qing Zhao wrote:
>> For the code generation impact:
>> 
>> turning the original  x.buf 
>> to a builtin function call
>> __builtin_with_access_and_size(x,buf, x.L,-1)
>> 
>> might inhibit some optimizations from happening before the builtin is
>> evaluated into object size info (phase  .objsz1).  I guess there might be
>> some performance impact.
>> 
>> However, if we mark this builtin as PURE, NOTRROW, etc, then the negative
>> performance impact will be reduced to minimum?
> 
> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
> be able to figure out the dynamic sizes in case of normal (non-early)
> inlining - caller takes address of a counted_by array, passes it down
> to callee which is only inlined late and uses __bdos, or callee takes address
> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.

I guess that I didn’t say it very clear previously. Let me explain again:

My understanding is, there are “early_objsz” phase and then later “objsz1” 
phase for -O[1|2|3]. 
For -Og, there are “early_objsz” and then later “objsz2”. 

So, the “objsz1” I mentioned (for the case -O[1|2|3])  should be the same as 
the “objsz2” you mentioned above?  -:)
It’s the second objsz phase. 

In the second objsz phase, I believe that all the inlining (including early 
inlining and IPA inlining) are all applied?
> 
> And while the builtin (or if it is an internal detail rather than user
> accessible builtin an internal function)

Okay, will use an “internal function” instead of “ builtin function”. 

> could be even const/nothrow/leaf if
> the arguments contain the loads from the structure 2 fields, I'm afraid it
> will still have huge code generation impact, prevent tons of pre-IPA
> optimizations.  And it will need some work to handle it properly during
> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
> but also the count load from memory.

Are you worrying about the potential additional LOADs will change the inlining 
decision
 since the inlining heuristic depends on the # of loads from memory? 

In additional to the # of loads, the # of instructions and the # of calls of 
the function 
might be increased too, will these have impact on inlining decision? 

In addition to inlining decision, any other impact to other IPA optimizations? 

thanks.

Qing


> 
>   Jakub
> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Richard Biener



> Am 26.10.2023 um 12:14 schrieb Martin Uecker :
> 
> Am Donnerstag, dem 26.10.2023 um 11:20 +0200 schrieb Martin Uecker:
>>> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
>>> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
 
 Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
> 
>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>> 
>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
 On 2023-10-25 04:16, Martin Uecker wrote:
 Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> 
>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>> 
>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>>> Hi, Sid,
>>> 
>>> Really appreciate for your example and detailed explanation. Very 
>>> helpful.
>>> I think that this example is an excellent example to show (almost) 
>>> all the issues we need to consider.
>>> 
>>> I slightly modified this example to make it to be compilable and 
>>> run-able, as following:
>>> (but I still cannot make the incorrect reordering or DSE happening, 
>>> anyway, the potential reordering possibility is there…)
>>> 
>>> 1 #include 
>>> 2 struct A
>>> 3 {
>>> 4  size_t size;
>>> 5  char buf[] __attribute__((counted_by(size)));
>>> 6 };
>>> 7
>>> 8 static size_t
>>> 9 get_size_from (void *ptr)
>>> 10 {
>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>> 12 }
>>> 13
>>> 14 void
>>> 15 foo (size_t sz)
>>> 16 {
>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>> sizeof(char));
>>> 18  obj->size = sz;
>>> 19  obj->buf[0] = 2;
>>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>>> 21  return;
>>> 22 }
>>> 23
>>> 24 int main ()
>>> 25 {
>>> 26  foo (20);
>>> 27  return 0;
>>> 28 }
>>> 
>>> 
>>> 
>>> 
> When it’s set I suppose.  Turn
> 
> X.l = n;
> 
> Into
> 
> X.l = __builtin_with_size (x.buf, n);
 
 It would turn
 
 some_variable = (&) x.buf
 
 into
 
 some_variable = __builtin_with_size ( (&) x.buf. x.len)
 
 
 So the later access to x.buf and not the initialization
 of a member of the struct (which is too early).
 
>>> 
>>> Hmm, so with Qing's example above, are you suggesting the transformation
>>> be to foo like so:
>>> 
>>> 14 void
>>> 15 foo (size_t sz)
>>> 16 {
>>> 16.5  void * _1;
>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>> sizeof(char));
>>> 18  obj->size = sz;
>>> 19  obj->buf[0] = 2;
>>> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
>>> 20  __builtin_printf (“%d\n", get_size_from (_1));
>>> 21  return;
>>> 22 }
>>> 
>>> If yes then this could indeed work.  I think I got thrown off by the
>>> reference to __bdos.
>> 
>> Yes. I think it is important not to evaluate the size at the
>> access to buf and not the allocation, because the point is to
>> recover it from the size member even when the compiler can't
>> see the original allocation.
> 
> But if the access is through a pointer without the attribute visible
> even the Frontend cannot recover?
 
 Yes, if the access is using a struct-with-FAM without the attribute
 the FE would not be insert the builtin.  BDOS could potentially
 still see the original allocation but if it doesn't, then there is
 no information.
 
> We’d need to force type correctness and give up on indirecting
> through an int * when it can refer to two diffenent container types.
> The best we can do I think is mark allocation sites and hope for
> some basic code hygiene (not clobbering size or array pointer
> through pointers without the appropriately attributed type)
 
 I am do not fully understand what you are referring to.
>>> 
>>> struct A { int n; int data[n]; };
>>> struct B { long n; int data[n]; };
>>> 
>>> int *p = flag ? a->data : b->data;
>>> 
>>> access *p;
>>> 
>>> Since we need to allow interoperability of pointers (a->data is
>>> convertible to a non-fat pointer of type int *) this leaves us with
>>> ambiguity we need to conservatively handle to avoid false positives.
>> 
>> For BDOS, I would expect this to work exactly like:
>> 
>> char aa[n1];
>> char bb[n2];
>> char *p = flag ? aa : bb;
>> 
>> (or similar code with malloc). In fact it does:
>> 
>> https://godbolt.org/z/bK68YKqhe
>> (cheating a bit and also the sub-object version of
>> BDOS does not seem to 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Martin Uecker
Am Donnerstag, dem 26.10.2023 um 11:20 +0200 schrieb Martin Uecker:
> Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
> > On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
> > > 
> > > Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
> > > > 
> > > > > Am 25.10.2023 um 12:47 schrieb Martin Uecker :
> > > > > 
> > > > > Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh 
> > > > > Poyarekar:
> > > > > > > On 2023-10-25 04:16, Martin Uecker wrote:
> > > > > > > Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> > > > > > > > 
> > > > > > > > > Am 24.10.2023 um 22:38 schrieb Martin Uecker 
> > > > > > > > > :
> > > > > > > > > 
> > > > > > > > > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> > > > > > > > > > Hi, Sid,
> > > > > > > > > > 
> > > > > > > > > > Really appreciate for your example and detailed 
> > > > > > > > > > explanation. Very helpful.
> > > > > > > > > > I think that this example is an excellent example to show 
> > > > > > > > > > (almost) all the issues we need to consider.
> > > > > > > > > > 
> > > > > > > > > > I slightly modified this example to make it to be 
> > > > > > > > > > compilable and run-able, as following:
> > > > > > > > > > (but I still cannot make the incorrect reordering or DSE 
> > > > > > > > > > happening, anyway, the potential reordering possibility is 
> > > > > > > > > > there…)
> > > > > > > > > > 
> > > > > > > > > >  1 #include 
> > > > > > > > > >  2 struct A
> > > > > > > > > >  3 {
> > > > > > > > > >  4  size_t size;
> > > > > > > > > >  5  char buf[] __attribute__((counted_by(size)));
> > > > > > > > > >  6 };
> > > > > > > > > >  7
> > > > > > > > > >  8 static size_t
> > > > > > > > > >  9 get_size_from (void *ptr)
> > > > > > > > > > 10 {
> > > > > > > > > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > > > > > > > > 12 }
> > > > > > > > > > 13
> > > > > > > > > > 14 void
> > > > > > > > > > 15 foo (size_t sz)
> > > > > > > > > > 16 {
> > > > > > > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz 
> > > > > > > > > > * sizeof(char));
> > > > > > > > > > 18  obj->size = sz;
> > > > > > > > > > 19  obj->buf[0] = 2;
> > > > > > > > > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > > > > > > > > 21  return;
> > > > > > > > > > 22 }
> > > > > > > > > > 23
> > > > > > > > > > 24 int main ()
> > > > > > > > > > 25 {
> > > > > > > > > > 26  foo (20);
> > > > > > > > > > 27  return 0;
> > > > > > > > > > 28 }
> > > > > > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > > When it’s set I suppose.  Turn
> > > > > > > > 
> > > > > > > > X.l = n;
> > > > > > > > 
> > > > > > > > Into
> > > > > > > > 
> > > > > > > > X.l = __builtin_with_size (x.buf, n);
> > > > > > > 
> > > > > > > It would turn
> > > > > > > 
> > > > > > > some_variable = (&) x.buf
> > > > > > > 
> > > > > > > into
> > > > > > > 
> > > > > > > some_variable = __builtin_with_size ( (&) x.buf. x.len)
> > > > > > > 
> > > > > > > 
> > > > > > > So the later access to x.buf and not the initialization
> > > > > > > of a member of the struct (which is too early).
> > > > > > > 
> > > > > > 
> > > > > > Hmm, so with Qing's example above, are you suggesting the 
> > > > > > transformation
> > > > > > be to foo like so:
> > > > > > 
> > > > > > 14 void
> > > > > > 15 foo (size_t sz)
> > > > > > 16 {
> > > > > > 16.5  void * _1;
> > > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > > sizeof(char));
> > > > > > 18  obj->size = sz;
> > > > > > 19  obj->buf[0] = 2;
> > > > > > 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
> > > > > > 20  __builtin_printf (“%d\n", get_size_from (_1));
> > > > > > 21  return;
> > > > > > 22 }
> > > > > > 
> > > > > > If yes then this could indeed work.  I think I got thrown off by the
> > > > > > reference to __bdos.
> > > > > 
> > > > > Yes. I think it is important not to evaluate the size at the
> > > > > access to buf and not the allocation, because the point is to
> > > > > recover it from the size member even when the compiler can't
> > > > > see the original allocation.
> > > > 
> > > > But if the access is through a pointer without the attribute visible
> > > > even the Frontend cannot recover?
> > > 
> > > Yes, if the access is using a struct-with-FAM without the attribute
> > > the FE would not be insert the builtin.  BDOS could potentially
> > > still see the original allocation but if it doesn't, then there is
> > > no information.
> > > 
> > > > We’d need to force type correctness and give up on indirecting
> > > > through an int * when it can refer to two diffenent container types.
> > > > The best we can do I think is mark allocation sites and hope for
> > > > some basic code hygiene (not clobbering size or array pointer
> > > > through pointers without the appropriately attributed type)
> > > 
> > > I am do not fully understand what you are referring to.
> > 
> > struct A { int n; 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Martin Uecker
Am Donnerstag, dem 26.10.2023 um 10:45 +0200 schrieb Richard Biener:
> On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
> > 
> > Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
> > > 
> > > > Am 25.10.2023 um 12:47 schrieb Martin Uecker :
> > > > 
> > > > Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
> > > > > > On 2023-10-25 04:16, Martin Uecker wrote:
> > > > > > Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> > > > > > > 
> > > > > > > > Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> > > > > > > > 
> > > > > > > > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> > > > > > > > > Hi, Sid,
> > > > > > > > > 
> > > > > > > > > Really appreciate for your example and detailed explanation. 
> > > > > > > > > Very helpful.
> > > > > > > > > I think that this example is an excellent example to show 
> > > > > > > > > (almost) all the issues we need to consider.
> > > > > > > > > 
> > > > > > > > > I slightly modified this example to make it to be compilable 
> > > > > > > > > and run-able, as following:
> > > > > > > > > (but I still cannot make the incorrect reordering or DSE 
> > > > > > > > > happening, anyway, the potential reordering possibility is 
> > > > > > > > > there…)
> > > > > > > > > 
> > > > > > > > >  1 #include 
> > > > > > > > >  2 struct A
> > > > > > > > >  3 {
> > > > > > > > >  4  size_t size;
> > > > > > > > >  5  char buf[] __attribute__((counted_by(size)));
> > > > > > > > >  6 };
> > > > > > > > >  7
> > > > > > > > >  8 static size_t
> > > > > > > > >  9 get_size_from (void *ptr)
> > > > > > > > > 10 {
> > > > > > > > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > > > > > > > 12 }
> > > > > > > > > 13
> > > > > > > > > 14 void
> > > > > > > > > 15 foo (size_t sz)
> > > > > > > > > 16 {
> > > > > > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > > > > > sizeof(char));
> > > > > > > > > 18  obj->size = sz;
> > > > > > > > > 19  obj->buf[0] = 2;
> > > > > > > > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > > > > > > > 21  return;
> > > > > > > > > 22 }
> > > > > > > > > 23
> > > > > > > > > 24 int main ()
> > > > > > > > > 25 {
> > > > > > > > > 26  foo (20);
> > > > > > > > > 27  return 0;
> > > > > > > > > 28 }
> > > > > > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > > > When it’s set I suppose.  Turn
> > > > > > > 
> > > > > > > X.l = n;
> > > > > > > 
> > > > > > > Into
> > > > > > > 
> > > > > > > X.l = __builtin_with_size (x.buf, n);
> > > > > > 
> > > > > > It would turn
> > > > > > 
> > > > > > some_variable = (&) x.buf
> > > > > > 
> > > > > > into
> > > > > > 
> > > > > > some_variable = __builtin_with_size ( (&) x.buf. x.len)
> > > > > > 
> > > > > > 
> > > > > > So the later access to x.buf and not the initialization
> > > > > > of a member of the struct (which is too early).
> > > > > > 
> > > > > 
> > > > > Hmm, so with Qing's example above, are you suggesting the 
> > > > > transformation
> > > > > be to foo like so:
> > > > > 
> > > > > 14 void
> > > > > 15 foo (size_t sz)
> > > > > 16 {
> > > > > 16.5  void * _1;
> > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > sizeof(char));
> > > > > 18  obj->size = sz;
> > > > > 19  obj->buf[0] = 2;
> > > > > 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
> > > > > 20  __builtin_printf (“%d\n", get_size_from (_1));
> > > > > 21  return;
> > > > > 22 }
> > > > > 
> > > > > If yes then this could indeed work.  I think I got thrown off by the
> > > > > reference to __bdos.
> > > > 
> > > > Yes. I think it is important not to evaluate the size at the
> > > > access to buf and not the allocation, because the point is to
> > > > recover it from the size member even when the compiler can't
> > > > see the original allocation.
> > > 
> > > But if the access is through a pointer without the attribute visible
> > > even the Frontend cannot recover?
> > 
> > Yes, if the access is using a struct-with-FAM without the attribute
> > the FE would not be insert the builtin.  BDOS could potentially
> > still see the original allocation but if it doesn't, then there is
> > no information.
> > 
> > > We’d need to force type correctness and give up on indirecting
> > > through an int * when it can refer to two diffenent container types.
> > > The best we can do I think is mark allocation sites and hope for
> > > some basic code hygiene (not clobbering size or array pointer
> > > through pointers without the appropriately attributed type)
> > 
> > I am do not fully understand what you are referring to.
> 
> struct A { int n; int data[n]; };
> struct B { long n; int data[n]; };
> 
> int *p = flag ? a->data : b->data;
> 
> access *p;
> 
> Since we need to allow interoperability of pointers (a->data is
> convertible to a non-fat pointer of type int *) this leaves us with
> ambiguity we need to conservatively handle to avoid false positives.

For BDOS, I would expect 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Richard Biener
On Thu, Oct 26, 2023 at 7:22 AM Jakub Jelinek  wrote:
>
> On Wed, Oct 25, 2023 at 07:03:43PM +, Qing Zhao wrote:
> > For the code generation impact:
> >
> > turning the original  x.buf
> > to a builtin function call
> > __builtin_with_access_and_size(x,buf, x.L,-1)
> >
> > might inhibit some optimizations from happening before the builtin is
> > evaluated into object size info (phase  .objsz1).  I guess there might be
> > some performance impact.
> >
> > However, if we mark this builtin as PURE, NOTRROW, etc, then the negative
> > performance impact will be reduced to minimum?
>
> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
> be able to figure out the dynamic sizes in case of normal (non-early)
> inlining - caller takes address of a counted_by array, passes it down
> to callee which is only inlined late and uses __bdos, or callee takes address
> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.
>
> And while the builtin (or if it is an internal detail rather than user
> accessible builtin an internal function) could be even const/nothrow/leaf if
> the arguments contain the loads from the structure 2 fields, I'm afraid it
> will still have huge code generation impact, prevent tons of pre-IPA
> optimizations.  And it will need some work to handle it properly during
> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
> but also the count load from memory.

I think we want to track the value, not the "memory" in the builtin call,
so GIMPLE would be

 _1 = x.L;
 .. = __builtin_with_access_and_size (, _1, -1);

also please make sure to use an internal function for
__builtin_with_access_and_size,
I don't think we want to expose this to users - it's an implementation detail.

Richard.

>
> Jakub
>


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Richard Biener
On Wed, Oct 25, 2023 at 8:16 PM Martin Uecker  wrote:
>
> Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
> >
> > > Am 25.10.2023 um 12:47 schrieb Martin Uecker :
> > >
> > > Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
> > > > > On 2023-10-25 04:16, Martin Uecker wrote:
> > > > > Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> > > > > >
> > > > > > > Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> > > > > > >
> > > > > > > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> > > > > > > > Hi, Sid,
> > > > > > > >
> > > > > > > > Really appreciate for your example and detailed explanation. 
> > > > > > > > Very helpful.
> > > > > > > > I think that this example is an excellent example to show 
> > > > > > > > (almost) all the issues we need to consider.
> > > > > > > >
> > > > > > > > I slightly modified this example to make it to be compilable 
> > > > > > > > and run-able, as following:
> > > > > > > > (but I still cannot make the incorrect reordering or DSE 
> > > > > > > > happening, anyway, the potential reordering possibility is 
> > > > > > > > there…)
> > > > > > > >
> > > > > > > >  1 #include 
> > > > > > > >  2 struct A
> > > > > > > >  3 {
> > > > > > > >  4  size_t size;
> > > > > > > >  5  char buf[] __attribute__((counted_by(size)));
> > > > > > > >  6 };
> > > > > > > >  7
> > > > > > > >  8 static size_t
> > > > > > > >  9 get_size_from (void *ptr)
> > > > > > > > 10 {
> > > > > > > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > > > > > > 12 }
> > > > > > > > 13
> > > > > > > > 14 void
> > > > > > > > 15 foo (size_t sz)
> > > > > > > > 16 {
> > > > > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > > > > sizeof(char));
> > > > > > > > 18  obj->size = sz;
> > > > > > > > 19  obj->buf[0] = 2;
> > > > > > > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > > > > > > 21  return;
> > > > > > > > 22 }
> > > > > > > > 23
> > > > > > > > 24 int main ()
> > > > > > > > 25 {
> > > > > > > > 26  foo (20);
> > > > > > > > 27  return 0;
> > > > > > > > 28 }
> > > > > > > >
> > > >
> > > > 
> > > >
> > > > > > When it’s set I suppose.  Turn
> > > > > >
> > > > > > X.l = n;
> > > > > >
> > > > > > Into
> > > > > >
> > > > > > X.l = __builtin_with_size (x.buf, n);
> > > > >
> > > > > It would turn
> > > > >
> > > > > some_variable = (&) x.buf
> > > > >
> > > > > into
> > > > >
> > > > > some_variable = __builtin_with_size ( (&) x.buf. x.len)
> > > > >
> > > > >
> > > > > So the later access to x.buf and not the initialization
> > > > > of a member of the struct (which is too early).
> > > > >
> > > >
> > > > Hmm, so with Qing's example above, are you suggesting the transformation
> > > > be to foo like so:
> > > >
> > > > 14 void
> > > > 15 foo (size_t sz)
> > > > 16 {
> > > > 16.5  void * _1;
> > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > sizeof(char));
> > > > 18  obj->size = sz;
> > > > 19  obj->buf[0] = 2;
> > > > 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
> > > > 20  __builtin_printf (“%d\n", get_size_from (_1));
> > > > 21  return;
> > > > 22 }
> > > >
> > > > If yes then this could indeed work.  I think I got thrown off by the
> > > > reference to __bdos.
> > >
> > > Yes. I think it is important not to evaluate the size at the
> > > access to buf and not the allocation, because the point is to
> > > recover it from the size member even when the compiler can't
> > > see the original allocation.
> >
> > But if the access is through a pointer without the attribute visible
> > even the Frontend cannot recover?
>
> Yes, if the access is using a struct-with-FAM without the attribute
> the FE would not be insert the builtin.  BDOS could potentially
> still see the original allocation but if it doesn't, then there is
> no information.
>
> > We’d need to force type correctness and give up on indirecting
> > through an int * when it can refer to two diffenent container types.
> > The best we can do I think is mark allocation sites and hope for
> > some basic code hygiene (not clobbering size or array pointer
> > through pointers without the appropriately attributed type)
>
> I am do not fully understand what you are referring to.

struct A { int n; int data[n]; };
struct B { long n; int data[n]; };

int *p = flag ? a->data : b->data;

access *p;

Since we need to allow interoperability of pointers (a->data is
convertible to a non-fat pointer of type int *) this leaves us with
ambiguity we need to conservatively handle to avoid false positives.

We _might_ want to diagnose decay of a->data to int *, but IIRC
there's no way (or proposal) to allow declaring a corresponding
fat pointer, so it's not a good designed feature.

Having __builtin_with_size at allocation would possibly make
the BOS use-def walk discover both objects.  I think you can't
insert __builtin_with_size at the access to *p, but in practice
that would be very much 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-26 Thread Martin Uecker
Am Mittwoch, dem 25.10.2023 um 15:32 -0700 schrieb Kees Cook:
> On Wed, Oct 25, 2023 at 10:27:41PM +, Qing Zhao wrote:
> > 
> > 
> > > On Oct 25, 2023, at 6:06 PM, Kees Cook  wrote:
> > > 
> > > On Wed, Oct 25, 2023 at 01:27:29PM +, Qing Zhao wrote:
> > > > A.  Add an additional argument, the size parameter,  to __bdos, 
> > > > A.1, during FE;
> > > > A.2, during gimplification phase;
> > > 
> > > I just wanted to clarify that this is all just an "internal" detail,
> > > yes?
> > 
> > YES!
> 
> Okay, I thought so, but I just wanted to double-check. :)
> 
> > > For example, the Linux kernel can still use __bdos() without knowing
> > > the count member ahead of time (otherwise it kind of defeats the purpose).
> > Don’t quite understand this, could you clarify? 
> 
> I was just trying to explain why a chance would be a problem. But it
> doesn't matter, so nevermind. :)
> 
> > (Anyway, the bottom line is no change to the user interface, we just 
> > discuss the internal implementation inside GCC) -:)
> 
> Great! I'll go back to lurking. :)
> 
> Thanks!
> 

While it is about the internal implementation, it would
potentially affect the semantics of the attribute:

This would work:

x->count = 10;
char *p = >buf;

but not this:

char *p = >buf;
x->count = 1;
p[10] = 1; // !

(because the pointer is passed around the
store to the counter)

and also here the second store is then irrelevant
for the access:

x->count = 10;
char* p = >buf;
...
x->count = 1; // somewhere else

p[9] = 1; // ok, because count matter when buf was accesssed.


IMHO this makes sense also from the user side and
are the desirable semantics we discussed before.

But can you take a look at this?


This should simulate it fairly well:
https://godbolt.org/z/xq89aM7Gr

(the call to the noinline function would go away,
but not necessarily its impact on optimization)

Martin






Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Jakub Jelinek
On Wed, Oct 25, 2023 at 07:03:43PM +, Qing Zhao wrote:
> For the code generation impact:
> 
> turning the original  x.buf 
> to a builtin function call
> __builtin_with_access_and_size(x,buf, x.L,-1)
> 
> might inhibit some optimizations from happening before the builtin is
> evaluated into object size info (phase  .objsz1).  I guess there might be
> some performance impact.
> 
> However, if we mark this builtin as PURE, NOTRROW, etc, then the negative
> performance impact will be reduced to minimum?

You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
be able to figure out the dynamic sizes in case of normal (non-early)
inlining - caller takes address of a counted_by array, passes it down
to callee which is only inlined late and uses __bdos, or callee takes address
and returns it and caller uses __bdos, etc. - so it would need to be objsz2.

And while the builtin (or if it is an internal detail rather than user
accessible builtin an internal function) could be even const/nothrow/leaf if
the arguments contain the loads from the structure 2 fields, I'm afraid it
will still have huge code generation impact, prevent tons of pre-IPA
optimizations.  And it will need some work to handle it properly during
inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
values, so it wouldn't be just the builtin/internal-fn call to be ignored,
but also the count load from memory.

Jakub



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Kees Cook
On Wed, Oct 25, 2023 at 10:27:41PM +, Qing Zhao wrote:
> 
> 
> > On Oct 25, 2023, at 6:06 PM, Kees Cook  wrote:
> > 
> > On Wed, Oct 25, 2023 at 01:27:29PM +, Qing Zhao wrote:
> >> A.  Add an additional argument, the size parameter,  to __bdos, 
> >> A.1, during FE;
> >> A.2, during gimplification phase;
> > 
> > I just wanted to clarify that this is all just an "internal" detail,
> > yes?
> 
> YES!

Okay, I thought so, but I just wanted to double-check. :)

> > For example, the Linux kernel can still use __bdos() without knowing
> > the count member ahead of time (otherwise it kind of defeats the purpose).
> Don’t quite understand this, could you clarify? 

I was just trying to explain why a chance would be a problem. But it
doesn't matter, so nevermind. :)

> (Anyway, the bottom line is no change to the user interface, we just discuss 
> the internal implementation inside GCC) -:)

Great! I'll go back to lurking. :)

Thanks!

-- 
Kees Cook


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 6:06 PM, Kees Cook  wrote:
> 
> On Wed, Oct 25, 2023 at 01:27:29PM +, Qing Zhao wrote:
>> A.  Add an additional argument, the size parameter,  to __bdos, 
>> A.1, during FE;
>> A.2, during gimplification phase;
> 
> I just wanted to clarify that this is all just an "internal" detail,
> yes?

YES!

> i.e. the __bdos() used by in C code is unchanged?

there should be no change to the user interface. 

> 
> For example, the Linux kernel can still use __bdos() without knowing
> the count member ahead of time (otherwise it kind of defeats the purpose).
Don’t quite understand this, could you clarify? 

(Anyway, the bottom line is no change to the user interface, we just discuss 
the internal implementation inside GCC) -:)

Qing
> 
> -- 
> Kees Cook



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Kees Cook
On Wed, Oct 25, 2023 at 01:27:29PM +, Qing Zhao wrote:
> A.  Add an additional argument, the size parameter,  to __bdos, 
>  A.1, during FE;
>  A.2, during gimplification phase;

I just wanted to clarify that this is all just an "internal" detail,
yes? i.e. the __bdos() used by in C code is unchanged?

For example, the Linux kernel can still use __bdos() without knowing
the count member ahead of time (otherwise it kind of defeats the purpose).

-- 
Kees Cook


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Kees Cook
On Tue, Oct 24, 2023 at 07:51:55PM -0400, Siddhesh Poyarekar wrote:
> Yes, that's the tradeoff.  However, maybe this is the point where Kees jumps
> in and say the kernel doesn't really care as much or something like that :)

"I only care about -O2" :)

-- 
Kees Cook


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 11:38 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 25.10.2023 um 16:50 schrieb Siddhesh Poyarekar :
>> 
>> On 2023-10-25 09:27, Qing Zhao wrote:
> On Oct 24, 2023, at 7:56 PM, Siddhesh Poyarekar  
> wrote:
 
 On 2023-10-24 18:51, Qing Zhao wrote:
> Thanks for the proposal!
> So what you suggested is:
> For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the 
> FE, then the call to the _bdos (x.buf, 1) will
> Become:
>   _bdos(__builtin_with_size(x.buf, x.L), 1)?
> Then the implicit use of x.L in _bdos(x.buf.1) will become explicit?
 
 Oops, I think Martin and I fell off-list in a subthread.  I clarified that 
 my comment was that any such annotation at object reference is probably 
 too late and hence not the right place for it; basically it has the same 
 problems as the option A in your comment.  A better place to reinforce 
 such a relationship would be the allocation+initialization site instead.
>>> I think Martin’s proposal might work, it’s different than the option A:
>>> A.  Add an additional argument, the size parameter,  to __bdos,
>>> A.1, during FE;
>>> A.2, during gimplification phase;
>>> Option A targets on the __bdos call, try to encode the implicit use to the 
>>> call, this will not work when the real object has not been instantiation at 
>>> the call site.
>>> However, Martin’s proposal targets on the FMA array itself, it will enhance 
>>> the FAM access naturally with the size information. And such FAM access 
>>> with size info will propagated to the __bdos site later through inlining, 
>>> etc. and then tree-object-size can use the size information at that point. 
>>> At the same time, the implicit use of the size is recorded correctly.
>>> So, I think that this proposal is natural and reasonable.
>> 
>> Ack, we discussed this later in the thread and I agree[1].  Richard still 
>> has concerns[2] that I think may be addressed by putting __builtin_with_size 
>> at the point where the reference to x.buf escapes, but I'm not very sure 
>> about that.
>> 
>> Oh, and Martin suggested using __builtin_with_size more generally[3] in 
>> bugzilla to address attribute inlining issues and we have high level 
>> consensus for a __builtin_with_access instead, which associates access type 
>> in addition to size with the target object.  For the purposes of counted_by, 
>> access type could simply be -1.
> 
> Btw, I’d like to see some hard numbers on the amount of extra false positives 
> this will cause a well as the effect on generated code before putting this in 
> mainline and effectively needing to support it forever. 

What do you mean by the “extra false positives”? 

For the code generation impact:

turning the original  x.buf 
to a builtin function call
__builtin_with_access_and_size(x,buf, x.L,-1)

might inhibit some optimizations from happening before the builtin is evaluated 
into object size info (phase  .objsz1).  I guess there might be some 
performance impact. 

However, if we mark this builtin as PURE, NOTRROW, etc, then the negative 
performance impact will be reduced to minimum? 

Qing

> 
> Richard 
> 
>> Thanks,
>> Sid
>> 
>> 
>> [1] 
>> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#m4f3cafa489493180e258fd62aca0196a5f244039
>> 
>> [2] 
>> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#mcf226f891621db8b640deaedd8942bb8519010f3
>> 
>> [3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503#c6



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 10:50 AM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-25 09:27, Qing Zhao wrote:
>>> On Oct 24, 2023, at 7:56 PM, Siddhesh Poyarekar  wrote:
>>> 
>>> On 2023-10-24 18:51, Qing Zhao wrote:
 Thanks for the proposal!
 So what you suggested is:
 For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the 
 FE, then the call to the _bdos (x.buf, 1) will
 Become:
_bdos(__builtin_with_size(x.buf, x.L), 1)?
 Then the implicit use of x.L in _bdos(x.buf.1) will become explicit?
>>> 
>>> Oops, I think Martin and I fell off-list in a subthread.  I clarified that 
>>> my comment was that any such annotation at object reference is probably too 
>>> late and hence not the right place for it; basically it has the same 
>>> problems as the option A in your comment.  A better place to reinforce such 
>>> a relationship would be the allocation+initialization site instead.
>> I think Martin’s proposal might work, it’s different than the option A:
>> A.  Add an additional argument, the size parameter,  to __bdos,
>>  A.1, during FE;
>>  A.2, during gimplification phase;
>> Option A targets on the __bdos call, try to encode the implicit use to the 
>> call, this will not work when the real object has not been instantiation at 
>> the call site.
>> However, Martin’s proposal targets on the FMA array itself, it will enhance 
>> the FAM access naturally with the size information. And such FAM access with 
>> size info will propagated to the __bdos site later through inlining, etc. 
>> and then tree-object-size can use the size information at that point. At the 
>> same time, the implicit use of the size is recorded correctly.
>> So, I think that this proposal is natural and reasonable.
> 
> Ack, we discussed this later in the thread and I agree[1].  Richard still has 
> concerns[2] that I think may be addressed by putting __builtin_with_size at 
> the point where the reference to x.buf escapes, but I'm not very sure about 
> that.
> 
> Oh, and Martin suggested using __builtin_with_size more generally[3] in 
> bugzilla to address attribute inlining issues and we have high level 
> consensus for a __builtin_with_access instead, which associates access type 
> in addition to size with the target object.  For the purposes of counted_by, 
> access type could simply be -1.

Yes, I read all the discussions in the comments of PR96503 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503), and I do agree that this 
is a good idea. 

I prefer the name for the new builtin as:  
__builtin_with_access_and_size
Instead of 
__builtin_with_access

All the attributes, “alloca_size”, “access”, and the new “counted_by” for FMA, 
could be converted to this builtin consistently, and even the later new 
extension, for example, “counted_by” attribute for general pointers, could use 
the same builtin. 

SOMETYPE *ptr = __builtin_with_access_and_size (SOMETYPE *ptr, size_t size, int 
access)

In the above, 

1. SOMETYPE will be the type of the pointee of “ptr”, it could be a real type 
or void.

2. “size”

If SOMETYPE is a real type, the “size” will be the number of elements of the 
type;
If SOMETYPE is void, the “size” will be the number of bytes.   

3. “access”

-1: Unknown access semantics
0: none
1: read_only
2: write_only
3: read_write

For the “counted_by” and “alloca_size” attribute, the “access” will be -1. 

Qing
> 
> Thanks,
> Sid
> 
> 
> [1] 
> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#m4f3cafa489493180e258fd62aca0196a5f244039
> 
> [2] 
> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#mcf226f891621db8b640deaedd8942bb8519010f3
> 
> [3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503#c6



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 7:13 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
>> 
>> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
 On 2023-10-25 04:16, Martin Uecker wrote:
 Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> 
>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>> 
>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>>> Hi, Sid,
>>> 
>>> Really appreciate for your example and detailed explanation. Very 
>>> helpful.
>>> I think that this example is an excellent example to show (almost) all 
>>> the issues we need to consider.
>>> 
>>> I slightly modified this example to make it to be compilable and 
>>> run-able, as following:
>>> (but I still cannot make the incorrect reordering or DSE happening, 
>>> anyway, the potential reordering possibility is there…)
>>> 
>>> 1 #include 
>>> 2 struct A
>>> 3 {
>>> 4  size_t size;
>>> 5  char buf[] __attribute__((counted_by(size)));
>>> 6 };
>>> 7
>>> 8 static size_t
>>> 9 get_size_from (void *ptr)
>>> 10 {
>>> 11  return __builtin_dynamic_object_size (ptr, 1);
>>> 12 }
>>> 13
>>> 14 void
>>> 15 foo (size_t sz)
>>> 16 {
>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>>> sizeof(char));
>>> 18  obj->size = sz;
>>> 19  obj->buf[0] = 2;
>>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>>> 21  return;
>>> 22 }
>>> 23
>>> 24 int main ()
>>> 25 {
>>> 26  foo (20);
>>> 27  return 0;
>>> 28 }
>>> 
>>> 
>>> 
>>> 
> When it’s set I suppose.  Turn
> 
> X.l = n;
> 
> Into
> 
> X.l = __builtin_with_size (x.buf, n);
 
 It would turn
 
 some_variable = (&) x.buf
 
 into
 
 some_variable = __builtin_with_size ( (&) x.buf. x.len)
 
 
 So the later access to x.buf and not the initialization
 of a member of the struct (which is too early).
 
>>> 
>>> Hmm, so with Qing's example above, are you suggesting the transformation 
>>> be to foo like so:
>>> 
>>> 14 void
>>> 15 foo (size_t sz)
>>> 16 {
>>> 16.5  void * _1;
>>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>>> 18  obj->size = sz;
>>> 19  obj->buf[0] = 2;
>>> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
>>> 20  __builtin_printf (“%d\n", get_size_from (_1));
>>> 21  return;
>>> 22 }
>>> 
>>> If yes then this could indeed work.  I think I got thrown off by the 
>>> reference to __bdos.
>> 
>> Yes. I think it is important not to evaluate the size at the
>> access to buf and not the allocation, because the point is to 
>> recover it from the size member even when the compiler can't 
>> see the original allocation.
> 
> But if the access is through a pointer without the attribute visible even the 
> Frontend cannot recover?  We’d need to force type correctness and give up on 
> indirecting through an int * when it can refer to two diffenent container 
> types.

Might need issue warnings when this happens?

>  The best we can do I think is mark allocation sites and hope for some basic 
> code hygiene (not clobbering size or array pointer through pointers without 
> the appropriately attributed type)
I guess that we need to clarify the requirement in the documentation, and also 
issue warnings when the source code has such issues.

Qing
> 
>> Evaluating at this point requires that the size is correctly set
>> before the access to the FAM and the user has to make sure 
>> this is the case. But to me this requirement would make sense.
>> 
>> Semantically, it could aöso make sense to evaluate the size at a
>> later time.  But then the reordering becomes problematic again.
>> 
>> Also I think this would make this feature generally more useful.
>> For example, it could work also for others pointers in the struct
>> and not just for FAMs.  In this case, the struct may already be
>> freed when  BDOS is called, so it might also not possible to
>> access the size member at a later time.
>> 
>> Martin
>> 
>> 
>>> 
>> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Martin Uecker
Am Mittwoch, dem 25.10.2023 um 13:13 +0200 schrieb Richard Biener:
> 
> > Am 25.10.2023 um 12:47 schrieb Martin Uecker :
> > 
> > Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
> > > > On 2023-10-25 04:16, Martin Uecker wrote:
> > > > Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> > > > > 
> > > > > > Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> > > > > > 
> > > > > > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> > > > > > > Hi, Sid,
> > > > > > > 
> > > > > > > Really appreciate for your example and detailed explanation. Very 
> > > > > > > helpful.
> > > > > > > I think that this example is an excellent example to show 
> > > > > > > (almost) all the issues we need to consider.
> > > > > > > 
> > > > > > > I slightly modified this example to make it to be compilable and 
> > > > > > > run-able, as following:
> > > > > > > (but I still cannot make the incorrect reordering or DSE 
> > > > > > > happening, anyway, the potential reordering possibility is there…)
> > > > > > > 
> > > > > > >  1 #include 
> > > > > > >  2 struct A
> > > > > > >  3 {
> > > > > > >  4  size_t size;
> > > > > > >  5  char buf[] __attribute__((counted_by(size)));
> > > > > > >  6 };
> > > > > > >  7
> > > > > > >  8 static size_t
> > > > > > >  9 get_size_from (void *ptr)
> > > > > > > 10 {
> > > > > > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > > > > > 12 }
> > > > > > > 13
> > > > > > > 14 void
> > > > > > > 15 foo (size_t sz)
> > > > > > > 16 {
> > > > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > > > sizeof(char));
> > > > > > > 18  obj->size = sz;
> > > > > > > 19  obj->buf[0] = 2;
> > > > > > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > > > > > 21  return;
> > > > > > > 22 }
> > > > > > > 23
> > > > > > > 24 int main ()
> > > > > > > 25 {
> > > > > > > 26  foo (20);
> > > > > > > 27  return 0;
> > > > > > > 28 }
> > > > > > > 
> > > 
> > > 
> > > 
> > > > > When it’s set I suppose.  Turn
> > > > > 
> > > > > X.l = n;
> > > > > 
> > > > > Into
> > > > > 
> > > > > X.l = __builtin_with_size (x.buf, n);
> > > > 
> > > > It would turn
> > > > 
> > > > some_variable = (&) x.buf
> > > > 
> > > > into
> > > > 
> > > > some_variable = __builtin_with_size ( (&) x.buf. x.len)
> > > > 
> > > > 
> > > > So the later access to x.buf and not the initialization
> > > > of a member of the struct (which is too early).
> > > > 
> > > 
> > > Hmm, so with Qing's example above, are you suggesting the transformation 
> > > be to foo like so:
> > > 
> > > 14 void
> > > 15 foo (size_t sz)
> > > 16 {
> > > 16.5  void * _1;
> > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > sizeof(char));
> > > 18  obj->size = sz;
> > > 19  obj->buf[0] = 2;
> > > 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
> > > 20  __builtin_printf (“%d\n", get_size_from (_1));
> > > 21  return;
> > > 22 }
> > > 
> > > If yes then this could indeed work.  I think I got thrown off by the 
> > > reference to __bdos.
> > 
> > Yes. I think it is important not to evaluate the size at the
> > access to buf and not the allocation, because the point is to 
> > recover it from the size member even when the compiler can't 
> > see the original allocation.
> 
> But if the access is through a pointer without the attribute visible
> even the Frontend cannot recover?  

Yes, if the access is using a struct-with-FAM without the attribute
the FE would not be insert the builtin.  BDOS could potentially
still see the original allocation but if it doesn't, then there is
no information.

> We’d need to force type correctness and give up on indirecting
> through an int * when it can refer to two diffenent container types. 
> The best we can do I think is mark allocation sites and hope for
> some basic code hygiene (not clobbering size or array pointer
> through pointers without the appropriately attributed type)

I am do not fully understand what you are referring to. But yes,
for full bounds safety we would need the language feature.
In C people should start to variably-modified types
more.  I think we can build perfect bounds safety on top of
them in a very good way with only FE changes.

All these attributes are just a best effort.  But for a while,
this will be necessary.

Martin

> 
> > Evaluating at this point requires that the size is correctly set
> > before the access to the FAM and the user has to make sure 
> > this is the case. But to me this requirement would make sense.
> > 
> > Semantically, it could aöso make sense to evaluate the size at a
> > later time.  But then the reordering becomes problematic again.
> > 
> > Also I think this would make this feature generally more useful.
> > For example, it could work also for others pointers in the struct
> > and not just for FAMs.  In this case, the struct may already be
> > freed when  BDOS is called, so it might also not possible to
> > access the size member at a later 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 25, 2023, at 6:39 AM, Martin Uecker  wrote:
> 
> Am Mittwoch, dem 25.10.2023 um 12:25 +0200 schrieb Richard Biener:
>> 
>>> Am 25.10.2023 um 10:16 schrieb Martin Uecker :
>>> 
>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
 
>> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> 
> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>> Hi, Sid,
>> 
>> Really appreciate for your example and detailed explanation. Very 
>> helpful.
>> I think that this example is an excellent example to show (almost) all 
>> the issues we need to consider.
>> 
>> I slightly modified this example to make it to be compilable and 
>> run-able, as following: 
>> (but I still cannot make the incorrect reordering or DSE happening, 
>> anyway, the potential reordering possibility is there…)
>> 
>> 1 #include 
>> 2 struct A
>> 3 {
>> 4  size_t size;
>> 5  char buf[] __attribute__((counted_by(size)));
>> 6 };
>> 7 
>> 8 static size_t
>> 9 get_size_from (void *ptr)
>> 10 {
>> 11  return __builtin_dynamic_object_size (ptr, 1);
>> 12 }
>> 13 
>> 14 void
>> 15 foo (size_t sz)
>> 16 {
>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>> sizeof(char));
>> 18  obj->size = sz;
>> 19  obj->buf[0] = 2;
>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>> 21  return;
>> 22 }
>> 23 
>> 24 int main ()
>> 25 {
>> 26  foo (20);
>> 27  return 0;
>> 28 }
>> 
>> With my GCC, it was compiled and worked:
>> [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>> 20
>> Situation 1: With O1 and above, the routine “get_size_from” was inlined 
>> into “foo”, therefore, the call to __bdos is in the same routine as the 
>> instantiation of the object, and the TYPE information and the attached 
>> counted_by attribute information in the TYPE of the object can be USED 
>> by the __bdos call to compute the final object size. 
>> 
>> [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>> -1
>> Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
>> “foo”, therefore, the call to __bdos is Not in the same routine as the 
>> instantiation of the object, As a result, the TYPE info and the attached 
>> counted_by info of the object can NOT be USED by the __bdos call. 
>> 
>> Keep in mind of the above 2 situations, we will refer them in below:
>> 
>> 1. First,  the problem we are trying to resolve is:
>> 
>> (Your description):
>> 
>>> the reordering of __bdos w.r.t. initialization of the size parameter 
>>> but to also account for DSE of the assignment, we can abstract this 
>>> problem to that of DFA being unable to see implicit use of the size 
>>> parameter in the __bdos call.
>> 
>> basically is correct.  However, with the following exception:
>> 
>> The implicit use of the size parameter in the __bdos call is not always 
>> there, it ONLY exists WHEN the __bdos is able to evaluated to an 
>> expression of the size parameter in the “objsz” phase, i.e., the 
>> “Situation 1” of the above example. 
>> In the “Situation 2”, when the __bdos does not see the TYPE of the real 
>> object,  it does not see the counted_by information from the TYPE, 
>> therefore,  it is not able to evaluate the size of the object through 
>> the counted_by information.  As a result, the implicit use of the size 
>> parameter in the __bdos call does NOT exist at all.  The optimizer can 
>> freely reorder the initialization of the size parameter with the __bdos 
>> call since there is no data flow dependency between these two. 
>> 
>> With this exception in mind, we can see that your proposed “option 2” 
>> (making the type of size “volatile”) is too conservative, it will  
>> disable many optimizations  unnecessarily, even though it’s safe and 
>> simple to implement. 
>> 
>> As a compiler optimization person for many many years, I really don’t 
>> want to take this approach at this moment.  -:)
>> 
>> 2. Some facts I’d like to mention:
>> 
>> A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
>> optimization stage. During RTL stage,  the __bdos call has already been 
>> replaced by an expression of the size parameter or a constant, the data 
>> dependency is explicitly in the IR already.  I believe that the data 
>> analysis in RTL stage should pick up the data dependency correctly, No 
>> special handling is needed in RTL.
>> 
>> B. If the __bdos call cannot see the real object , it has no way to get 
>> the “counted_by” field from the TYPE of the 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Richard Biener



> Am 25.10.2023 um 16:50 schrieb Siddhesh Poyarekar :
> 
> On 2023-10-25 09:27, Qing Zhao wrote:
 On Oct 24, 2023, at 7:56 PM, Siddhesh Poyarekar  
 wrote:
>>> 
>>> On 2023-10-24 18:51, Qing Zhao wrote:
 Thanks for the proposal!
 So what you suggested is:
 For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the 
 FE, then the call to the _bdos (x.buf, 1) will
 Become:
_bdos(__builtin_with_size(x.buf, x.L), 1)?
 Then the implicit use of x.L in _bdos(x.buf.1) will become explicit?
>>> 
>>> Oops, I think Martin and I fell off-list in a subthread.  I clarified that 
>>> my comment was that any such annotation at object reference is probably too 
>>> late and hence not the right place for it; basically it has the same 
>>> problems as the option A in your comment.  A better place to reinforce such 
>>> a relationship would be the allocation+initialization site instead.
>> I think Martin’s proposal might work, it’s different than the option A:
>> A.  Add an additional argument, the size parameter,  to __bdos,
>>  A.1, during FE;
>>  A.2, during gimplification phase;
>> Option A targets on the __bdos call, try to encode the implicit use to the 
>> call, this will not work when the real object has not been instantiation at 
>> the call site.
>> However, Martin’s proposal targets on the FMA array itself, it will enhance 
>> the FAM access naturally with the size information. And such FAM access with 
>> size info will propagated to the __bdos site later through inlining, etc. 
>> and then tree-object-size can use the size information at that point. At the 
>> same time, the implicit use of the size is recorded correctly.
>> So, I think that this proposal is natural and reasonable.
> 
> Ack, we discussed this later in the thread and I agree[1].  Richard still has 
> concerns[2] that I think may be addressed by putting __builtin_with_size at 
> the point where the reference to x.buf escapes, but I'm not very sure about 
> that.
> 
> Oh, and Martin suggested using __builtin_with_size more generally[3] in 
> bugzilla to address attribute inlining issues and we have high level 
> consensus for a __builtin_with_access instead, which associates access type 
> in addition to size with the target object.  For the purposes of counted_by, 
> access type could simply be -1.

Btw, I’d like to see some hard numbers on the amount of extra false positives 
this will cause a well as the effect on generated code before putting this in 
mainline and effectively needing to support it forever.

Richard 

> Thanks,
> Sid
> 
> 
> [1] 
> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#m4f3cafa489493180e258fd62aca0196a5f244039
> 
> [2] 
> https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#mcf226f891621db8b640deaedd8942bb8519010f3
> 
> [3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503#c6


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Siddhesh Poyarekar

On 2023-10-25 09:27, Qing Zhao wrote:




On Oct 24, 2023, at 7:56 PM, Siddhesh Poyarekar  wrote:

On 2023-10-24 18:51, Qing Zhao wrote:

Thanks for the proposal!
So what you suggested is:
For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the FE, 
then the call to the _bdos (x.buf, 1) will
Become:
_bdos(__builtin_with_size(x.buf, x.L), 1)?
Then the implicit use of x.L in _bdos(x.buf.1) will become explicit?


Oops, I think Martin and I fell off-list in a subthread.  I clarified that my 
comment was that any such annotation at object reference is probably too late 
and hence not the right place for it; basically it has the same problems as the 
option A in your comment.  A better place to reinforce such a relationship 
would be the allocation+initialization site instead.


I think Martin’s proposal might work, it’s different than the option A:

A.  Add an additional argument, the size parameter,  to __bdos,
  A.1, during FE;
  A.2, during gimplification phase;

Option A targets on the __bdos call, try to encode the implicit use to the 
call, this will not work when the real object has not been instantiation at the 
call site.

However, Martin’s proposal targets on the FMA array itself, it will enhance the 
FAM access naturally with the size information. And such FAM access with size 
info will propagated to the __bdos site later through inlining, etc. and then 
tree-object-size can use the size information at that point. At the same time, 
the implicit use of the size is recorded correctly.

So, I think that this proposal is natural and reasonable.


Ack, we discussed this later in the thread and I agree[1].  Richard 
still has concerns[2] that I think may be addressed by putting 
__builtin_with_size at the point where the reference to x.buf escapes, 
but I'm not very sure about that.


Oh, and Martin suggested using __builtin_with_size more generally[3] in 
bugzilla to address attribute inlining issues and we have high level 
consensus for a __builtin_with_access instead, which associates access 
type in addition to size with the target object.  For the purposes of 
counted_by, access type could simply be -1.


Thanks,
Sid


[1] 
https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#m4f3cafa489493180e258fd62aca0196a5f244039


[2] 
https://inbox.sourceware.org/gcc-patches/73af949c-3caa-4b11-93ce-3064b95a9...@gotplt.org/T/#mcf226f891621db8b640deaedd8942bb8519010f3


[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503#c6


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Qing Zhao


> On Oct 24, 2023, at 7:56 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-24 18:51, Qing Zhao wrote:
>> Thanks for the proposal!
>> So what you suggested is:
>> For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the FE, 
>> then the call to the _bdos (x.buf, 1) will
>> Become:
>>_bdos(__builtin_with_size(x.buf, x.L), 1)?
>> Then the implicit use of x.L in _bdos(x.buf.1) will become explicit?
> 
> Oops, I think Martin and I fell off-list in a subthread.  I clarified that my 
> comment was that any such annotation at object reference is probably too late 
> and hence not the right place for it; basically it has the same problems as 
> the option A in your comment.  A better place to reinforce such a 
> relationship would be the allocation+initialization site instead.

I think Martin’s proposal might work, it’s different than the option A:

A.  Add an additional argument, the size parameter,  to __bdos, 
 A.1, during FE;
 A.2, during gimplification phase;

Option A targets on the __bdos call, try to encode the implicit use to the 
call, this will not work when the real object has not been instantiation at the 
call site.

However, Martin’s proposal targets on the FMA array itself, it will enhance the 
FAM access naturally with the size information. And such FAM access with size 
info will propagated to the __bdos site later through inlining, etc. and then 
tree-object-size can use the size information at that point. At the same time, 
the implicit use of the size is recorded correctly. 

So, I think that this proposal is natural and reasonable.

Qing
> 
> Thanks,
> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Richard Biener



> Am 25.10.2023 um 12:47 schrieb Martin Uecker :
> 
> Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
>>> On 2023-10-25 04:16, Martin Uecker wrote:
>>> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
 
> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> 
> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>> Hi, Sid,
>> 
>> Really appreciate for your example and detailed explanation. Very 
>> helpful.
>> I think that this example is an excellent example to show (almost) all 
>> the issues we need to consider.
>> 
>> I slightly modified this example to make it to be compilable and 
>> run-able, as following:
>> (but I still cannot make the incorrect reordering or DSE happening, 
>> anyway, the potential reordering possibility is there…)
>> 
>>  1 #include 
>>  2 struct A
>>  3 {
>>  4  size_t size;
>>  5  char buf[] __attribute__((counted_by(size)));
>>  6 };
>>  7
>>  8 static size_t
>>  9 get_size_from (void *ptr)
>> 10 {
>> 11  return __builtin_dynamic_object_size (ptr, 1);
>> 12 }
>> 13
>> 14 void
>> 15 foo (size_t sz)
>> 16 {
>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
>> sizeof(char));
>> 18  obj->size = sz;
>> 19  obj->buf[0] = 2;
>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>> 21  return;
>> 22 }
>> 23
>> 24 int main ()
>> 25 {
>> 26  foo (20);
>> 27  return 0;
>> 28 }
>> 
>> 
>> 
>> 
 When it’s set I suppose.  Turn
 
 X.l = n;
 
 Into
 
 X.l = __builtin_with_size (x.buf, n);
>>> 
>>> It would turn
>>> 
>>> some_variable = (&) x.buf
>>> 
>>> into
>>> 
>>> some_variable = __builtin_with_size ( (&) x.buf. x.len)
>>> 
>>> 
>>> So the later access to x.buf and not the initialization
>>> of a member of the struct (which is too early).
>>> 
>> 
>> Hmm, so with Qing's example above, are you suggesting the transformation 
>> be to foo like so:
>> 
>> 14 void
>> 15 foo (size_t sz)
>> 16 {
>> 16.5  void * _1;
>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>> 18  obj->size = sz;
>> 19  obj->buf[0] = 2;
>> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
>> 20  __builtin_printf (“%d\n", get_size_from (_1));
>> 21  return;
>> 22 }
>> 
>> If yes then this could indeed work.  I think I got thrown off by the 
>> reference to __bdos.
> 
> Yes. I think it is important not to evaluate the size at the
> access to buf and not the allocation, because the point is to 
> recover it from the size member even when the compiler can't 
> see the original allocation.

But if the access is through a pointer without the attribute visible even the 
Frontend cannot recover?  We’d need to force type correctness and give up on 
indirecting through an int * when it can refer to two diffenent container 
types.  The best we can do I think is mark allocation sites and hope for some 
basic code hygiene (not clobbering size or array pointer through pointers 
without the appropriately attributed type)

> Evaluating at this point requires that the size is correctly set
> before the access to the FAM and the user has to make sure 
> this is the case. But to me this requirement would make sense.
> 
> Semantically, it could aöso make sense to evaluate the size at a
> later time.  But then the reordering becomes problematic again.
> 
> Also I think this would make this feature generally more useful.
> For example, it could work also for others pointers in the struct
> and not just for FAMs.  In this case, the struct may already be
> freed when  BDOS is called, so it might also not possible to
> access the size member at a later time.
> 
> Martin
> 
> 
>> 
> 


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Martin Uecker
Am Mittwoch, dem 25.10.2023 um 06:25 -0400 schrieb Siddhesh Poyarekar:
> On 2023-10-25 04:16, Martin Uecker wrote:
> > Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> > > 
> > > > Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> > > > 
> > > > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> > > > > Hi, Sid,
> > > > > 
> > > > > Really appreciate for your example and detailed explanation. Very 
> > > > > helpful.
> > > > > I think that this example is an excellent example to show (almost) 
> > > > > all the issues we need to consider.
> > > > > 
> > > > > I slightly modified this example to make it to be compilable and 
> > > > > run-able, as following:
> > > > > (but I still cannot make the incorrect reordering or DSE happening, 
> > > > > anyway, the potential reordering possibility is there…)
> > > > > 
> > > > >   1 #include 
> > > > >   2 struct A
> > > > >   3 {
> > > > >   4  size_t size;
> > > > >   5  char buf[] __attribute__((counted_by(size)));
> > > > >   6 };
> > > > >   7
> > > > >   8 static size_t
> > > > >   9 get_size_from (void *ptr)
> > > > > 10 {
> > > > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > > > 12 }
> > > > > 13
> > > > > 14 void
> > > > > 15 foo (size_t sz)
> > > > > 16 {
> > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > sizeof(char));
> > > > > 18  obj->size = sz;
> > > > > 19  obj->buf[0] = 2;
> > > > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > > > 21  return;
> > > > > 22 }
> > > > > 23
> > > > > 24 int main ()
> > > > > 25 {
> > > > > 26  foo (20);
> > > > > 27  return 0;
> > > > > 28 }
> > > > > 
> 
> 
> 
> > > When it’s set I suppose.  Turn
> > > 
> > > X.l = n;
> > > 
> > > Into
> > > 
> > > X.l = __builtin_with_size (x.buf, n);
> > 
> > It would turn
> > 
> > some_variable = (&) x.buf
> > 
> > into
> > 
> > some_variable = __builtin_with_size ( (&) x.buf. x.len)
> > 
> > 
> > So the later access to x.buf and not the initialization
> > of a member of the struct (which is too early).
> > 
> 
> Hmm, so with Qing's example above, are you suggesting the transformation 
> be to foo like so:
> 
> 14 void
> 15 foo (size_t sz)
> 16 {
> 16.5  void * _1;
> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
> 18  obj->size = sz;
> 19  obj->buf[0] = 2;
> 19.5  _1 = __builtin_with_size (obj->buf, obj->size);
> 20  __builtin_printf (“%d\n", get_size_from (_1));
> 21  return;
> 22 }
> 
> If yes then this could indeed work.  I think I got thrown off by the 
> reference to __bdos.

Yes. I think it is important not to evaluate the size at the
access to buf and not the allocation, because the point is to 
recover it from the size member even when the compiler can't 
see the original allocation.

Evaluating at this point requires that the size is correctly set
before the access to the FAM and the user has to make sure 
this is the case. But to me this requirement would make sense.

Semantically, it could aöso make sense to evaluate the size at a
later time.  But then the reordering becomes problematic again.

Also I think this would make this feature generally more useful.
For example, it could work also for others pointers in the struct
and not just for FAMs.  In this case, the struct may already be
freed when  BDOS is called, so it might also not possible to
access the size member at a later time.

Martin


> 



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Martin Uecker
Am Mittwoch, dem 25.10.2023 um 12:25 +0200 schrieb Richard Biener:
> 
> > Am 25.10.2023 um 10:16 schrieb Martin Uecker :
> > 
> > Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> > > 
> > > > > Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> > > > 
> > > > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> > > > > Hi, Sid,
> > > > > 
> > > > > Really appreciate for your example and detailed explanation. Very 
> > > > > helpful.
> > > > > I think that this example is an excellent example to show (almost) 
> > > > > all the issues we need to consider.
> > > > > 
> > > > > I slightly modified this example to make it to be compilable and 
> > > > > run-able, as following: 
> > > > > (but I still cannot make the incorrect reordering or DSE happening, 
> > > > > anyway, the potential reordering possibility is there…)
> > > > > 
> > > > > 1 #include 
> > > > > 2 struct A
> > > > > 3 {
> > > > > 4  size_t size;
> > > > > 5  char buf[] __attribute__((counted_by(size)));
> > > > > 6 };
> > > > > 7 
> > > > > 8 static size_t
> > > > > 9 get_size_from (void *ptr)
> > > > > 10 {
> > > > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > > > 12 }
> > > > > 13 
> > > > > 14 void
> > > > > 15 foo (size_t sz)
> > > > > 16 {
> > > > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > > > sizeof(char));
> > > > > 18  obj->size = sz;
> > > > > 19  obj->buf[0] = 2;
> > > > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > > > 21  return;
> > > > > 22 }
> > > > > 23 
> > > > > 24 int main ()
> > > > > 25 {
> > > > > 26  foo (20);
> > > > > 27  return 0;
> > > > > 28 }
> > > > > 
> > > > > With my GCC, it was compiled and worked:
> > > > > [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
> > > > > [opc@qinzhao-ol8u3-x86 ]$ ./a.out
> > > > > 20
> > > > > Situation 1: With O1 and above, the routine “get_size_from” was 
> > > > > inlined into “foo”, therefore, the call to __bdos is in the same 
> > > > > routine as the instantiation of the object, and the TYPE information 
> > > > > and the attached counted_by attribute information in the TYPE of the 
> > > > > object can be USED by the __bdos call to compute the final object 
> > > > > size. 
> > > > > 
> > > > > [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
> > > > > [opc@qinzhao-ol8u3-x86 ]$ ./a.out
> > > > > -1
> > > > > Situation 2: With O0, the routine “get_size_from” was NOT inlined 
> > > > > into “foo”, therefore, the call to __bdos is Not in the same routine 
> > > > > as the instantiation of the object, As a result, the TYPE info and 
> > > > > the attached counted_by info of the object can NOT be USED by the 
> > > > > __bdos call. 
> > > > > 
> > > > > Keep in mind of the above 2 situations, we will refer them in below:
> > > > > 
> > > > > 1. First,  the problem we are trying to resolve is:
> > > > > 
> > > > > (Your description):
> > > > > 
> > > > > > the reordering of __bdos w.r.t. initialization of the size 
> > > > > > parameter but to also account for DSE of the assignment, we can 
> > > > > > abstract this problem to that of DFA being unable to see implicit 
> > > > > > use of the size parameter in the __bdos call.
> > > > > 
> > > > > basically is correct.  However, with the following exception:
> > > > > 
> > > > > The implicit use of the size parameter in the __bdos call is not 
> > > > > always there, it ONLY exists WHEN the __bdos is able to evaluated to 
> > > > > an expression of the size parameter in the “objsz” phase, i.e., the 
> > > > > “Situation 1” of the above example. 
> > > > > In the “Situation 2”, when the __bdos does not see the TYPE of the 
> > > > > real object,  it does not see the counted_by information from the 
> > > > > TYPE, therefore,  it is not able to evaluate the size of the object 
> > > > > through the counted_by information.  As a result, the implicit use of 
> > > > > the size parameter in the __bdos call does NOT exist at all.  The 
> > > > > optimizer can freely reorder the initialization of the size parameter 
> > > > > with the __bdos call since there is no data flow dependency between 
> > > > > these two. 
> > > > > 
> > > > > With this exception in mind, we can see that your proposed “option 2” 
> > > > > (making the type of size “volatile”) is too conservative, it will  
> > > > > disable many optimizations  unnecessarily, even though it’s safe and 
> > > > > simple to implement. 
> > > > > 
> > > > > As a compiler optimization person for many many years, I really don’t 
> > > > > want to take this approach at this moment.  -:)
> > > > > 
> > > > > 2. Some facts I’d like to mention:
> > > > > 
> > > > > A.  The incorrect reordering (or CSE) potential ONLY exists in the 
> > > > > TREE optimization stage. During RTL stage,  the __bdos call has 
> > > > > already been replaced by an expression of the size parameter or a 
> > > > > constant, the data dependency is explicitly in the IR already.  I 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Richard Biener



> Am 25.10.2023 um 10:16 schrieb Martin Uecker :
> 
> Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
>> 
 Am 24.10.2023 um 22:38 schrieb Martin Uecker :
>>> 
>>> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
 Hi, Sid,
 
 Really appreciate for your example and detailed explanation. Very helpful.
 I think that this example is an excellent example to show (almost) all the 
 issues we need to consider.
 
 I slightly modified this example to make it to be compilable and run-able, 
 as following: 
 (but I still cannot make the incorrect reordering or DSE happening, 
 anyway, the potential reordering possibility is there…)
 
 1 #include 
 2 struct A
 3 {
 4  size_t size;
 5  char buf[] __attribute__((counted_by(size)));
 6 };
 7 
 8 static size_t
 9 get_size_from (void *ptr)
 10 {
 11  return __builtin_dynamic_object_size (ptr, 1);
 12 }
 13 
 14 void
 15 foo (size_t sz)
 16 {
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
 sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
 21  return;
 22 }
 23 
 24 int main ()
 25 {
 26  foo (20);
 27  return 0;
 28 }
 
 With my GCC, it was compiled and worked:
 [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
 [opc@qinzhao-ol8u3-x86 ]$ ./a.out
 20
 Situation 1: With O1 and above, the routine “get_size_from” was inlined 
 into “foo”, therefore, the call to __bdos is in the same routine as the 
 instantiation of the object, and the TYPE information and the attached 
 counted_by attribute information in the TYPE of the object can be USED by 
 the __bdos call to compute the final object size. 
 
 [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
 [opc@qinzhao-ol8u3-x86 ]$ ./a.out
 -1
 Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
 “foo”, therefore, the call to __bdos is Not in the same routine as the 
 instantiation of the object, As a result, the TYPE info and the attached 
 counted_by info of the object can NOT be USED by the __bdos call. 
 
 Keep in mind of the above 2 situations, we will refer them in below:
 
 1. First,  the problem we are trying to resolve is:
 
 (Your description):
 
> the reordering of __bdos w.r.t. initialization of the size parameter but 
> to also account for DSE of the assignment, we can abstract this problem 
> to that of DFA being unable to see implicit use of the size parameter in 
> the __bdos call.
 
 basically is correct.  However, with the following exception:
 
 The implicit use of the size parameter in the __bdos call is not always 
 there, it ONLY exists WHEN the __bdos is able to evaluated to an 
 expression of the size parameter in the “objsz” phase, i.e., the 
 “Situation 1” of the above example. 
 In the “Situation 2”, when the __bdos does not see the TYPE of the real 
 object,  it does not see the counted_by information from the TYPE, 
 therefore,  it is not able to evaluate the size of the object through the 
 counted_by information.  As a result, the implicit use of the size 
 parameter in the __bdos call does NOT exist at all.  The optimizer can 
 freely reorder the initialization of the size parameter with the __bdos 
 call since there is no data flow dependency between these two. 
 
 With this exception in mind, we can see that your proposed “option 2” 
 (making the type of size “volatile”) is too conservative, it will  disable 
 many optimizations  unnecessarily, even though it’s safe and simple to 
 implement. 
 
 As a compiler optimization person for many many years, I really don’t want 
 to take this approach at this moment.  -:)
 
 2. Some facts I’d like to mention:
 
 A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
 optimization stage. During RTL stage,  the __bdos call has already been 
 replaced by an expression of the size parameter or a constant, the data 
 dependency is explicitly in the IR already.  I believe that the data 
 analysis in RTL stage should pick up the data dependency correctly, No 
 special handling is needed in RTL.
 
 B. If the __bdos call cannot see the real object , it has no way to get 
 the “counted_by” field from the TYPE of the real object. So, if we try to 
 add the implicit use of the “counted_by” field to the __bdos call, the 
 object instantiation should be in the same routine as the __bdos call.  
 Both the FE and the gimplification phase are too early to do this work. 
 
 2. Then, what’s the best approach to resolve this problem:
 
 There were 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Siddhesh Poyarekar

On 2023-10-25 04:16, Martin Uecker wrote:

Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:



Am 24.10.2023 um 22:38 schrieb Martin Uecker :

Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:

Hi, Sid,

Really appreciate for your example and detailed explanation. Very helpful.
I think that this example is an excellent example to show (almost) all the 
issues we need to consider.

I slightly modified this example to make it to be compilable and run-able, as 
following:
(but I still cannot make the incorrect reordering or DSE happening, anyway, the 
potential reordering possibility is there…)

  1 #include 
  2 struct A
  3 {
  4  size_t size;
  5  char buf[] __attribute__((counted_by(size)));
  6 };
  7
  8 static size_t
  9 get_size_from (void *ptr)
10 {
11  return __builtin_dynamic_object_size (ptr, 1);
12 }
13
14 void
15 foo (size_t sz)
16 {
17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
18  obj->size = sz;
19  obj->buf[0] = 2;
20  __builtin_printf (“%d\n", get_size_from (obj->buf));
21  return;
22 }
23
24 int main ()
25 {
26  foo (20);
27  return 0;
28 }






When it’s set I suppose.  Turn

X.l = n;

Into

X.l = __builtin_with_size (x.buf, n);


It would turn

some_variable = (&) x.buf

into

some_variable = __builtin_with_size ( (&) x.buf. x.len)


So the later access to x.buf and not the initialization
of a member of the struct (which is too early).



Hmm, so with Qing's example above, are you suggesting the transformation 
be to foo like so:


14 void
15 foo (size_t sz)
16 {
16.5  void * _1;
17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
18  obj->size = sz;
19  obj->buf[0] = 2;
19.5  _1 = __builtin_with_size (obj->buf, obj->size);
20  __builtin_printf (“%d\n", get_size_from (_1));
21  return;
22 }

If yes then this could indeed work.  I think I got thrown off by the 
reference to __bdos.


Thanks,
Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Martin Uecker
Am Mittwoch, dem 25.10.2023 um 08:43 +0200 schrieb Richard Biener:
> 
> > Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> > 
> > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> > > Hi, Sid,
> > > 
> > > Really appreciate for your example and detailed explanation. Very helpful.
> > > I think that this example is an excellent example to show (almost) all 
> > > the issues we need to consider.
> > > 
> > > I slightly modified this example to make it to be compilable and 
> > > run-able, as following: 
> > > (but I still cannot make the incorrect reordering or DSE happening, 
> > > anyway, the potential reordering possibility is there…)
> > > 
> > >  1 #include 
> > >  2 struct A
> > >  3 {
> > >  4  size_t size;
> > >  5  char buf[] __attribute__((counted_by(size)));
> > >  6 };
> > >  7 
> > >  8 static size_t
> > >  9 get_size_from (void *ptr)
> > > 10 {
> > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > 12 }
> > > 13 
> > > 14 void
> > > 15 foo (size_t sz)
> > > 16 {
> > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > sizeof(char));
> > > 18  obj->size = sz;
> > > 19  obj->buf[0] = 2;
> > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > 21  return;
> > > 22 }
> > > 23 
> > > 24 int main ()
> > > 25 {
> > > 26  foo (20);
> > > 27  return 0;
> > > 28 }
> > > 
> > > With my GCC, it was compiled and worked:
> > > [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
> > > [opc@qinzhao-ol8u3-x86 ]$ ./a.out
> > > 20
> > > Situation 1: With O1 and above, the routine “get_size_from” was inlined 
> > > into “foo”, therefore, the call to __bdos is in the same routine as the 
> > > instantiation of the object, and the TYPE information and the attached 
> > > counted_by attribute information in the TYPE of the object can be USED by 
> > > the __bdos call to compute the final object size. 
> > > 
> > > [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
> > > [opc@qinzhao-ol8u3-x86 ]$ ./a.out
> > > -1
> > > Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
> > > “foo”, therefore, the call to __bdos is Not in the same routine as the 
> > > instantiation of the object, As a result, the TYPE info and the attached 
> > > counted_by info of the object can NOT be USED by the __bdos call. 
> > > 
> > > Keep in mind of the above 2 situations, we will refer them in below:
> > > 
> > > 1. First,  the problem we are trying to resolve is:
> > > 
> > > (Your description):
> > > 
> > > > the reordering of __bdos w.r.t. initialization of the size parameter 
> > > > but to also account for DSE of the assignment, we can abstract this 
> > > > problem to that of DFA being unable to see implicit use of the size 
> > > > parameter in the __bdos call.
> > > 
> > > basically is correct.  However, with the following exception:
> > > 
> > > The implicit use of the size parameter in the __bdos call is not always 
> > > there, it ONLY exists WHEN the __bdos is able to evaluated to an 
> > > expression of the size parameter in the “objsz” phase, i.e., the 
> > > “Situation 1” of the above example. 
> > > In the “Situation 2”, when the __bdos does not see the TYPE of the real 
> > > object,  it does not see the counted_by information from the TYPE, 
> > > therefore,  it is not able to evaluate the size of the object through the 
> > > counted_by information.  As a result, the implicit use of the size 
> > > parameter in the __bdos call does NOT exist at all.  The optimizer can 
> > > freely reorder the initialization of the size parameter with the __bdos 
> > > call since there is no data flow dependency between these two. 
> > > 
> > > With this exception in mind, we can see that your proposed “option 2” 
> > > (making the type of size “volatile”) is too conservative, it will  
> > > disable many optimizations  unnecessarily, even though it’s safe and 
> > > simple to implement. 
> > > 
> > > As a compiler optimization person for many many years, I really don’t 
> > > want to take this approach at this moment.  -:)
> > > 
> > > 2. Some facts I’d like to mention:
> > > 
> > > A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
> > > optimization stage. During RTL stage,  the __bdos call has already been 
> > > replaced by an expression of the size parameter or a constant, the data 
> > > dependency is explicitly in the IR already.  I believe that the data 
> > > analysis in RTL stage should pick up the data dependency correctly, No 
> > > special handling is needed in RTL.
> > > 
> > > B. If the __bdos call cannot see the real object , it has no way to get 
> > > the “counted_by” field from the TYPE of the real object. So, if we try to 
> > > add the implicit use of the “counted_by” field to the __bdos call, the 
> > > object instantiation should be in the same routine as the __bdos call.  
> > > Both the FE and the gimplification phase are too early to do this work. 
> > > 
> > > 2. Then, what’s the 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-25 Thread Richard Biener



> Am 24.10.2023 um 22:38 schrieb Martin Uecker :
> 
> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>> Hi, Sid,
>> 
>> Really appreciate for your example and detailed explanation. Very helpful.
>> I think that this example is an excellent example to show (almost) all the 
>> issues we need to consider.
>> 
>> I slightly modified this example to make it to be compilable and run-able, 
>> as following: 
>> (but I still cannot make the incorrect reordering or DSE happening, anyway, 
>> the potential reordering possibility is there…)
>> 
>>  1 #include 
>>  2 struct A
>>  3 {
>>  4  size_t size;
>>  5  char buf[] __attribute__((counted_by(size)));
>>  6 };
>>  7 
>>  8 static size_t
>>  9 get_size_from (void *ptr)
>> 10 {
>> 11  return __builtin_dynamic_object_size (ptr, 1);
>> 12 }
>> 13 
>> 14 void
>> 15 foo (size_t sz)
>> 16 {
>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>> 18  obj->size = sz;
>> 19  obj->buf[0] = 2;
>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>> 21  return;
>> 22 }
>> 23 
>> 24 int main ()
>> 25 {
>> 26  foo (20);
>> 27  return 0;
>> 28 }
>> 
>> With my GCC, it was compiled and worked:
>> [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>> 20
>> Situation 1: With O1 and above, the routine “get_size_from” was inlined into 
>> “foo”, therefore, the call to __bdos is in the same routine as the 
>> instantiation of the object, and the TYPE information and the attached 
>> counted_by attribute information in the TYPE of the object can be USED by 
>> the __bdos call to compute the final object size. 
>> 
>> [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>> -1
>> Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
>> “foo”, therefore, the call to __bdos is Not in the same routine as the 
>> instantiation of the object, As a result, the TYPE info and the attached 
>> counted_by info of the object can NOT be USED by the __bdos call. 
>> 
>> Keep in mind of the above 2 situations, we will refer them in below:
>> 
>> 1. First,  the problem we are trying to resolve is:
>> 
>> (Your description):
>> 
>>> the reordering of __bdos w.r.t. initialization of the size parameter but to 
>>> also account for DSE of the assignment, we can abstract this problem to 
>>> that of DFA being unable to see implicit use of the size parameter in the 
>>> __bdos call.
>> 
>> basically is correct.  However, with the following exception:
>> 
>> The implicit use of the size parameter in the __bdos call is not always 
>> there, it ONLY exists WHEN the __bdos is able to evaluated to an expression 
>> of the size parameter in the “objsz” phase, i.e., the “Situation 1” of the 
>> above example. 
>> In the “Situation 2”, when the __bdos does not see the TYPE of the real 
>> object,  it does not see the counted_by information from the TYPE, 
>> therefore,  it is not able to evaluate the size of the object through the 
>> counted_by information.  As a result, the implicit use of the size parameter 
>> in the __bdos call does NOT exist at all.  The optimizer can freely reorder 
>> the initialization of the size parameter with the __bdos call since there is 
>> no data flow dependency between these two. 
>> 
>> With this exception in mind, we can see that your proposed “option 2” 
>> (making the type of size “volatile”) is too conservative, it will  disable 
>> many optimizations  unnecessarily, even though it’s safe and simple to 
>> implement. 
>> 
>> As a compiler optimization person for many many years, I really don’t want 
>> to take this approach at this moment.  -:)
>> 
>> 2. Some facts I’d like to mention:
>> 
>> A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
>> optimization stage. During RTL stage,  the __bdos call has already been 
>> replaced by an expression of the size parameter or a constant, the data 
>> dependency is explicitly in the IR already.  I believe that the data 
>> analysis in RTL stage should pick up the data dependency correctly, No 
>> special handling is needed in RTL.
>> 
>> B. If the __bdos call cannot see the real object , it has no way to get the 
>> “counted_by” field from the TYPE of the real object. So, if we try to add 
>> the implicit use of the “counted_by” field to the __bdos call, the object 
>> instantiation should be in the same routine as the __bdos call.  Both the FE 
>> and the gimplification phase are too early to do this work. 
>> 
>> 2. Then, what’s the best approach to resolve this problem:
>> 
>> There were several suggestions so far:
>> 
>> A.  Add an additional argument, the size parameter,  to __bdos, 
>>  A.1, during FE;
>>  A.2, during gimplification phase;
>> B.  Encode the implicit USE  in the type of size, to make the size 
>> “volatile”;
>> C.  Encode the implicit USE  in the type of buf, then update the 
>> optimization passes to 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Martin Uecker
Am Dienstag, dem 24.10.2023 um 22:51 + schrieb Qing Zhao:
> 
> > On Oct 24, 2023, at 4:38 PM, Martin Uecker  wrote:
> > 
> > Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> > > Hi, Sid,
> > > 
> > > Really appreciate for your example and detailed explanation. Very helpful.
> > > I think that this example is an excellent example to show (almost) all 
> > > the issues we need to consider.
> > > 
> > > I slightly modified this example to make it to be compilable and 
> > > run-able, as following: 
> > > (but I still cannot make the incorrect reordering or DSE happening, 
> > > anyway, the potential reordering possibility is there…)
> > > 
> > >  1 #include 
> > >  2 struct A
> > >  3 {
> > >  4  size_t size;
> > >  5  char buf[] __attribute__((counted_by(size)));
> > >  6 };
> > >  7 
> > >  8 static size_t
> > >  9 get_size_from (void *ptr)
> > > 10 {
> > > 11  return __builtin_dynamic_object_size (ptr, 1);
> > > 12 }
> > > 13 
> > > 14 void
> > > 15 foo (size_t sz)
> > > 16 {
> > > 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * 
> > > sizeof(char));
> > > 18  obj->size = sz;
> > > 19  obj->buf[0] = 2;
> > > 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
> > > 21  return;
> > > 22 }
> > > 23 
> > > 24 int main ()
> > > 25 {
> > > 26  foo (20);
> > > 27  return 0;
> > > 28 }
> > > 
> > > With my GCC, it was compiled and worked:
> > > [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
> > > [opc@qinzhao-ol8u3-x86 ]$ ./a.out
> > > 20
> > > Situation 1: With O1 and above, the routine “get_size_from” was inlined 
> > > into “foo”, therefore, the call to __bdos is in the same routine as the 
> > > instantiation of the object, and the TYPE information and the attached 
> > > counted_by attribute information in the TYPE of the object can be USED by 
> > > the __bdos call to compute the final object size. 
> > > 
> > > [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
> > > [opc@qinzhao-ol8u3-x86 ]$ ./a.out
> > > -1
> > > Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
> > > “foo”, therefore, the call to __bdos is Not in the same routine as the 
> > > instantiation of the object, As a result, the TYPE info and the attached 
> > > counted_by info of the object can NOT be USED by the __bdos call. 
> > > 
> > > Keep in mind of the above 2 situations, we will refer them in below:
> > > 
> > > 1. First,  the problem we are trying to resolve is:
> > > 
> > > (Your description):
> > > 
> > > > the reordering of __bdos w.r.t. initialization of the size parameter 
> > > > but to also account for DSE of the assignment, we can abstract this 
> > > > problem to that of DFA being unable to see implicit use of the size 
> > > > parameter in the __bdos call.
> > > 
> > > basically is correct.  However, with the following exception:
> > > 
> > > The implicit use of the size parameter in the __bdos call is not always 
> > > there, it ONLY exists WHEN the __bdos is able to evaluated to an 
> > > expression of the size parameter in the “objsz” phase, i.e., the 
> > > “Situation 1” of the above example. 
> > > In the “Situation 2”, when the __bdos does not see the TYPE of the real 
> > > object,  it does not see the counted_by information from the TYPE, 
> > > therefore,  it is not able to evaluate the size of the object through the 
> > > counted_by information.  As a result, the implicit use of the size 
> > > parameter in the __bdos call does NOT exist at all.  The optimizer can 
> > > freely reorder the initialization of the size parameter with the __bdos 
> > > call since there is no data flow dependency between these two. 
> > > 
> > > With this exception in mind, we can see that your proposed “option 2” 
> > > (making the type of size “volatile”) is too conservative, it will  
> > > disable many optimizations  unnecessarily, even though it’s safe and 
> > > simple to implement. 
> > > 
> > > As a compiler optimization person for many many years, I really don’t 
> > > want to take this approach at this moment.  -:)
> > > 
> > > 2. Some facts I’d like to mention:
> > > 
> > > A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
> > > optimization stage. During RTL stage,  the __bdos call has already been 
> > > replaced by an expression of the size parameter or a constant, the data 
> > > dependency is explicitly in the IR already.  I believe that the data 
> > > analysis in RTL stage should pick up the data dependency correctly, No 
> > > special handling is needed in RTL.
> > > 
> > > B. If the __bdos call cannot see the real object , it has no way to get 
> > > the “counted_by” field from the TYPE of the real object. So, if we try to 
> > > add the implicit use of the “counted_by” field to the __bdos call, the 
> > > object instantiation should be in the same routine as the __bdos call.  
> > > Both the FE and the gimplification phase are too early to do this work. 
> > > 
> > > 2. Then, what’s the 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Siddhesh Poyarekar

On 2023-10-24 18:51, Qing Zhao wrote:

Thanks for the proposal!

So what you suggested is:

For every x.buf,  change it as a __builtin_with_size(x.buf, x.L) in the FE, 
then the call to the _bdos (x.buf, 1) will
Become:

_bdos(__builtin_with_size(x.buf, x.L), 1)?

Then the implicit use of x.L in _bdos(x.buf.1) will become explicit?


Oops, I think Martin and I fell off-list in a subthread.  I clarified 
that my comment was that any such annotation at object reference is 
probably too late and hence not the right place for it; basically it has 
the same problems as the option A in your comment.  A better place to 
reinforce such a relationship would be the allocation+initialization 
site instead.


Thanks,
Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Siddhesh Poyarekar

On 2023-10-24 18:41, Qing Zhao wrote:




On Oct 24, 2023, at 5:03 PM, Siddhesh Poyarekar  wrote:

On 2023-10-24 16:30, Qing Zhao wrote:

Situation 2: With O0, the routine “get_size_from” was NOT inlined into “foo”, 
therefore, the call to __bdos is Not in the same routine as the instantiation 
of the object, As a result, the TYPE info and the attached counted_by info of 
the object can NOT be USED by the __bdos call.


But __bos/__bdos are barely useful without optimization; you need a minimum of 
-O1.  You're right that if the call is never inlined then we don't care because 
the __bdos call does not get expanded to obj->size.

However, the point of situation 2 is that the TYPE info cannot be used by the 
__bdos call *only for a while* (i.e. until the call gets inlined) and that 
window is an opportunity for the reordering/DSE to break things.


The main point of situation 2 I tried made: there are situations where 
obj->size is not used at all by the __bdos, marking it as volatile is too 
conservative, unnecessarily prevent useful optimizations from happening.  -:)


Yes, that's the tradeoff.  However, maybe this is the point where Kees 
jumps in and say the kernel doesn't really care as much or something 
like that :)


Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao


> On Oct 24, 2023, at 4:38 PM, Martin Uecker  wrote:
> 
> Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
>> Hi, Sid,
>> 
>> Really appreciate for your example and detailed explanation. Very helpful.
>> I think that this example is an excellent example to show (almost) all the 
>> issues we need to consider.
>> 
>> I slightly modified this example to make it to be compilable and run-able, 
>> as following: 
>> (but I still cannot make the incorrect reordering or DSE happening, anyway, 
>> the potential reordering possibility is there…)
>> 
>>  1 #include 
>>  2 struct A
>>  3 {
>>  4  size_t size;
>>  5  char buf[] __attribute__((counted_by(size)));
>>  6 };
>>  7 
>>  8 static size_t
>>  9 get_size_from (void *ptr)
>> 10 {
>> 11  return __builtin_dynamic_object_size (ptr, 1);
>> 12 }
>> 13 
>> 14 void
>> 15 foo (size_t sz)
>> 16 {
>> 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>> 18  obj->size = sz;
>> 19  obj->buf[0] = 2;
>> 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>> 21  return;
>> 22 }
>> 23 
>> 24 int main ()
>> 25 {
>> 26  foo (20);
>> 27  return 0;
>> 28 }
>> 
>> With my GCC, it was compiled and worked:
>> [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>> 20
>> Situation 1: With O1 and above, the routine “get_size_from” was inlined into 
>> “foo”, therefore, the call to __bdos is in the same routine as the 
>> instantiation of the object, and the TYPE information and the attached 
>> counted_by attribute information in the TYPE of the object can be USED by 
>> the __bdos call to compute the final object size. 
>> 
>> [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
>> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
>> -1
>> Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
>> “foo”, therefore, the call to __bdos is Not in the same routine as the 
>> instantiation of the object, As a result, the TYPE info and the attached 
>> counted_by info of the object can NOT be USED by the __bdos call. 
>> 
>> Keep in mind of the above 2 situations, we will refer them in below:
>> 
>> 1. First,  the problem we are trying to resolve is:
>> 
>> (Your description):
>> 
>>> the reordering of __bdos w.r.t. initialization of the size parameter but to 
>>> also account for DSE of the assignment, we can abstract this problem to 
>>> that of DFA being unable to see implicit use of the size parameter in the 
>>> __bdos call.
>> 
>> basically is correct.  However, with the following exception:
>> 
>> The implicit use of the size parameter in the __bdos call is not always 
>> there, it ONLY exists WHEN the __bdos is able to evaluated to an expression 
>> of the size parameter in the “objsz” phase, i.e., the “Situation 1” of the 
>> above example. 
>> In the “Situation 2”, when the __bdos does not see the TYPE of the real 
>> object,  it does not see the counted_by information from the TYPE, 
>> therefore,  it is not able to evaluate the size of the object through the 
>> counted_by information.  As a result, the implicit use of the size parameter 
>> in the __bdos call does NOT exist at all.  The optimizer can freely reorder 
>> the initialization of the size parameter with the __bdos call since there is 
>> no data flow dependency between these two. 
>> 
>> With this exception in mind, we can see that your proposed “option 2” 
>> (making the type of size “volatile”) is too conservative, it will  disable 
>> many optimizations  unnecessarily, even though it’s safe and simple to 
>> implement. 
>> 
>> As a compiler optimization person for many many years, I really don’t want 
>> to take this approach at this moment.  -:)
>> 
>> 2. Some facts I’d like to mention:
>> 
>> A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
>> optimization stage. During RTL stage,  the __bdos call has already been 
>> replaced by an expression of the size parameter or a constant, the data 
>> dependency is explicitly in the IR already.  I believe that the data 
>> analysis in RTL stage should pick up the data dependency correctly, No 
>> special handling is needed in RTL.
>> 
>> B. If the __bdos call cannot see the real object , it has no way to get the 
>> “counted_by” field from the TYPE of the real object. So, if we try to add 
>> the implicit use of the “counted_by” field to the __bdos call, the object 
>> instantiation should be in the same routine as the __bdos call.  Both the FE 
>> and the gimplification phase are too early to do this work. 
>> 
>> 2. Then, what’s the best approach to resolve this problem:
>> 
>> There were several suggestions so far:
>> 
>> A.  Add an additional argument, the size parameter,  to __bdos, 
>>  A.1, during FE;
>>  A.2, during gimplification phase;
>> B.  Encode the implicit USE  in the type of size, to make the size 
>> “volatile”;
>> C.  Encode the implicit USE  in the type of buf, then update the 
>> optimization passes 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao


> On Oct 24, 2023, at 5:03 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-24 16:30, Qing Zhao wrote:
>> Situation 2: With O0, the routine “get_size_from” was NOT inlined into 
>> “foo”, therefore, the call to __bdos is Not in the same routine as the 
>> instantiation of the object, As a result, the TYPE info and the attached 
>> counted_by info of the object can NOT be USED by the __bdos call.
> 
> But __bos/__bdos are barely useful without optimization; you need a minimum 
> of -O1.  You're right that if the call is never inlined then we don't care 
> because the __bdos call does not get expanded to obj->size.
> 
> However, the point of situation 2 is that the TYPE info cannot be used by the 
> __bdos call *only for a while* (i.e. until the call gets inlined) and that 
> window is an opportunity for the reordering/DSE to break things.

The main point of situation 2 I tried made: there are situations where 
obj->size is not used at all by the __bdos, marking it as volatile is too 
conservative, unnecessarily prevent useful optimizations from happening.  -:)

Qing
> 
> Thanks.
> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Siddhesh Poyarekar

On 2023-10-24 16:38, Martin Uecker wrote:

Here is another proposal:  Add a new builtin function

__builtin_with_size(x, size)

that return x but behaves similar to an allocation
function in that BDOS can look at the size argument
to discover the size.

The FE insers this function when the field is accessed:

__builtin_with_size(x.buf, x.L);



In fact if we do this at the allocation site for x, it may also help 
with future warnings, where the compiler could flag a warning or error 
when it encounters this builtin but does not see an assignment to x.L.


Thanks,
Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Siddhesh Poyarekar

On 2023-10-24 16:30, Qing Zhao wrote:

Situation 2: With O0, the routine “get_size_from” was NOT inlined into “foo”, 
therefore, the call to __bdos is Not in the same routine as the instantiation 
of the object, As a result, the TYPE info and the attached counted_by info of 
the object can NOT be USED by the __bdos call.



But __bos/__bdos are barely useful without optimization; you need a 
minimum of -O1.  You're right that if the call is never inlined then we 
don't care because the __bdos call does not get expanded to obj->size.


However, the point of situation 2 is that the TYPE info cannot be used 
by the __bdos call *only for a while* (i.e. until the call gets inlined) 
and that window is an opportunity for the reordering/DSE to break things.


Thanks.
Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Martin Uecker
Am Dienstag, dem 24.10.2023 um 20:30 + schrieb Qing Zhao:
> Hi, Sid,
> 
> Really appreciate for your example and detailed explanation. Very helpful.
> I think that this example is an excellent example to show (almost) all the 
> issues we need to consider.
> 
> I slightly modified this example to make it to be compilable and run-able, as 
> following: 
> (but I still cannot make the incorrect reordering or DSE happening, anyway, 
> the potential reordering possibility is there…)
> 
>   1 #include 
>   2 struct A
>   3 {
>   4  size_t size;
>   5  char buf[] __attribute__((counted_by(size)));
>   6 };
>   7 
>   8 static size_t
>   9 get_size_from (void *ptr)
>  10 {
>  11  return __builtin_dynamic_object_size (ptr, 1);
>  12 }
>  13 
>  14 void
>  15 foo (size_t sz)
>  16 {
>  17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
>  18  obj->size = sz;
>  19  obj->buf[0] = 2;
>  20  __builtin_printf (“%d\n", get_size_from (obj->buf));
>  21  return;
>  22 }
>  23 
>  24 int main ()
>  25 {
>  26  foo (20);
>  27  return 0;
>  28 }
> 
> With my GCC, it was compiled and worked:
> [opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
> 20
> Situation 1: With O1 and above, the routine “get_size_from” was inlined into 
> “foo”, therefore, the call to __bdos is in the same routine as the 
> instantiation of the object, and the TYPE information and the attached 
> counted_by attribute information in the TYPE of the object can be USED by the 
> __bdos call to compute the final object size. 
> 
> [opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
> [opc@qinzhao-ol8u3-x86 ]$ ./a.out
> -1
> Situation 2: With O0, the routine “get_size_from” was NOT inlined into “foo”, 
> therefore, the call to __bdos is Not in the same routine as the instantiation 
> of the object, As a result, the TYPE info and the attached counted_by info of 
> the object can NOT be USED by the __bdos call. 
> 
> Keep in mind of the above 2 situations, we will refer them in below:
> 
> 1. First,  the problem we are trying to resolve is:
> 
> (Your description):
> 
> >  the reordering of __bdos w.r.t. initialization of the size parameter but 
> > to also account for DSE of the assignment, we can abstract this problem to 
> > that of DFA being unable to see implicit use of the size parameter in the 
> > __bdos call.
> 
> basically is correct.  However, with the following exception:
> 
> The implicit use of the size parameter in the __bdos call is not always 
> there, it ONLY exists WHEN the __bdos is able to evaluated to an expression 
> of the size parameter in the “objsz” phase, i.e., the “Situation 1” of the 
> above example. 
>  In the “Situation 2”, when the __bdos does not see the TYPE of the real 
> object,  it does not see the counted_by information from the TYPE, therefore, 
>  it is not able to evaluate the size of the object through the counted_by 
> information.  As a result, the implicit use of the size parameter in the 
> __bdos call does NOT exist at all.  The optimizer can freely reorder the 
> initialization of the size parameter with the __bdos call since there is no 
> data flow dependency between these two. 
> 
> With this exception in mind, we can see that your proposed “option 2” (making 
> the type of size “volatile”) is too conservative, it will  disable many 
> optimizations  unnecessarily, even though it’s safe and simple to implement. 
> 
> As a compiler optimization person for many many years, I really don’t want to 
> take this approach at this moment.  -:)
> 
> 2. Some facts I’d like to mention:
> 
> A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
> optimization stage. During RTL stage,  the __bdos call has already been 
> replaced by an expression of the size parameter or a constant, the data 
> dependency is explicitly in the IR already.  I believe that the data analysis 
> in RTL stage should pick up the data dependency correctly, No special 
> handling is needed in RTL.
> 
> B. If the __bdos call cannot see the real object , it has no way to get the 
> “counted_by” field from the TYPE of the real object. So, if we try to add the 
> implicit use of the “counted_by” field to the __bdos call, the object 
> instantiation should be in the same routine as the __bdos call.  Both the FE 
> and the gimplification phase are too early to do this work. 
> 
> 2. Then, what’s the best approach to resolve this problem:
> 
> There were several suggestions so far:
> 
> A.  Add an additional argument, the size parameter,  to __bdos, 
>   A.1, during FE;
>   A.2, during gimplification phase;
> B.  Encode the implicit USE  in the type of size, to make the size “volatile”;
> C.  Encode the implicit USE  in the type of buf, then update the optimization 
> passes to use this implicit USE encoded in the type of buf.
> 
> As I explained in the above, 
> ** Approach A (both A.1 and A.2) does not work;
> ** 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-24 Thread Qing Zhao
Hi, Sid,

Really appreciate for your example and detailed explanation. Very helpful.
I think that this example is an excellent example to show (almost) all the 
issues we need to consider.

I slightly modified this example to make it to be compilable and run-able, as 
following: 
(but I still cannot make the incorrect reordering or DSE happening, anyway, the 
potential reordering possibility is there…)

  1 #include 
  2 struct A
  3 {
  4  size_t size;
  5  char buf[] __attribute__((counted_by(size)));
  6 };
  7 
  8 static size_t
  9 get_size_from (void *ptr)
 10 {
 11  return __builtin_dynamic_object_size (ptr, 1);
 12 }
 13 
 14 void
 15 foo (size_t sz)
 16 {
 17  struct A *obj = __builtin_malloc (sizeof(struct A) + sz * sizeof(char));
 18  obj->size = sz;
 19  obj->buf[0] = 2;
 20  __builtin_printf (“%d\n", get_size_from (obj->buf));
 21  return;
 22 }
 23 
 24 int main ()
 25 {
 26  foo (20);
 27  return 0;
 28 }

With my GCC, it was compiled and worked:
[opc@qinzhao-ol8u3-x86 ]$  /home/opc/Install/latest-d/bin/gcc -O1 t5.c
[opc@qinzhao-ol8u3-x86 ]$ ./a.out
20
Situation 1: With O1 and above, the routine “get_size_from” was inlined into 
“foo”, therefore, the call to __bdos is in the same routine as the 
instantiation of the object, and the TYPE information and the attached 
counted_by attribute information in the TYPE of the object can be USED by the 
__bdos call to compute the final object size. 

[opc@qinzhao-ol8u3-x86]$  /home/opc/Install/latest-d/bin/gcc -O0  t5.c
[opc@qinzhao-ol8u3-x86 ]$ ./a.out
-1
Situation 2: With O0, the routine “get_size_from” was NOT inlined into “foo”, 
therefore, the call to __bdos is Not in the same routine as the instantiation 
of the object, As a result, the TYPE info and the attached counted_by info of 
the object can NOT be USED by the __bdos call. 

Keep in mind of the above 2 situations, we will refer them in below:

1. First,  the problem we are trying to resolve is:

(Your description):

>  the reordering of __bdos w.r.t. initialization of the size parameter but to 
> also account for DSE of the assignment, we can abstract this problem to that 
> of DFA being unable to see implicit use of the size parameter in the __bdos 
> call.

basically is correct.  However, with the following exception:

The implicit use of the size parameter in the __bdos call is not always there, 
it ONLY exists WHEN the __bdos is able to evaluated to an expression of the 
size parameter in the “objsz” phase, i.e., the “Situation 1” of the above 
example. 
 In the “Situation 2”, when the __bdos does not see the TYPE of the real 
object,  it does not see the counted_by information from the TYPE, therefore,  
it is not able to evaluate the size of the object through the counted_by 
information.  As a result, the implicit use of the size parameter in the __bdos 
call does NOT exist at all.  The optimizer can freely reorder the 
initialization of the size parameter with the __bdos call since there is no 
data flow dependency between these two. 

With this exception in mind, we can see that your proposed “option 2” (making 
the type of size “volatile”) is too conservative, it will  disable many 
optimizations  unnecessarily, even though it’s safe and simple to implement. 

As a compiler optimization person for many many years, I really don’t want to 
take this approach at this moment.  -:)

2. Some facts I’d like to mention:

A.  The incorrect reordering (or CSE) potential ONLY exists in the TREE 
optimization stage. During RTL stage,  the __bdos call has already been 
replaced by an expression of the size parameter or a constant, the data 
dependency is explicitly in the IR already.  I believe that the data analysis 
in RTL stage should pick up the data dependency correctly, No special handling 
is needed in RTL.

B. If the __bdos call cannot see the real object , it has no way to get the 
“counted_by” field from the TYPE of the real object. So, if we try to add the 
implicit use of the “counted_by” field to the __bdos call, the object 
instantiation should be in the same routine as the __bdos call.  Both the FE 
and the gimplification phase are too early to do this work. 

2. Then, what’s the best approach to resolve this problem:

There were several suggestions so far:

A.  Add an additional argument, the size parameter,  to __bdos, 
  A.1, during FE;
  A.2, during gimplification phase;
B.  Encode the implicit USE  in the type of size, to make the size “volatile”;
C.  Encode the implicit USE  in the type of buf, then update the optimization 
passes to use this implicit USE encoded in the type of buf.

As I explained in the above, 
** Approach A (both A.1 and A.2) does not work;
** Approach B will have big performance impact, I’d prefer not to take this 
approach at this moment.
** Approach C will be a lot of change in GCC, and also not very necessary since 
the ONLY implicit use of the size parameter is in the __bdos call when __bdos 
can see the real object.

So, all the above 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Siddhesh Poyarekar

On 2023-10-23 15:43, Qing Zhao wrote:




On Oct 23, 2023, at 2:43 PM, Siddhesh Poyarekar  wrote:

On 2023-10-23 14:06, Martin Uecker wrote:

We should aim for a good integration with the BDOS pass, so
that it can propagate the information further, e.g. the
following should work:
struct { int L; char buf[] __counted_by(L) } x;
x.L = N;
x.buf = ...;
char *p = >f;
__bdos(p) -> N
So we need to be smart on how we provide the size
information for x->f to the backend.
This would also be desirable for the language extension.


This is essentially why there need to be frontend rules constraining reordering 
and reachability semantics of x.L, thus restricting DSE and reordering for it.


My understanding is that Restricting DSE and reordering should be done by the 
proper data flow information, with a new argument added to the BDOS call, this 
correct data flow information could be maintained, and then the DSE and 
reordering will not happen.

I don’t quite understand what kind of frontend rules should be added to 
constrain reordering and reachability semantics? Can you explain this a little 
bit more? Do you mean to add some rules or requirment to the new attribute that 
the users of the attribute should follow in the source code?


Yes, but let me try and summarize the issues and the potential solutions 
at the end:





  This is not really a __bdos/__bos question, because that bit is trivial; if 
the structure is visible, the value is simply x.L.  This is also why adding a 
reference to x.L in __bos/__bdos is not sufficient or even possible in, e.g. 
the above case you note.


I am a little confused here, are we discussing how to resolve the potential 
reordering issue of the following:

"
struct annotated {
   size_t foo;
   char array[] __attribute__((counted_by (foo)));
};

   p->foo = 10;
   size = __builtin_dynamic_object_size (p->array,1);
“?

Or a bigger issue?


Right, so the problem we're trying to solve is the reordering of __bdos 
w.r.t. initialization of the size parameter but to also account for DSE 
of the assignment, we can abstract this problem to that of DFA being 
unable to see implicit use of the size parameter.  __bdos is the one 
such implicit user of the size parameter and you're proposing to solve 
this by encoding the relationship between buffer and size at the __bdos 
call site.  But what about the case when the instantiation of the object 
is not at the same place as the __bdos call site, i.e. the DFA is unable 
to make that relationship?


The example Martin showed where the subobject gets "hidden" behind a 
pointer was a trivial one where DFA *may* actually work in practice 
(because the object-size pass can thread through these assignments) but 
think about this one:


struct A
{
  size_t size;
  char buf[] __attribute__((counted_by(size)));
}

static size_t
get_size_of (void *ptr)
{
  return __bdos (ptr, 1);
}

void
foo (size_t sz)
{
  struct A *obj = __builtin_malloc (sz);
  obj->size = sz;

  ...
  __builtin_printf ("%zu\n", get_size_of (obj->array));
  ...
}

Until get_size_of is inlined, no DFA can see the __bdos call in the same 
place as the point where obj is allocated.  As a result, the assignment 
to obj->size could get reordered (or the store eliminated) w.r.t. the 
__bdos call until the inlining happens.


As a result, the relationship between buf and size established by the 
attribute needs to be encoded into the type somehow.  There are two options:


Option 1: Encode the relationship in the type of buf

This is kinda what you end up doing with component_ref_has_counted_by 
and it does show the relationship if one is looking (through that call), 
but nothing more that can be used to, e.g. prevent reordering or tell 
the optimizer that the reference to the buf member may imply a reference 
to the size member as well.  This could be remedied by somehow encoding 
the USES relationship for size into the type of buf that the 
optimization passes can see.  I feel like this may be a bit convoluted 
to specify in a future language extension in a way that will actually be 
well understood by developers, but it will likely generate faster 
runtime code.  This will also likely require a bigger change across passes.


Option 2: Encode the relationship in the type of size

The other option is to enhance the type of size somehow so that it 
discourages reordering and store elimination, basically pessimizing 
code.  I think volatile semantics might be the way to do this and may 
even be straightforward to specify in the future language extension 
given that it builds on a known language construct and is thematically 
related.  However it does pessimize output for code that implements 
__counted_by__.


Thanks,
Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 3:37 PM, Martin Uecker  wrote:
> 
> Am Montag, dem 23.10.2023 um 19:00 + schrieb Qing Zhao:
>> 
>>> On Oct 23, 2023, at 2:31 PM, Martin Uecker  wrote:
>>> 
>>> Am Montag, dem 23.10.2023 um 20:06 +0200 schrieb Martin Uecker:
 Am Montag, dem 23.10.2023 um 16:37 + schrieb Qing Zhao:
> 
>> On Oct 23, 2023, at 11:57 AM, Richard Biener 
>>  wrote:
>> 
>> 
>> 
>>> Am 23.10.2023 um 16:56 schrieb Qing Zhao :
>>> 
>>> 
>>> 
 On Oct 23, 2023, at 3:57 AM, Richard Biener 
  wrote:
 
> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  
> wrote:
> 
> 
> 
>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar 
>>  wrote:
>> 
>> On 2023-10-20 14:38, Qing Zhao wrote:
>>> How about the following:
>>> Add one more parameter to __builtin_dynamic_object_size(), i.e
>>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>>> When we see the structure field has counted_by attribute.
>> 
>> Or maybe add a barrier preventing any assignments to 
>> array_annotated->foo from being reordered below the __bdos call? 
>> Basically an __asm__ with array_annotated->foo in the clobber list 
>> ought to do it I think.
> 
> Maybe just adding the array_annotated->foo to the use list of the 
> call to __builtin_dynamic_object_size should be enough?
> 
> But I am not sure how to implement this in the TREE level, is there a 
> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
> counted_by field “array_annotated->foo” to the USE_LIST of the call 
> to __bdos?
> 
> This might be the simplest solution?
 
 If the dynamic object size is derived of a field then I think you need 
 to
 put the "load" of that memory location at the point (as argument)
 of the __bos call right at parsing time.  I know that's awkward because
 you try to play tricks "discovering" that field only late, but that's 
 not
 going to work.
>>> 
>>> Is it better to do this at gimplification phase instead of FE? 
>>> 
>>> VLA decls are handled in gimplification phase, the size calculation and 
>>> call to alloca are all generated during this phase. (gimplify_vla_decl).
>>> 
>>> For __bdos calls, we can add an additional argument if the object’s 
>>> first argument’s type include the counted_by attribute, i.e
>>> 
>>> ***During gimplification, 
>>> For a call to __builtin_dynamic_object_size (ptr, type)
>>> Check whether the type of ptr includes counted_by attribute, if so, 
>>> change the call to
>>> __builtin_dynamic_object_size (ptr, type, counted_by field)
>>> 
>>> Then the correct data dependence should be represented well in the IR.
>>> 
>>> **During object size phase,
>>> 
>>> The call to __builtin_dynamic_object_size will become an expression 
>>> includes the counted_by field or -1/0 when we cannot decide the size, 
>>> the correct data dependence will be kept even the call to 
>>> __builtin_dynamic_object_size is gone. 
>> 
>> But the whole point of the BOS pass is to derive information that is not 
>> available at parsing time, and that’s the cases you are after.  The case 
>> where the connection to the field with the length is apparent during 
>> parsing is easy - you simply insert a load of the value before the BOS 
>> call.
> 
> Yes, this is true. 
> I prefer to implement this in gimplification phase since I am more 
> familiar with the code there.. (I think that implementing it in 
> gimplification should be very similar as implementing it in FE? Or do I 
> miss anything here?)
> 
> Joseph, if implement this in FE, where in the FE I should look at? 
> 
 
 We should aim for a good integration with the BDOS pass, so
 that it can propagate the information further, e.g. the 
 following should work:
 
 struct { int L; char buf[] __counted_by(L) } x;
 x.L = N;
 x.buf = ...;
 char *p = >f;
 __bdos(p) -> N
 
 So we need to be smart on how we provide the size
 information for x->f to the backend. 
>>> 
>>> To follow up on this. I do not think we should change the
>>> builtin in the FE or gimplification. Instead, we want 
>>> to change the field access and compute the size there. 
>> Could you please clarify on this? What do you mean by
>> "change the field access and compute the size there”?
> 
> I think the FE should essentially give the
> type
> 
> char [buf.L]
> 
> to buf.x;
> 
> If the type (or its size) could be preserved
> at this point so that it can be later
> discovered by __bdos, then it could know 
> the size and propagate it further.

Currently, we already store the 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 2:43 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-23 14:06, Martin Uecker wrote:
>> We should aim for a good integration with the BDOS pass, so
>> that it can propagate the information further, e.g. the
>> following should work:
>> struct { int L; char buf[] __counted_by(L) } x;
>> x.L = N;
>> x.buf = ...;
>> char *p = >f;
>> __bdos(p) -> N
>> So we need to be smart on how we provide the size
>> information for x->f to the backend.
>> This would also be desirable for the language extension.
> 
> This is essentially why there need to be frontend rules constraining 
> reordering and reachability semantics of x.L, thus restricting DSE and 
> reordering for it.

My understanding is that Restricting DSE and reordering should be done by the 
proper data flow information, with a new argument added to the BDOS call, this 
correct data flow information could be maintained, and then the DSE and 
reordering will not happen. 

I don’t quite understand what kind of frontend rules should be added to 
constrain reordering and reachability semantics? Can you explain this a little 
bit more? Do you mean to add some rules or requirment to the new attribute that 
the users of the attribute should follow in the source code? 

>  This is not really a __bdos/__bos question, because that bit is trivial; if 
> the structure is visible, the value is simply x.L.  This is also why adding a 
> reference to x.L in __bos/__bdos is not sufficient or even possible in, e.g. 
> the above case you note.

I am a little confused here, are we discussing how to resolve the potential 
reordering issue of the following:

"
struct annotated {
  size_t foo;
  char array[] __attribute__((counted_by (foo)));
};

  p->foo = 10;
  size = __builtin_dynamic_object_size (p->array,1);
“?

Or a bigger issue?

Qing

> 
> Thanks,
> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Martin Uecker
Am Montag, dem 23.10.2023 um 19:00 + schrieb Qing Zhao:
> 
> > On Oct 23, 2023, at 2:31 PM, Martin Uecker  wrote:
> > 
> > Am Montag, dem 23.10.2023 um 20:06 +0200 schrieb Martin Uecker:
> > > Am Montag, dem 23.10.2023 um 16:37 + schrieb Qing Zhao:
> > > > 
> > > > > On Oct 23, 2023, at 11:57 AM, Richard Biener 
> > > > >  wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > > Am 23.10.2023 um 16:56 schrieb Qing Zhao :
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > On Oct 23, 2023, at 3:57 AM, Richard Biener 
> > > > > > >  wrote:
> > > > > > > 
> > > > > > > > On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao 
> > > > > > > >  wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar 
> > > > > > > > >  wrote:
> > > > > > > > > 
> > > > > > > > > On 2023-10-20 14:38, Qing Zhao wrote:
> > > > > > > > > > How about the following:
> > > > > > > > > > Add one more parameter to __builtin_dynamic_object_size(), 
> > > > > > > > > > i.e
> > > > > > > > > > __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
> > > > > > > > > > When we see the structure field has counted_by attribute.
> > > > > > > > > 
> > > > > > > > > Or maybe add a barrier preventing any assignments to 
> > > > > > > > > array_annotated->foo from being reordered below the __bdos 
> > > > > > > > > call? Basically an __asm__ with array_annotated->foo in the 
> > > > > > > > > clobber list ought to do it I think.
> > > > > > > > 
> > > > > > > > Maybe just adding the array_annotated->foo to the use list of 
> > > > > > > > the call to __builtin_dynamic_object_size should be enough?
> > > > > > > > 
> > > > > > > > But I am not sure how to implement this in the TREE level, is 
> > > > > > > > there a USE_LIST/CLOBBER_LIST for each call?  Then I can just 
> > > > > > > > simply add the counted_by field “array_annotated->foo” to the 
> > > > > > > > USE_LIST of the call to __bdos?
> > > > > > > > 
> > > > > > > > This might be the simplest solution?
> > > > > > > 
> > > > > > > If the dynamic object size is derived of a field then I think you 
> > > > > > > need to
> > > > > > > put the "load" of that memory location at the point (as argument)
> > > > > > > of the __bos call right at parsing time.  I know that's awkward 
> > > > > > > because
> > > > > > > you try to play tricks "discovering" that field only late, but 
> > > > > > > that's not
> > > > > > > going to work.
> > > > > > 
> > > > > > Is it better to do this at gimplification phase instead of FE? 
> > > > > > 
> > > > > > VLA decls are handled in gimplification phase, the size calculation 
> > > > > > and call to alloca are all generated during this phase. 
> > > > > > (gimplify_vla_decl).
> > > > > > 
> > > > > > For __bdos calls, we can add an additional argument if the object’s 
> > > > > > first argument’s type include the counted_by attribute, i.e
> > > > > > 
> > > > > > ***During gimplification, 
> > > > > > For a call to __builtin_dynamic_object_size (ptr, type)
> > > > > > Check whether the type of ptr includes counted_by attribute, if so, 
> > > > > > change the call to
> > > > > > __builtin_dynamic_object_size (ptr, type, counted_by field)
> > > > > > 
> > > > > > Then the correct data dependence should be represented well in the 
> > > > > > IR.
> > > > > > 
> > > > > > **During object size phase,
> > > > > > 
> > > > > > The call to __builtin_dynamic_object_size will become an expression 
> > > > > > includes the counted_by field or -1/0 when we cannot decide the 
> > > > > > size, the correct data dependence will be kept even the call to 
> > > > > > __builtin_dynamic_object_size is gone. 
> > > > > 
> > > > > But the whole point of the BOS pass is to derive information that is 
> > > > > not available at parsing time, and that’s the cases you are after.  
> > > > > The case where the connection to the field with the length is 
> > > > > apparent during parsing is easy - you simply insert a load of the 
> > > > > value before the BOS call.
> > > > 
> > > > Yes, this is true. 
> > > > I prefer to implement this in gimplification phase since I am more 
> > > > familiar with the code there.. (I think that implementing it in 
> > > > gimplification should be very similar as implementing it in FE? Or do I 
> > > > miss anything here?)
> > > > 
> > > > Joseph, if implement this in FE, where in the FE I should look at? 
> > > > 
> > > 
> > > We should aim for a good integration with the BDOS pass, so
> > > that it can propagate the information further, e.g. the 
> > > following should work:
> > > 
> > > struct { int L; char buf[] __counted_by(L) } x;
> > > x.L = N;
> > > x.buf = ...;
> > > char *p = >f;
> > > __bdos(p) -> N
> > > 
> > > So we need to be smart on how we provide the size
> > > information for x->f to the backend. 
> > 
> > To follow up on this. I do not think we should change the
> > builtin in the FE or gimplification. Instead, we want 
> > to change the field access and compute the 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 2:31 PM, Martin Uecker  wrote:
> 
> Am Montag, dem 23.10.2023 um 20:06 +0200 schrieb Martin Uecker:
>> Am Montag, dem 23.10.2023 um 16:37 + schrieb Qing Zhao:
>>> 
 On Oct 23, 2023, at 11:57 AM, Richard Biener  
 wrote:
 
 
 
> Am 23.10.2023 um 16:56 schrieb Qing Zhao :
> 
> 
> 
>> On Oct 23, 2023, at 3:57 AM, Richard Biener  
>> wrote:
>> 
>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>>> 
>>> 
>>> 
 On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
 wrote:
 
 On 2023-10-20 14:38, Qing Zhao wrote:
> How about the following:
> Add one more parameter to __builtin_dynamic_object_size(), i.e
> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
> When we see the structure field has counted_by attribute.
 
 Or maybe add a barrier preventing any assignments to 
 array_annotated->foo from being reordered below the __bdos call? 
 Basically an __asm__ with array_annotated->foo in the clobber list 
 ought to do it I think.
>>> 
>>> Maybe just adding the array_annotated->foo to the use list of the call 
>>> to __builtin_dynamic_object_size should be enough?
>>> 
>>> But I am not sure how to implement this in the TREE level, is there a 
>>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>>> __bdos?
>>> 
>>> This might be the simplest solution?
>> 
>> If the dynamic object size is derived of a field then I think you need to
>> put the "load" of that memory location at the point (as argument)
>> of the __bos call right at parsing time.  I know that's awkward because
>> you try to play tricks "discovering" that field only late, but that's not
>> going to work.
> 
> Is it better to do this at gimplification phase instead of FE? 
> 
> VLA decls are handled in gimplification phase, the size calculation and 
> call to alloca are all generated during this phase. (gimplify_vla_decl).
> 
> For __bdos calls, we can add an additional argument if the object’s first 
> argument’s type include the counted_by attribute, i.e
> 
> ***During gimplification, 
> For a call to __builtin_dynamic_object_size (ptr, type)
> Check whether the type of ptr includes counted_by attribute, if so, 
> change the call to
> __builtin_dynamic_object_size (ptr, type, counted_by field)
> 
> Then the correct data dependence should be represented well in the IR.
> 
> **During object size phase,
> 
> The call to __builtin_dynamic_object_size will become an expression 
> includes the counted_by field or -1/0 when we cannot decide the size, the 
> correct data dependence will be kept even the call to 
> __builtin_dynamic_object_size is gone. 
 
 But the whole point of the BOS pass is to derive information that is not 
 available at parsing time, and that’s the cases you are after.  The case 
 where the connection to the field with the length is apparent during 
 parsing is easy - you simply insert a load of the value before the BOS 
 call.
>>> 
>>> Yes, this is true. 
>>> I prefer to implement this in gimplification phase since I am more familiar 
>>> with the code there.. (I think that implementing it in gimplification 
>>> should be very similar as implementing it in FE? Or do I miss anything 
>>> here?)
>>> 
>>> Joseph, if implement this in FE, where in the FE I should look at? 
>>> 
>> 
>> We should aim for a good integration with the BDOS pass, so
>> that it can propagate the information further, e.g. the 
>> following should work:
>> 
>> struct { int L; char buf[] __counted_by(L) } x;
>> x.L = N;
>> x.buf = ...;
>> char *p = >f;
>> __bdos(p) -> N
>> 
>> So we need to be smart on how we provide the size
>> information for x->f to the backend. 
> 
> To follow up on this. I do not think we should change the
> builtin in the FE or gimplification. Instead, we want 
> to change the field access and compute the size there. 
Could you please clarify on this? What do you mean by "change the field access 
and compute the size there”?
> 
> In my toy patch I then made this have a VLA type that 
> encodes the size.  Here, this would need to be done 
> differently.
> 
> But still, what we are missing in both cases
> is a proper way to pass the information down to BDOS.

What’ s the issue with adding a new argument (x.L) to the BDOS call? What’s 
missing with this approach?

> 
> For VLAs this works because BDOS can see the size of
> the definition.  For calls to allocation functions
> it is read from an attribute. 

You mean for VLA, BDOS see the size of the definition from the attribute for 
the allocation function?
Yes, that’s the case for VLA. 

For VLA, the size computation and 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Martin Uecker
Am Montag, dem 23.10.2023 um 14:43 -0400 schrieb Siddhesh Poyarekar:
> On 2023-10-23 14:06, Martin Uecker wrote:
> > We should aim for a good integration with the BDOS pass, so
> > that it can propagate the information further, e.g. the
> > following should work:
> > 
> > struct { int L; char buf[] __counted_by(L) } x;
> > x.L = N;
> > x.buf = ...;
> > char *p = >f;
> > __bdos(p) -> N
> > 
> > So we need to be smart on how we provide the size
> > information for x->f to the backend.
> > 
> > This would also be desirable for the language extension.
> 
> This is essentially why there need to be frontend rules constraining 
> reordering and reachability semantics of x.L, thus restricting DSE and 
> reordering for it. 

Yes, this too.

>  This is not really a __bdos/__bos question, because 
> that bit is trivial; if the structure is visible, the value is simply 
> x.L.  This is also why adding a reference to x.L in __bos/__bdos is not 
> sufficient or even possible in, e.g. the above case you note.

The value x.L may change in time. I would argue that it needs
to be the value of x.L at the time where x.buf (not x->f, sorry) 
is accessed.  So the FE needs to evaluate x.L when x.buf is
accessed and store the value somewhere where __bdos can find
it later.  In the type information would make sense.

But I am not sure how to do this in the best way so that this 
information is not removed later when not used explicitely
before __bdos tries to look at it.

Martin





Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Siddhesh Poyarekar

On 2023-10-23 14:06, Martin Uecker wrote:

We should aim for a good integration with the BDOS pass, so
that it can propagate the information further, e.g. the
following should work:

struct { int L; char buf[] __counted_by(L) } x;
x.L = N;
x.buf = ...;
char *p = >f;
__bdos(p) -> N

So we need to be smart on how we provide the size
information for x->f to the backend.

This would also be desirable for the language extension.


This is essentially why there need to be frontend rules constraining 
reordering and reachability semantics of x.L, thus restricting DSE and 
reordering for it.  This is not really a __bdos/__bos question, because 
that bit is trivial; if the structure is visible, the value is simply 
x.L.  This is also why adding a reference to x.L in __bos/__bdos is not 
sufficient or even possible in, e.g. the above case you note.


Thanks,
Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 2:06 PM, Martin Uecker  wrote:
> 
> Am Montag, dem 23.10.2023 um 16:37 + schrieb Qing Zhao:
>> 
>>> On Oct 23, 2023, at 11:57 AM, Richard Biener  
>>> wrote:
>>> 
>>> 
>>> 
 Am 23.10.2023 um 16:56 schrieb Qing Zhao :
 
 
 
> On Oct 23, 2023, at 3:57 AM, Richard Biener  
> wrote:
> 
>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
>>> wrote:
>>> 
>>> On 2023-10-20 14:38, Qing Zhao wrote:
 How about the following:
 Add one more parameter to __builtin_dynamic_object_size(), i.e
 __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
 When we see the structure field has counted_by attribute.
>>> 
>>> Or maybe add a barrier preventing any assignments to 
>>> array_annotated->foo from being reordered below the __bdos call? 
>>> Basically an __asm__ with array_annotated->foo in the clobber list 
>>> ought to do it I think.
>> 
>> Maybe just adding the array_annotated->foo to the use list of the call 
>> to __builtin_dynamic_object_size should be enough?
>> 
>> But I am not sure how to implement this in the TREE level, is there a 
>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>> __bdos?
>> 
>> This might be the simplest solution?
> 
> If the dynamic object size is derived of a field then I think you need to
> put the "load" of that memory location at the point (as argument)
> of the __bos call right at parsing time.  I know that's awkward because
> you try to play tricks "discovering" that field only late, but that's not
> going to work.
 
 Is it better to do this at gimplification phase instead of FE? 
 
 VLA decls are handled in gimplification phase, the size calculation and 
 call to alloca are all generated during this phase. (gimplify_vla_decl).
 
 For __bdos calls, we can add an additional argument if the object’s first 
 argument’s type include the counted_by attribute, i.e
 
 ***During gimplification, 
 For a call to __builtin_dynamic_object_size (ptr, type)
 Check whether the type of ptr includes counted_by attribute, if so, change 
 the call to
 __builtin_dynamic_object_size (ptr, type, counted_by field)
 
 Then the correct data dependence should be represented well in the IR.
 
 **During object size phase,
 
 The call to __builtin_dynamic_object_size will become an expression 
 includes the counted_by field or -1/0 when we cannot decide the size, the 
 correct data dependence will be kept even the call to 
 __builtin_dynamic_object_size is gone. 
>>> 
>>> But the whole point of the BOS pass is to derive information that is not 
>>> available at parsing time, and that’s the cases you are after.  The case 
>>> where the connection to the field with the length is apparent during 
>>> parsing is easy - you simply insert a load of the value before the BOS call.
>> 
>> Yes, this is true. 
>> I prefer to implement this in gimplification phase since I am more familiar 
>> with the code there.. (I think that implementing it in gimplification should 
>> be very similar as implementing it in FE? Or do I miss anything here?)
>> 
>> Joseph, if implement this in FE, where in the FE I should look at? 
>> 
> 
> We should aim for a good integration with the BDOS pass, so
> that it can propagate the information further, e.g. the 
> following should work:
> 
> struct { int L; char buf[] __counted_by(L) } x;
> x.L = N;
> x.buf = ...;
> char *p = >f;
Is the above line should be: 
char *p = 
?
> __bdos(p) -> N
> 
> So we need to be smart on how we provide the size
> information for x->f to the backend. 

Do you have any other suggestion here?

(Right now, what we’d like to do is to add one more argument for the function 
__bdos as
 __bdos (p, type, x.L))
> 
> This would also be desirable for the language extension. 

Yes.

Qing
> 
> Martin
> 
> 
>> Thanks a lot for the help.
>> 
>> Qing
>> 
>>> For the late case there’s no way to invent data flow dependence without 
>>> inadvertently pessimizing optimization.
>>> 
>>> Richard 
>>> 
 
> 
> A related issue is that assignment to the field and storage allocation
> are not tied together
 
 Yes, this is different from VLA, in which, the size assignment and the 
 storage allocation are generated and tied together by the compiler.
 
 For the flexible array member, the storage allocation and the size 
 assignment are all done by the user. So, We need to clarify such 
 requirement  in the document to guide user to write correct code.  And 
 also, we might need to provide tools (warnings and sanitizer option) to 
 help users to catch such coding error.
 
> - if 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Martin Uecker
Am Montag, dem 23.10.2023 um 20:06 +0200 schrieb Martin Uecker:
> Am Montag, dem 23.10.2023 um 16:37 + schrieb Qing Zhao:
> > 
> > > On Oct 23, 2023, at 11:57 AM, Richard Biener  
> > > wrote:
> > > 
> > > 
> > > 
> > > > Am 23.10.2023 um 16:56 schrieb Qing Zhao :
> > > > 
> > > > 
> > > > 
> > > > > On Oct 23, 2023, at 3:57 AM, Richard Biener 
> > > > >  wrote:
> > > > > 
> > > > > > On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  
> > > > > > wrote:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar 
> > > > > > >  wrote:
> > > > > > > 
> > > > > > > On 2023-10-20 14:38, Qing Zhao wrote:
> > > > > > > > How about the following:
> > > > > > > > Add one more parameter to __builtin_dynamic_object_size(), i.e
> > > > > > > > __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
> > > > > > > > When we see the structure field has counted_by attribute.
> > > > > > > 
> > > > > > > Or maybe add a barrier preventing any assignments to 
> > > > > > > array_annotated->foo from being reordered below the __bdos call? 
> > > > > > > Basically an __asm__ with array_annotated->foo in the clobber 
> > > > > > > list ought to do it I think.
> > > > > > 
> > > > > > Maybe just adding the array_annotated->foo to the use list of the 
> > > > > > call to __builtin_dynamic_object_size should be enough?
> > > > > > 
> > > > > > But I am not sure how to implement this in the TREE level, is there 
> > > > > > a USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add 
> > > > > > the counted_by field “array_annotated->foo” to the USE_LIST of the 
> > > > > > call to __bdos?
> > > > > > 
> > > > > > This might be the simplest solution?
> > > > > 
> > > > > If the dynamic object size is derived of a field then I think you 
> > > > > need to
> > > > > put the "load" of that memory location at the point (as argument)
> > > > > of the __bos call right at parsing time.  I know that's awkward 
> > > > > because
> > > > > you try to play tricks "discovering" that field only late, but that's 
> > > > > not
> > > > > going to work.
> > > > 
> > > > Is it better to do this at gimplification phase instead of FE? 
> > > > 
> > > > VLA decls are handled in gimplification phase, the size calculation and 
> > > > call to alloca are all generated during this phase. (gimplify_vla_decl).
> > > > 
> > > > For __bdos calls, we can add an additional argument if the object’s 
> > > > first argument’s type include the counted_by attribute, i.e
> > > > 
> > > > ***During gimplification, 
> > > > For a call to __builtin_dynamic_object_size (ptr, type)
> > > > Check whether the type of ptr includes counted_by attribute, if so, 
> > > > change the call to
> > > > __builtin_dynamic_object_size (ptr, type, counted_by field)
> > > > 
> > > > Then the correct data dependence should be represented well in the IR.
> > > > 
> > > > **During object size phase,
> > > > 
> > > > The call to __builtin_dynamic_object_size will become an expression 
> > > > includes the counted_by field or -1/0 when we cannot decide the size, 
> > > > the correct data dependence will be kept even the call to 
> > > > __builtin_dynamic_object_size is gone. 
> > > 
> > > But the whole point of the BOS pass is to derive information that is not 
> > > available at parsing time, and that’s the cases you are after.  The case 
> > > where the connection to the field with the length is apparent during 
> > > parsing is easy - you simply insert a load of the value before the BOS 
> > > call.
> > 
> > Yes, this is true. 
> > I prefer to implement this in gimplification phase since I am more familiar 
> > with the code there.. (I think that implementing it in gimplification 
> > should be very similar as implementing it in FE? Or do I miss anything 
> > here?)
> > 
> > Joseph, if implement this in FE, where in the FE I should look at? 
> > 
> 
> We should aim for a good integration with the BDOS pass, so
> that it can propagate the information further, e.g. the 
> following should work:
> 
> struct { int L; char buf[] __counted_by(L) } x;
> x.L = N;
> x.buf = ...;
> char *p = >f;
> __bdos(p) -> N
> 
> So we need to be smart on how we provide the size
> information for x->f to the backend. 

To follow up on this. I do not think we should change the
builtin in the FE or gimplification. Instead, we want 
to change the field access and compute the size there. 

In my toy patch I then made this have a VLA type that 
encodes the size.  Here, this would need to be done 
differently.

But still, what we are missing in both cases
is a proper way to pass the information down to BDOS.

For VLAs this works because BDOS can see the size of
the definition.  For calls to allocation functions
it is read from an attribute. 

But I am not sure what would be the best way to encode
this information so that BDOS can later access it.

Martin




> 
> This would also be desirable for the language extension. 
> 
> Martin
> 
> 
> > Thanks a lot for the 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Joseph Myers
On Mon, 23 Oct 2023, Qing Zhao wrote:

> I prefer to implement this in gimplification phase since I am more 
> familiar with the code there.. (I think that implementing it in 
> gimplification should be very similar as implementing it in FE? Or do I 
> miss anything here?)
> 
> Joseph, if implement this in FE, where in the FE I should look at? 

I tend to think that gimplification time is appropriate for adding this 
dependency, but if you wish to rewrite a built-in function call in the 
front end before then, it could be done in build_function_call_vec.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Martin Uecker
Am Montag, dem 23.10.2023 um 16:37 + schrieb Qing Zhao:
> 
> > On Oct 23, 2023, at 11:57 AM, Richard Biener  
> > wrote:
> > 
> > 
> > 
> > > Am 23.10.2023 um 16:56 schrieb Qing Zhao :
> > > 
> > > 
> > > 
> > > > On Oct 23, 2023, at 3:57 AM, Richard Biener 
> > > >  wrote:
> > > > 
> > > > > On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  
> > > > > wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > > On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar 
> > > > > >  wrote:
> > > > > > 
> > > > > > On 2023-10-20 14:38, Qing Zhao wrote:
> > > > > > > How about the following:
> > > > > > > Add one more parameter to __builtin_dynamic_object_size(), i.e
> > > > > > > __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
> > > > > > > When we see the structure field has counted_by attribute.
> > > > > > 
> > > > > > Or maybe add a barrier preventing any assignments to 
> > > > > > array_annotated->foo from being reordered below the __bdos call? 
> > > > > > Basically an __asm__ with array_annotated->foo in the clobber list 
> > > > > > ought to do it I think.
> > > > > 
> > > > > Maybe just adding the array_annotated->foo to the use list of the 
> > > > > call to __builtin_dynamic_object_size should be enough?
> > > > > 
> > > > > But I am not sure how to implement this in the TREE level, is there a 
> > > > > USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
> > > > > counted_by field “array_annotated->foo” to the USE_LIST of the call 
> > > > > to __bdos?
> > > > > 
> > > > > This might be the simplest solution?
> > > > 
> > > > If the dynamic object size is derived of a field then I think you need 
> > > > to
> > > > put the "load" of that memory location at the point (as argument)
> > > > of the __bos call right at parsing time.  I know that's awkward because
> > > > you try to play tricks "discovering" that field only late, but that's 
> > > > not
> > > > going to work.
> > > 
> > > Is it better to do this at gimplification phase instead of FE? 
> > > 
> > > VLA decls are handled in gimplification phase, the size calculation and 
> > > call to alloca are all generated during this phase. (gimplify_vla_decl).
> > > 
> > > For __bdos calls, we can add an additional argument if the object’s first 
> > > argument’s type include the counted_by attribute, i.e
> > > 
> > > ***During gimplification, 
> > > For a call to __builtin_dynamic_object_size (ptr, type)
> > > Check whether the type of ptr includes counted_by attribute, if so, 
> > > change the call to
> > > __builtin_dynamic_object_size (ptr, type, counted_by field)
> > > 
> > > Then the correct data dependence should be represented well in the IR.
> > > 
> > > **During object size phase,
> > > 
> > > The call to __builtin_dynamic_object_size will become an expression 
> > > includes the counted_by field or -1/0 when we cannot decide the size, the 
> > > correct data dependence will be kept even the call to 
> > > __builtin_dynamic_object_size is gone. 
> > 
> > But the whole point of the BOS pass is to derive information that is not 
> > available at parsing time, and that’s the cases you are after.  The case 
> > where the connection to the field with the length is apparent during 
> > parsing is easy - you simply insert a load of the value before the BOS call.
> 
> Yes, this is true. 
> I prefer to implement this in gimplification phase since I am more familiar 
> with the code there.. (I think that implementing it in gimplification should 
> be very similar as implementing it in FE? Or do I miss anything here?)
> 
> Joseph, if implement this in FE, where in the FE I should look at? 
> 

We should aim for a good integration with the BDOS pass, so
that it can propagate the information further, e.g. the 
following should work:

struct { int L; char buf[] __counted_by(L) } x;
x.L = N;
x.buf = ...;
char *p = >f;
__bdos(p) -> N

So we need to be smart on how we provide the size
information for x->f to the backend. 

This would also be desirable for the language extension. 

Martin


> Thanks a lot for the help.
> 
> Qing
> 
> >  For the late case there’s no way to invent data flow dependence without 
> > inadvertently pessimizing optimization.
> > 
> > Richard 
> > 
> > > 
> > > > 
> > > > A related issue is that assignment to the field and storage allocation
> > > > are not tied together
> > > 
> > > Yes, this is different from VLA, in which, the size assignment and the 
> > > storage allocation are generated and tied together by the compiler.
> > > 
> > > For the flexible array member, the storage allocation and the size 
> > > assignment are all done by the user. So, We need to clarify such 
> > > requirement  in the document to guide user to write correct code.  And 
> > > also, we might need to provide tools (warnings and sanitizer option) to 
> > > help users to catch such coding error.
> > > 
> > > > - if there's no use of the size data we might
> > > > remove the store of it as dead.
> > > 
> > > Yes, when __bdos cannot decide 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 11:57 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 23.10.2023 um 16:56 schrieb Qing Zhao :
>> 
>> 
>> 
>>> On Oct 23, 2023, at 3:57 AM, Richard Biener  
>>> wrote:
>>> 
 On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
 
 
 
> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
> wrote:
> 
> On 2023-10-20 14:38, Qing Zhao wrote:
>> How about the following:
>> Add one more parameter to __builtin_dynamic_object_size(), i.e
>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>> When we see the structure field has counted_by attribute.
> 
> Or maybe add a barrier preventing any assignments to array_annotated->foo 
> from being reordered below the __bdos call? Basically an __asm__ with 
> array_annotated->foo in the clobber list ought to do it I think.
 
 Maybe just adding the array_annotated->foo to the use list of the call to 
 __builtin_dynamic_object_size should be enough?
 
 But I am not sure how to implement this in the TREE level, is there a 
 USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
 counted_by field “array_annotated->foo” to the USE_LIST of the call to 
 __bdos?
 
 This might be the simplest solution?
>>> 
>>> If the dynamic object size is derived of a field then I think you need to
>>> put the "load" of that memory location at the point (as argument)
>>> of the __bos call right at parsing time.  I know that's awkward because
>>> you try to play tricks "discovering" that field only late, but that's not
>>> going to work.
>> 
>> Is it better to do this at gimplification phase instead of FE? 
>> 
>> VLA decls are handled in gimplification phase, the size calculation and call 
>> to alloca are all generated during this phase. (gimplify_vla_decl).
>> 
>> For __bdos calls, we can add an additional argument if the object’s first 
>> argument’s type include the counted_by attribute, i.e
>> 
>> ***During gimplification, 
>> For a call to __builtin_dynamic_object_size (ptr, type)
>> Check whether the type of ptr includes counted_by attribute, if so, change 
>> the call to
>> __builtin_dynamic_object_size (ptr, type, counted_by field)
>> 
>> Then the correct data dependence should be represented well in the IR.
>> 
>> **During object size phase,
>> 
>> The call to __builtin_dynamic_object_size will become an expression includes 
>> the counted_by field or -1/0 when we cannot decide the size, the correct 
>> data dependence will be kept even the call to __builtin_dynamic_object_size 
>> is gone. 
> 
> But the whole point of the BOS pass is to derive information that is not 
> available at parsing time, and that’s the cases you are after.  The case 
> where the connection to the field with the length is apparent during parsing 
> is easy - you simply insert a load of the value before the BOS call.

Yes, this is true. 
I prefer to implement this in gimplification phase since I am more familiar 
with the code there.. (I think that implementing it in gimplification should be 
very similar as implementing it in FE? Or do I miss anything here?)

Joseph, if implement this in FE, where in the FE I should look at? 

Thanks a lot for the help.

Qing

>  For the late case there’s no way to invent data flow dependence without 
> inadvertently pessimizing optimization.
> 
> Richard 
> 
>> 
>>> 
>>> A related issue is that assignment to the field and storage allocation
>>> are not tied together
>> 
>> Yes, this is different from VLA, in which, the size assignment and the 
>> storage allocation are generated and tied together by the compiler.
>> 
>> For the flexible array member, the storage allocation and the size 
>> assignment are all done by the user. So, We need to clarify such requirement 
>>  in the document to guide user to write correct code.  And also, we might 
>> need to provide tools (warnings and sanitizer option) to help users to catch 
>> such coding error.
>> 
>>> - if there's no use of the size data we might
>>> remove the store of it as dead.
>> 
>> Yes, when __bdos cannot decide the size, we need to remove the dead store to 
>> the field.
>> I guess that the compiler should be able to do this automatically?
>> 
>> thanks.
>> 
>> Qing
>>> 
>>> Of course I guess __bos then behaves like sizeof ().
>>> 
>>> Richard.
>>> 
 
 Qing
 
> 
> It may not work for something like this though:
> 
> static size_t
> get_size_of (void *ptr)
> {
> return __bdos (ptr, 1);
> }
> 
> void
> foo (size_t sz)
> {
> array_annotated = __builtin_malloc (sz);
> array_annotated = sz;
> 
> ...
> __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
> ...
> }
> 
> because the call to get_size_of () may not have been inlined that early.
> 
> The more fool-proof alternative may be to put a compile time barrier 
> right below the assignment to 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Richard Biener



> Am 23.10.2023 um 16:56 schrieb Qing Zhao :
> 
> 
> 
>> On Oct 23, 2023, at 3:57 AM, Richard Biener  
>> wrote:
>> 
>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>>> 
>>> 
>>> 
 On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
 wrote:
 
 On 2023-10-20 14:38, Qing Zhao wrote:
> How about the following:
> Add one more parameter to __builtin_dynamic_object_size(), i.e
> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
> When we see the structure field has counted_by attribute.
 
 Or maybe add a barrier preventing any assignments to array_annotated->foo 
 from being reordered below the __bdos call? Basically an __asm__ with 
 array_annotated->foo in the clobber list ought to do it I think.
>>> 
>>> Maybe just adding the array_annotated->foo to the use list of the call to 
>>> __builtin_dynamic_object_size should be enough?
>>> 
>>> But I am not sure how to implement this in the TREE level, is there a 
>>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>>> __bdos?
>>> 
>>> This might be the simplest solution?
>> 
>> If the dynamic object size is derived of a field then I think you need to
>> put the "load" of that memory location at the point (as argument)
>> of the __bos call right at parsing time.  I know that's awkward because
>> you try to play tricks "discovering" that field only late, but that's not
>> going to work.
> 
> Is it better to do this at gimplification phase instead of FE? 
> 
> VLA decls are handled in gimplification phase, the size calculation and call 
> to alloca are all generated during this phase. (gimplify_vla_decl).
> 
> For __bdos calls, we can add an additional argument if the object’s first 
> argument’s type include the counted_by attribute, i.e
> 
> ***During gimplification, 
> For a call to __builtin_dynamic_object_size (ptr, type)
> Check whether the type of ptr includes counted_by attribute, if so, change 
> the call to
> __builtin_dynamic_object_size (ptr, type, counted_by field)
> 
> Then the correct data dependence should be represented well in the IR.
> 
> **During object size phase,
> 
> The call to __builtin_dynamic_object_size will become an expression includes 
> the counted_by field or -1/0 when we cannot decide the size, the correct data 
> dependence will be kept even the call to __builtin_dynamic_object_size is 
> gone. 

But the whole point of the BOS pass is to derive information that is not 
available at parsing time, and that’s the cases you are after.  The case where 
the connection to the field with the length is apparent during parsing is easy 
- you simply insert a load of the value before the BOS call.  For the late case 
there’s no way to invent data flow dependence without inadvertently pessimizing 
optimization.

Richard 

> 
>> 
>> A related issue is that assignment to the field and storage allocation
>> are not tied together
> 
> Yes, this is different from VLA, in which, the size assignment and the 
> storage allocation are generated and tied together by the compiler.
> 
> For the flexible array member, the storage allocation and the size assignment 
> are all done by the user. So, We need to clarify such requirement  in the 
> document to guide user to write correct code.  And also, we might need to 
> provide tools (warnings and sanitizer option) to help users to catch such 
> coding error.
> 
>> - if there's no use of the size data we might
>> remove the store of it as dead.
> 
> Yes, when __bdos cannot decide the size, we need to remove the dead store to 
> the field.
> I guess that the compiler should be able to do this automatically?
> 
> thanks.
> 
> Qing
>> 
>> Of course I guess __bos then behaves like sizeof ().
>> 
>> Richard.
>> 
>>> 
>>> Qing
>>> 
 
 It may not work for something like this though:
 
 static size_t
 get_size_of (void *ptr)
 {
 return __bdos (ptr, 1);
 }
 
 void
 foo (size_t sz)
 {
 array_annotated = __builtin_malloc (sz);
 array_annotated = sz;
 
 ...
 __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
 ...
 }
 
 because the call to get_size_of () may not have been inlined that early.
 
 The more fool-proof alternative may be to put a compile time barrier right 
 below the assignment to array_annotated->foo; I reckon you could do that 
 early in the front end by marking the size identifier and then tracking 
 assignments to that identifier.  That may have a slight runtime 
 performance overhead since it may prevent even legitimate reordering.  I 
 can't think of another alternative at the moment...
 
 Sid
> 


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 8:34 AM, Richard Biener  
> wrote:
> 
> On Mon, Oct 23, 2023 at 1:27 PM Siddhesh Poyarekar  
> wrote:
>> 
>> On 2023-10-23 03:57, Richard Biener wrote:
>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
 
 
 
> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
> wrote:
> 
> On 2023-10-20 14:38, Qing Zhao wrote:
>> How about the following:
>>   Add one more parameter to __builtin_dynamic_object_size(), i.e
>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>> When we see the structure field has counted_by attribute.
> 
> Or maybe add a barrier preventing any assignments to array_annotated->foo 
> from being reordered below the __bdos call? Basically an __asm__ with 
> array_annotated->foo in the clobber list ought to do it I think.
 
 Maybe just adding the array_annotated->foo to the use list of the call to 
 __builtin_dynamic_object_size should be enough?
 
 But I am not sure how to implement this in the TREE level, is there a 
 USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
 counted_by field “array_annotated->foo” to the USE_LIST of the call to 
 __bdos?
 
 This might be the simplest solution?
>>> 
>>> If the dynamic object size is derived of a field then I think you need to
>>> put the "load" of that memory location at the point (as argument)
>>> of the __bos call right at parsing time.  I know that's awkward because
>>> you try to play tricks "discovering" that field only late, but that's not
>>> going to work.
>>> 
>>> A related issue is that assignment to the field and storage allocation
>>> are not tied together - if there's no use of the size data we might
>>> remove the store of it as dead.
>> 
>> Maybe the trick then is to treat the size data as volatile?  That ought
>> to discourage reordering and also prevent elimination of the "dead" store?
> 
> But we are an optimizing compiler, not a static analysis machine, so I
> fail to see how this is a useful suggestion.
> 
> I think Martins suggestion to approach this as a language extension
> is more useful and would make it easier to handle this?

I agree that making this as a language extension is a better and cleaner 
approach.

As we discussed before, the major issues with the language extension approach 
are:
1. Harder to be adopted by the existing source code due to the potential 
ABI/API change.
2. Much more effort and much longer time to be accepted.

In addition to the above issues, I guess the same issue exists even with a 
language extension, 
Since for FMA, it’s the user (not the compiler) to allocate the storage for the 
FMA. (Should we 
Also move this into compiler for the language extension? Then the existing 
source code need to
Be changed a lot to adopt the new language extension).

As a result, the size  and the storage allocation cannot be guaranteed to be 
tied together too.

Qing

> 
> Richard.
> 
>> Thanks,
>> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Qing Zhao


> On Oct 23, 2023, at 3:57 AM, Richard Biener  
> wrote:
> 
> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  wrote:
>>> 
>>> On 2023-10-20 14:38, Qing Zhao wrote:
 How about the following:
  Add one more parameter to __builtin_dynamic_object_size(), i.e
 __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
 When we see the structure field has counted_by attribute.
>>> 
>>> Or maybe add a barrier preventing any assignments to array_annotated->foo 
>>> from being reordered below the __bdos call? Basically an __asm__ with 
>>> array_annotated->foo in the clobber list ought to do it I think.
>> 
>> Maybe just adding the array_annotated->foo to the use list of the call to 
>> __builtin_dynamic_object_size should be enough?
>> 
>> But I am not sure how to implement this in the TREE level, is there a 
>> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
>> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
>> __bdos?
>> 
>> This might be the simplest solution?
> 
> If the dynamic object size is derived of a field then I think you need to
> put the "load" of that memory location at the point (as argument)
> of the __bos call right at parsing time.  I know that's awkward because
> you try to play tricks "discovering" that field only late, but that's not
> going to work.

Is it better to do this at gimplification phase instead of FE? 

VLA decls are handled in gimplification phase, the size calculation and call to 
alloca are all generated during this phase. (gimplify_vla_decl).

For __bdos calls, we can add an additional argument if the object’s first 
argument’s type include the counted_by attribute, i.e

***During gimplification, 
For a call to __builtin_dynamic_object_size (ptr, type)
Check whether the type of ptr includes counted_by attribute, if so, change the 
call to
__builtin_dynamic_object_size (ptr, type, counted_by field)

Then the correct data dependence should be represented well in the IR.

**During object size phase,

The call to __builtin_dynamic_object_size will become an expression includes 
the counted_by field or -1/0 when we cannot decide the size, the correct data 
dependence will be kept even the call to __builtin_dynamic_object_size is gone. 


> 
> A related issue is that assignment to the field and storage allocation
> are not tied together

Yes, this is different from VLA, in which, the size assignment and the storage 
allocation are generated and tied together by the compiler.

For the flexible array member, the storage allocation and the size assignment 
are all done by the user. So, We need to clarify such requirement  in the 
document to guide user to write correct code.  And also, we might need to 
provide tools (warnings and sanitizer option) to help users to catch such 
coding error.

> - if there's no use of the size data we might
> remove the store of it as dead.

Yes, when __bdos cannot decide the size, we need to remove the dead store to 
the field.
I guess that the compiler should be able to do this automatically?

thanks.

Qing
> 
> Of course I guess __bos then behaves like sizeof ().
> 
> Richard.
> 
>> 
>> Qing
>> 
>>> 
>>> It may not work for something like this though:
>>> 
>>> static size_t
>>> get_size_of (void *ptr)
>>> {
>>> return __bdos (ptr, 1);
>>> }
>>> 
>>> void
>>> foo (size_t sz)
>>> {
>>> array_annotated = __builtin_malloc (sz);
>>> array_annotated = sz;
>>> 
>>> ...
>>> __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
>>> ...
>>> }
>>> 
>>> because the call to get_size_of () may not have been inlined that early.
>>> 
>>> The more fool-proof alternative may be to put a compile time barrier right 
>>> below the assignment to array_annotated->foo; I reckon you could do that 
>>> early in the front end by marking the size identifier and then tracking 
>>> assignments to that identifier.  That may have a slight runtime performance 
>>> overhead since it may prevent even legitimate reordering.  I can't think of 
>>> another alternative at the moment...
>>> 
>>> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Siddhesh Poyarekar

On 2023-10-23 08:34, Richard Biener wrote:

A related issue is that assignment to the field and storage allocation
are not tied together - if there's no use of the size data we might
remove the store of it as dead.


Maybe the trick then is to treat the size data as volatile?  That ought
to discourage reordering and also prevent elimination of the "dead" store?


But we are an optimizing compiler, not a static analysis machine, so I
fail to see how this is a useful suggestion.


Sorry I didn't meant to suggest doing this in the middle-end.


I think Martins suggestion to approach this as a language extension
is more useful and would make it easier to handle this?


I think handling for this (e.g. treating any storage allocated for the 
size member in the struct as volatile to prevent reordering or 
elimination) would have to be implemented in the front-end, regardless 
of whether it is a language extension or as a gcc attribute.  How would 
making it a language extension vs a gcc attribute make it different?


Thanks,
Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Richard Biener
On Mon, Oct 23, 2023 at 1:27 PM Siddhesh Poyarekar  wrote:
>
> On 2023-10-23 03:57, Richard Biener wrote:
> > On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
> >>
> >>
> >>
> >>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  
> >>> wrote:
> >>>
> >>> On 2023-10-20 14:38, Qing Zhao wrote:
>  How about the following:
> Add one more parameter to __builtin_dynamic_object_size(), i.e
>  __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>  When we see the structure field has counted_by attribute.
> >>>
> >>> Or maybe add a barrier preventing any assignments to array_annotated->foo 
> >>> from being reordered below the __bdos call? Basically an __asm__ with 
> >>> array_annotated->foo in the clobber list ought to do it I think.
> >>
> >> Maybe just adding the array_annotated->foo to the use list of the call to 
> >> __builtin_dynamic_object_size should be enough?
> >>
> >> But I am not sure how to implement this in the TREE level, is there a 
> >> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
> >> counted_by field “array_annotated->foo” to the USE_LIST of the call to 
> >> __bdos?
> >>
> >> This might be the simplest solution?
> >
> > If the dynamic object size is derived of a field then I think you need to
> > put the "load" of that memory location at the point (as argument)
> > of the __bos call right at parsing time.  I know that's awkward because
> > you try to play tricks "discovering" that field only late, but that's not
> > going to work.
> >
> > A related issue is that assignment to the field and storage allocation
> > are not tied together - if there's no use of the size data we might
> > remove the store of it as dead.
>
> Maybe the trick then is to treat the size data as volatile?  That ought
> to discourage reordering and also prevent elimination of the "dead" store?

But we are an optimizing compiler, not a static analysis machine, so I
fail to see how this is a useful suggestion.

I think Martins suggestion to approach this as a language extension
is more useful and would make it easier to handle this?

Richard.

> Thanks,
> Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Siddhesh Poyarekar

On 2023-10-23 03:57, Richard Biener wrote:

On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:





On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  wrote:

On 2023-10-20 14:38, Qing Zhao wrote:

How about the following:
   Add one more parameter to __builtin_dynamic_object_size(), i.e
__builtin_dynamic_object_size (_1,1,array_annotated->foo)?
When we see the structure field has counted_by attribute.


Or maybe add a barrier preventing any assignments to array_annotated->foo from 
being reordered below the __bdos call? Basically an __asm__ with 
array_annotated->foo in the clobber list ought to do it I think.


Maybe just adding the array_annotated->foo to the use list of the call to 
__builtin_dynamic_object_size should be enough?

But I am not sure how to implement this in the TREE level, is there a 
USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the counted_by 
field “array_annotated->foo” to the USE_LIST of the call to __bdos?

This might be the simplest solution?


If the dynamic object size is derived of a field then I think you need to
put the "load" of that memory location at the point (as argument)
of the __bos call right at parsing time.  I know that's awkward because
you try to play tricks "discovering" that field only late, but that's not
going to work.

A related issue is that assignment to the field and storage allocation
are not tied together - if there's no use of the size data we might
remove the store of it as dead.


Maybe the trick then is to treat the size data as volatile?  That ought 
to discourage reordering and also prevent elimination of the "dead" store?


Thanks,
Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-23 Thread Richard Biener
On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao  wrote:
>
>
>
> > On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  wrote:
> >
> > On 2023-10-20 14:38, Qing Zhao wrote:
> >> How about the following:
> >>   Add one more parameter to __builtin_dynamic_object_size(), i.e
> >> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
> >> When we see the structure field has counted_by attribute.
> >
> > Or maybe add a barrier preventing any assignments to array_annotated->foo 
> > from being reordered below the __bdos call? Basically an __asm__ with 
> > array_annotated->foo in the clobber list ought to do it I think.
>
> Maybe just adding the array_annotated->foo to the use list of the call to 
> __builtin_dynamic_object_size should be enough?
>
> But I am not sure how to implement this in the TREE level, is there a 
> USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the 
> counted_by field “array_annotated->foo” to the USE_LIST of the call to __bdos?
>
> This might be the simplest solution?

If the dynamic object size is derived of a field then I think you need to
put the "load" of that memory location at the point (as argument)
of the __bos call right at parsing time.  I know that's awkward because
you try to play tricks "discovering" that field only late, but that's not
going to work.

A related issue is that assignment to the field and storage allocation
are not tied together - if there's no use of the size data we might
remove the store of it as dead.

Of course I guess __bos then behaves like sizeof ().

Richard.

>
> Qing
>
> >
> > It may not work for something like this though:
> >
> > static size_t
> > get_size_of (void *ptr)
> > {
> >  return __bdos (ptr, 1);
> > }
> >
> > void
> > foo (size_t sz)
> > {
> >  array_annotated = __builtin_malloc (sz);
> >  array_annotated = sz;
> >
> >  ...
> >  __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
> >  ...
> > }
> >
> > because the call to get_size_of () may not have been inlined that early.
> >
> > The more fool-proof alternative may be to put a compile time barrier right 
> > below the assignment to array_annotated->foo; I reckon you could do that 
> > early in the front end by marking the size identifier and then tracking 
> > assignments to that identifier.  That may have a slight runtime performance 
> > overhead since it may prevent even legitimate reordering.  I can't think of 
> > another alternative at the moment...
> >
> > Sid
>


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Qing Zhao


> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-10-20 14:38, Qing Zhao wrote:
>> How about the following:
>>   Add one more parameter to __builtin_dynamic_object_size(), i.e
>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>> When we see the structure field has counted_by attribute.
> 
> Or maybe add a barrier preventing any assignments to array_annotated->foo 
> from being reordered below the __bdos call? Basically an __asm__ with 
> array_annotated->foo in the clobber list ought to do it I think.

Maybe just adding the array_annotated->foo to the use list of the call to 
__builtin_dynamic_object_size should be enough?

But I am not sure how to implement this in the TREE level, is there a 
USE_LIST/CLOBBER_LIST for each call?  Then I can just simply add the counted_by 
field “array_annotated->foo” to the USE_LIST of the call to __bdos?

This might be the simplest solution?

Qing

> 
> It may not work for something like this though:
> 
> static size_t
> get_size_of (void *ptr)
> {
>  return __bdos (ptr, 1);
> }
> 
> void
> foo (size_t sz)
> {
>  array_annotated = __builtin_malloc (sz);
>  array_annotated = sz;
> 
>  ...
>  __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
>  ...
> }
> 
> because the call to get_size_of () may not have been inlined that early.
> 
> The more fool-proof alternative may be to put a compile time barrier right 
> below the assignment to array_annotated->foo; I reckon you could do that 
> early in the front end by marking the size identifier and then tracking 
> assignments to that identifier.  That may have a slight runtime performance 
> overhead since it may prevent even legitimate reordering.  I can't think of 
> another alternative at the moment...
> 
> Sid



Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Siddhesh Poyarekar

On 2023-10-20 14:38, Qing Zhao wrote:

How about the following:

   Add one more parameter to __builtin_dynamic_object_size(), i.e

__builtin_dynamic_object_size (_1,1,array_annotated->foo)?

When we see the structure field has counted_by attribute.


Or maybe add a barrier preventing any assignments to 
array_annotated->foo from being reordered below the __bdos call? 
Basically an __asm__ with array_annotated->foo in the clobber list ought 
to do it I think.


It may not work for something like this though:

static size_t
get_size_of (void *ptr)
{
  return __bdos (ptr, 1);
}

void
foo (size_t sz)
{
  array_annotated = __builtin_malloc (sz);
  array_annotated = sz;

  ...
  __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
  ...
}

because the call to get_size_of () may not have been inlined that early.

The more fool-proof alternative may be to put a compile time barrier 
right below the assignment to array_annotated->foo; I reckon you could 
do that early in the front end by marking the size identifier and then 
tracking assignments to that identifier.  That may have a slight runtime 
performance overhead since it may prevent even legitimate reordering.  I 
can't think of another alternative at the moment...


Sid


Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Qing Zhao


> On Oct 20, 2023, at 2:22 PM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 20.10.2023 um 19:09 schrieb Qing Zhao :
>> 
>> Sid,
>> 
>> (Richard, can you please help me to make sure this? Thanks a lot)
>> 
>> I studied a little bit more on the following question you raised during the 
>> review process:
>> 
>> For the following small testing case: 
>> 
>> 1 struct annotated {
>> 2   int foo;
>> 3   char array[] __attribute__((counted_by (foo)));
>> 4 };
>> 5 
>> 6 extern struct annotated * alloc_buf (int);
>> 7 
>> 8 int test (int sz)
>> 9 {
>> 10   struct annotated * array_annotated = alloc_buf (sz);
>> 11   array_annotated->foo = sz;
>> 12   return __builtin_dynamic_object_size (array_annotated->array, 1);
>> 13 }
>> 
>> Whether the assignment of the size to the counted_by field at line 11 and 
>> the consumer of the size at line 12 at call to __bdos might be reordered by 
>> GCC? 
>> 
>> The following is my thought:
>> 
>> 1. _bdos computation passes (both pass_early_object_sizes and 
>> pass_object_sizes) are in the early stage of SSA optimizations. In which, 
>> pass_early_object_sizes happens before almost all the optimizations, no 
>> reordering is possible in this pass;
>> 
>> 2. Then how about the pass “pass_object_sizes”?
>> 
>>  Immediately after the pass_build_ssa,  the IR for the routine “test” is  
>> with the SSA form: (compiled with -O3):
>> 
>> 1 int test (int sz)
>> 2 {
>> 3   struct annotated * array_annotated;
>> 4   char[0:] * _1;
>> 5   long unsigned int _2;
>> 6   int _8;
>> 7   
>> 8:
>> 9   array_annotated_6 = alloc_buf (sz_4(D));
>> 10   array_annotated_6->foo = sz_4(D);
>> 11   _1 = _annotated_6->array;
>> 12   _2 = __builtin_dynamic_object_size (_1, 1);
>> 13   _8 = (int) _2;
>> 14   return _8; 
>> 15 } 
>> 
>> In the above IR, the key portion is line 10 and line 11: (whether these two 
>> lines might be reordered with SSA optimization?)
>> 
>> 10   array_annotated_6->foo = sz_4(D);
>> 11   _1 = _annotated_6->array;
>> 
>> The major question here is: whether the SSA optimizations are able to 
>> distinguish the object “array_annotated_6->foo” at line 10 is independent 
>> with
>> the object “array_annotated-_6->array” at line 11?
>> 
>> If the SSA optimizations can distinguish “array_annotated_6->foo” from 
>> “array_annotated_6->array”, then these two lines might be reordered.
>> Otherwise, these two lines will not be reordered by SSA optimizations.
>> 
>> I am not very familiar with the details of the SSA optimizations, but my 
>> guess is, two fields of the same structure might not be distinguished by the 
>> SSA optimizations, then line 10 and line 11 will not be reordered by SSA 
>> optimizations.
>> 
>> Richard, is my guess correct?
> 
> There is no data dependence between the memory access and the address 
> computation so nothing prevents the reordering.  

Okay, I see.  then:

10   array_annotated_6->foo = sz_4(D);
11   _1 = _annotated_6->array;

Line 10 and line 11 could be reordered.

And then
10   array_annotated_6->foo = sz_4(D);
12   _2 = __builtin_dynamic_object_size (_1, 1);

Line 10 and 12 could be reordered too.

Then what’s the best way to add such data dependence in the IR?

How about the following:

  Add one more parameter to __builtin_dynamic_object_size(), i.e 

__builtin_dynamic_object_size (_1,1,array_annotated->foo)? 

When we see the structure field has counted_by attribute. 

Then we can enforce such data dependence and avoid potential reordering.

What’s your opinion? Do you have other suggestion on the solution?

Qing



If you put another same bos call before the access I expect the addresses to be 
CSEd, effectively moving the later before the access.
> 
> Richard 
> 
>> Thanks a lot for your help.
>> 
>> Qing
>> 
> On Oct 5, 2023, at 4:08 PM, Siddhesh Poyarekar  
> wrote:
 
 I hope the review was helpful.  Overall, a couple of things to consider:
 
 1. How would you handle potential reordering between assignment of the 
 size to the counted_by field with the __bdos call that may consume it? 
 You'll probably need to express some kind of dependency there or in the 
 worst case, insert a barrier to disallow reordering.
>>> 
>>> Good point! 
>>> 
>>> So, your example in the respond to [V3][PATCH 2/3]Use the counted_by 
>>> atribute info in builtin object size [PR108896]:
>>> “
>>> Maybe another test where the allocation, size assignment and __bdos call 
>>> happen in the same function, where the allocator is not recognized by gcc:
>>> 
>>> void *
>>> __attribute__ ((noinline))
>>> alloc (size_t sz)
>>> {
>>> return __builtin_malloc (sz);
>>> }
>>> 
>>> void test (size_t sz)
>>> {
>>> array_annotated = alloc (sz);
>>> array_annotated->b = sz;
>>> return __builtin_dynamic_object_size (array_annotated->c, 1);
>>> }
>>> 
>>> The interesting thing to test (and ensure in the codegen) is that the 
>>> assignment to array_annotated->b does not get reordered to below the 
>>> __builtin_dynamic_object_size call since 

Re: HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Richard Biener



> Am 20.10.2023 um 19:09 schrieb Qing Zhao :
> 
> Sid,
> 
> (Richard, can you please help me to make sure this? Thanks a lot)
> 
> I studied a little bit more on the following question you raised during the 
> review process:
> 
> For the following small testing case: 
> 
>  1 struct annotated {
>  2   int foo;
>  3   char array[] __attribute__((counted_by (foo)));
>  4 };
>  5 
>  6 extern struct annotated * alloc_buf (int);
>  7 
>  8 int test (int sz)
>  9 {
> 10   struct annotated * array_annotated = alloc_buf (sz);
> 11   array_annotated->foo = sz;
> 12   return __builtin_dynamic_object_size (array_annotated->array, 1);
> 13 }
> 
> Whether the assignment of the size to the counted_by field at line 11 and the 
> consumer of the size at line 12 at call to __bdos might be reordered by GCC? 
> 
> The following is my thought:
> 
> 1. _bdos computation passes (both pass_early_object_sizes and 
> pass_object_sizes) are in the early stage of SSA optimizations. In which, 
> pass_early_object_sizes happens before almost all the optimizations, no 
> reordering is possible in this pass;
> 
> 2. Then how about the pass “pass_object_sizes”?
> 
>   Immediately after the pass_build_ssa,  the IR for the routine “test” is  
> with the SSA form: (compiled with -O3):
> 
>  1 int test (int sz)
>  2 {
>  3   struct annotated * array_annotated;
>  4   char[0:] * _1;
>  5   long unsigned int _2;
>  6   int _8;
>  7   
>  8:
>  9   array_annotated_6 = alloc_buf (sz_4(D));
> 10   array_annotated_6->foo = sz_4(D);
> 11   _1 = _annotated_6->array;
> 12   _2 = __builtin_dynamic_object_size (_1, 1);
> 13   _8 = (int) _2;
> 14   return _8; 
> 15 } 
> 
> In the above IR, the key portion is line 10 and line 11: (whether these two 
> lines might be reordered with SSA optimization?)
> 
> 10   array_annotated_6->foo = sz_4(D);
> 11   _1 = _annotated_6->array;
> 
> The major question here is: whether the SSA optimizations are able to 
> distinguish the object “array_annotated_6->foo” at line 10 is independent with
> the object “array_annotated-_6->array” at line 11?
> 
> If the SSA optimizations can distinguish “array_annotated_6->foo” from 
> “array_annotated_6->array”, then these two lines might be reordered.
> Otherwise, these two lines will not be reordered by SSA optimizations.
> 
> I am not very familiar with the details of the SSA optimizations, but my 
> guess is, two fields of the same structure might not be distinguished by the 
> SSA optimizations, then line 10 and line 11 will not be reordered by SSA 
> optimizations.
> 
> Richard, is my guess correct?

There is no data dependence between the memory access and the address 
computation so nothing prevents the reordering.  If you put another same bos 
call before the access I expect the addresses to be CSEd, effectively moving 
the later before the access.

Richard 

> Thanks a lot for your help.
> 
> Qing
> 
 On Oct 5, 2023, at 4:08 PM, Siddhesh Poyarekar  wrote:
>>> 
>>> I hope the review was helpful.  Overall, a couple of things to consider:
>>> 
>>> 1. How would you handle potential reordering between assignment of the size 
>>> to the counted_by field with the __bdos call that may consume it? You'll 
>>> probably need to express some kind of dependency there or in the worst 
>>> case, insert a barrier to disallow reordering.
>> 
>> Good point! 
>> 
>> So, your example in the respond to [V3][PATCH 2/3]Use the counted_by 
>> atribute info in builtin object size [PR108896]:
>> “
>> Maybe another test where the allocation, size assignment and __bdos call 
>> happen in the same function, where the allocator is not recognized by gcc:
>> 
>> void *
>> __attribute__ ((noinline))
>> alloc (size_t sz)
>> {
>> return __builtin_malloc (sz);
>> }
>> 
>> void test (size_t sz)
>> {
>> array_annotated = alloc (sz);
>> array_annotated->b = sz;
>> return __builtin_dynamic_object_size (array_annotated->c, 1);
>> }
>> 
>> The interesting thing to test (and ensure in the codegen) is that the 
>> assignment to array_annotated->b does not get reordered to below the 
>> __builtin_dynamic_object_size call since technically there is no data 
>> dependency between the two.
>> “
>> Will test on this. 
>> 
>> Not sure whether the current GCC alias analysis is able to distinguish one 
>> field of a structure from another field of the same structure, if YES, then
>> We need to add an explicit dependency edge from the write to 
>> “array_annotated->b” to the call to 
>> “__builtin_dynamic_object_size(array_annotated->c,1)”.
>> I will check on this and see how to resolve this issue.
>> 
>> I guess the possible solution is that we can add an implicit ref to 
>> “array_annotated->b” at the call to 
>> “__builtin_dynamic_object_size(array_annotated->c, 1)” if the counted_by 
>> attribute is available. That should resolve the issue.
>> 
>> Richard, what do you think on this?
>> 
> 


HELP: Will the reordering happen? Re: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-10-20 Thread Qing Zhao
Sid,

(Richard, can you please help me to make sure this? Thanks a lot)

I studied a little bit more on the following question you raised during the 
review process:

For the following small testing case: 

  1 struct annotated {
  2   int foo;
  3   char array[] __attribute__((counted_by (foo)));
  4 };
  5 
  6 extern struct annotated * alloc_buf (int);
  7 
  8 int test (int sz)
  9 {
 10   struct annotated * array_annotated = alloc_buf (sz);
 11   array_annotated->foo = sz;
 12   return __builtin_dynamic_object_size (array_annotated->array, 1);
 13 }

Whether the assignment of the size to the counted_by field at line 11 and the 
consumer of the size at line 12 at call to __bdos might be reordered by GCC? 

The following is my thought:

1. _bdos computation passes (both pass_early_object_sizes and 
pass_object_sizes) are in the early stage of SSA optimizations. In which, 
pass_early_object_sizes happens before almost all the optimizations, no 
reordering is possible in this pass;

2. Then how about the pass “pass_object_sizes”?

   Immediately after the pass_build_ssa,  the IR for the routine “test” is  
with the SSA form: (compiled with -O3):

  1 int test (int sz)
  2 {
  3   struct annotated * array_annotated;
  4   char[0:] * _1;
  5   long unsigned int _2;
  6   int _8;
  7   
  8:
  9   array_annotated_6 = alloc_buf (sz_4(D));
 10   array_annotated_6->foo = sz_4(D);
 11   _1 = _annotated_6->array;
 12   _2 = __builtin_dynamic_object_size (_1, 1);
 13   _8 = (int) _2;
 14   return _8; 
 15 } 

In the above IR, the key portion is line 10 and line 11: (whether these two 
lines might be reordered with SSA optimization?)

 10   array_annotated_6->foo = sz_4(D);
 11   _1 = _annotated_6->array;

The major question here is: whether the SSA optimizations are able to 
distinguish the object “array_annotated_6->foo” at line 10 is independent with
the object “array_annotated-_6->array” at line 11?

If the SSA optimizations can distinguish “array_annotated_6->foo” from 
“array_annotated_6->array”, then these two lines might be reordered.
Otherwise, these two lines will not be reordered by SSA optimizations.

I am not very familiar with the details of the SSA optimizations, but my guess 
is, two fields of the same structure might not be distinguished by the SSA 
optimizations, then line 10 and line 11 will not be reordered by SSA 
optimizations.

Richard, is my guess correct?

Thanks a lot for your help.

Qing

>> On Oct 5, 2023, at 4:08 PM, Siddhesh Poyarekar  wrote:
>> 
>> I hope the review was helpful.  Overall, a couple of things to consider:
>> 
>> 1. How would you handle potential reordering between assignment of the size 
>> to the counted_by field with the __bdos call that may consume it? You'll 
>> probably need to express some kind of dependency there or in the worst case, 
>> insert a barrier to disallow reordering.
> 
> Good point! 
> 
> So, your example in the respond to [V3][PATCH 2/3]Use the counted_by atribute 
> info in builtin object size [PR108896]:
> “
> Maybe another test where the allocation, size assignment and __bdos call 
> happen in the same function, where the allocator is not recognized by gcc:
> 
> void *
> __attribute__ ((noinline))
> alloc (size_t sz)
> {
> return __builtin_malloc (sz);
> }
> 
> void test (size_t sz)
> {
> array_annotated = alloc (sz);
> array_annotated->b = sz;
> return __builtin_dynamic_object_size (array_annotated->c, 1);
> }
> 
> The interesting thing to test (and ensure in the codegen) is that the 
> assignment to array_annotated->b does not get reordered to below the 
> __builtin_dynamic_object_size call since technically there is no data 
> dependency between the two.
> “
> Will test on this. 
> 
> Not sure whether the current GCC alias analysis is able to distinguish one 
> field of a structure from another field of the same structure, if YES, then
> We need to add an explicit dependency edge from the write to 
> “array_annotated->b” to the call to 
> “__builtin_dynamic_object_size(array_annotated->c,1)”.
> I will check on this and see how to resolve this issue.
> 
> I guess the possible solution is that we can add an implicit ref to 
> “array_annotated->b” at the call to 
> “__builtin_dynamic_object_size(array_annotated->c, 1)” if the counted_by 
> attribute is available. That should resolve the issue.
> 
> Richard, what do you think on this?
>