Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Qing Zhao via Gcc-patches


> On Aug 4, 2023, at 3:09 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-04 15:06, Qing Zhao wrote:
>>> Yes, that's what I'm thinking.
>>> 
> so `q` must be pointing to a single element.  So you could deduce:
> 
> 1. the minimum size of the whole object that q points to.
 You mean that the TYPE will determine the minimum size of the whole 
 object?  (Does this include the size of the flexible array member, or only 
 the other part of the structure except the flexible array member?)
>>> 
>>> Only the constant sized part of the structure.
>> Okay. I see.
>> But if the “counted_by” info is available, then from p->array, we can deduce 
>> the minimum size too, as sizeof(struct A) + q->foo * sizeof(int), right?
> 
> Yes.
> 
>>> 
> Actually for minimum size we'd also need a guarantee that 
> `alloc_buf_more` returns a valid allocated object.
 Why? Please explain a little bit here.
>>> 
>>> So `alloc_buf_more` could return NULL, a valid pointer or an invalid 
>>> pointer.  So, we could end up returning a non-zero minimum size for an 
>>> invalid or NULL pointer, which is incorrect, we don't know that.
>> I see what’ s you mean now.
>> However, if we already see p->array, then the p is guaranteed a valid 
>> pointer and not a NULL, right?  (We are discussing on 
>> __builtin_dynamic_object_size (q->array, 2), we see q->array already)
> 
> Yes, you could argue that for p->array, I agree, but not for p.

Agreed. Yes, for p->array, observed access. -:)

Looks like we can improve __builtin_dynamic_object_size  for the following case:
struct A
{
 size_t foo;
 int array[] __attribute__((counted_by (foo)));
};

extern struct fix * alloc_buf ();

int main ()
{
 struct fix *p = alloc_buf ();
 __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * sizeof(int); 
  /* with the current algorithm, it’s UNKNOWN */ 
 __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * sizeof(int); 
  /* with the current algorithm, it’s UNKNOWN */
}

I will add this improvement to __builtin_dynamic_object_size for FAM with 
“counted_by” attribute in a later patch after the initial patch is committed.

Thanks a lot for the help.

Qing
> 
>>> 
>>> We won't need the object validity guarantee for (2) beyond, e.g. guarding 
>>> against a new NULL pointer dereference because it's a *maximum* estimate; 
>>> an invalid or NULL pointer would have 0 size.  So for such cases, __bos(q, 
>>> 0) could return
>>> 
>>> sizeof(*q) + (q ? q->foo:0)
>>> 
>>> and __bos(q->array, 0) could be
>>> 
>>> sizeof(*q) + q->foo - offsetof(q, array)
>>> 
>>> There's no need to guard against a dereference in the second case because 
>>> the q->array dereference already assumes that q is valid.
>> q->array should also guarantee that q is a valid pointer for minimum size, 
>> right? Or do I miss anything here?
> 
> Yes.
> 
> Thanks,
> Sid



Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Siddhesh Poyarekar

On 2023-08-04 15:06, Qing Zhao wrote:

Yes, that's what I'm thinking.


so `q` must be pointing to a single element.  So you could deduce:

1. the minimum size of the whole object that q points to.

You mean that the TYPE will determine the minimum size of the whole object?  
(Does this include the size of the flexible array member, or only the other 
part of the structure except the flexible array member?)


Only the constant sized part of the structure.

Okay. I see.
But if the “counted_by” info is available, then from p->array, we can deduce the 
minimum size too, as sizeof(struct A) + q->foo * sizeof(int), right?


Yes.




Actually for minimum size we'd also need a guarantee that `alloc_buf_more` 
returns a valid allocated object.

Why? Please explain a little bit here.


So `alloc_buf_more` could return NULL, a valid pointer or an invalid pointer.  
So, we could end up returning a non-zero minimum size for an invalid or NULL 
pointer, which is incorrect, we don't know that.


I see what’ s you mean now.

However, if we already see p->array, then the p is guaranteed a valid pointer and not 
a NULL, right?  (We are discussing on __builtin_dynamic_object_size (q->array, 2), we 
see q->array already)


Yes, you could argue that for p->array, I agree, but not for p.



We won't need the object validity guarantee for (2) beyond, e.g. guarding 
against a new NULL pointer dereference because it's a *maximum* estimate; an 
invalid or NULL pointer would have 0 size.  So for such cases, __bos(q, 0) 
could return

sizeof(*q) + (q ? q->foo:0)

and __bos(q->array, 0) could be

sizeof(*q) + q->foo - offsetof(q, array)

There's no need to guard against a dereference in the second case because the 
q->array dereference already assumes that q is valid.


q->array should also guarantee that q is a valid pointer for minimum size, 
right? Or do I miss anything here?


Yes.

Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Qing Zhao via Gcc-patches


> On Aug 4, 2023, at 12:36 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-04 11:27, Qing Zhao wrote:
>>> On Aug 4, 2023, at 10:40 AM, Siddhesh Poyarekar  wrote:
>>> 
>>> On 2023-08-03 13:34, Qing Zhao wrote:
 One thing I need to point out first is, currently, even for regular fixed 
 size array in the structure,
 We have this same issue, for example:
 #define LENGTH 10
 struct fix {
   size_t foo;
   int array[LENGTH];
 };
 …
 int main ()
 {
   struct fix *p;
   p = alloc_buf_more ();
   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
   expect(__builtin_object_size(p->array, 0), -1);
 }
 Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for 
 it.
 This is not a special issue for flexible array member.
>>> 
>>> That's fine for fixed arrays at the end of a struct because the "whole 
>>> object" size could be anything; `p` could be pointing to the beginning of 
>>> an array for all we know.  If however `array` is strictly a flex array, 
>>> i.e.:
>>> 
>>> ```
>>> struct A
>>> {
>>>  size_t foo;
>>>  int array[];
>>> };
>>> ```
>>> 
>>> then there's no way in valid C to have an array of `struct fix`,
>> Yes!!   this is exactly the place that makes difference between structures 
>> with fixed arrays and the ones with flexible arrays.
>> With such difference, I guess that using the type of the structure with 
>> flexible array member for p->array to get the size of the whole object p 
>> point to might be reasonable?
> 
> Yes, that's what I'm thinking.
> 
>>> so `q` must be pointing to a single element.  So you could deduce:
>>> 
>>> 1. the minimum size of the whole object that q points to.
>> You mean that the TYPE will determine the minimum size of the whole object?  
>> (Does this include the size of the flexible array member, or only the other 
>> part of the structure except the flexible array member?)
> 
> Only the constant sized part of the structure.
Okay. I see.
But if the “counted_by” info is available, then from p->array, we can deduce 
the minimum size too, as sizeof(struct A) + q->foo * sizeof(int), right?
> 
>>> Actually for minimum size we'd also need a guarantee that `alloc_buf_more` 
>>> returns a valid allocated object.
>> Why? Please explain a little bit here.
> 
> So `alloc_buf_more` could return NULL, a valid pointer or an invalid pointer. 
>  So, we could end up returning a non-zero minimum size for an invalid or NULL 
> pointer, which is incorrect, we don't know that.

I see what’ s you mean now.

However, if we already see p->array, then the p is guaranteed a valid pointer 
and not a NULL, right?  (We are discussing on __builtin_dynamic_object_size 
(q->array, 2), we see q->array already)

> 
> We won't need the object validity guarantee for (2) beyond, e.g. guarding 
> against a new NULL pointer dereference because it's a *maximum* estimate; an 
> invalid or NULL pointer would have 0 size.  So for such cases, __bos(q, 0) 
> could return
> 
> sizeof(*q) + (q ? q->foo:0)
> 
> and __bos(q->array, 0) could be
> 
> sizeof(*q) + q->foo - offsetof(q, array)
> 
> There's no need to guard against a dereference in the second case because the 
> q->array dereference already assumes that q is valid.

q->array should also guarantee that q is a valid pointer for minimum size, 
right? Or do I miss anything here?

thanks.

Qing
> 
>>> 
>>> and
>>> 
>>> 2. if you're able to determine the size of the flex array (through 
>>> __element_count__(foo) for example), you could even determine the maximum 
>>> size of the whole object.
>>> 
>>> For (2) though, you'd break applications that overallocate and then expect 
>>> to be able to use that overallocation despite the space not being reflected 
>>> in the __element_count__.  I think it's a bug in the application and I 
>>> can't see a way for an application to be able to do this in a valid way so 
>>> I'm inclined towards breaking it.
>> Currently, we allow the situation when the allocation size for the whole 
>> object is larger than the value reflected in the “counted_by” attribute (the 
>> old name is __element_count__). But don’t allow the other way around (i.e, 
>> when the allocation size for the whole object is smaller than the value 
>> reflected in the “counted_by” attribute.
> 
> Right, that's going to be the "break".  For underallocation __bos will only 
> end up overestimating the space available, which is not ideal, but won't end 
> up breaking compatibility.
> 
>>> 
>>> Of course, the fact that gcc allows flex arrays to be in the middle of 
>>> structs breaks the base assumption but that's something we need to get rid 
>>> of anyway since there's no way for valid C programs to use that safely.
>> Since GCC14, we started to deprecate this extension (allow flex array to be 
>> in the middle of structs).
>> https://gcc.gnu.org/pipermail/gcc-cvs/2023-June/385730.html
> 
> Yes, that's what I'm banking on.
> 
> Thanks,
> Sid



Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Siddhesh Poyarekar

On 2023-08-04 11:27, Qing Zhao wrote:




On Aug 4, 2023, at 10:40 AM, Siddhesh Poyarekar  wrote:

On 2023-08-03 13:34, Qing Zhao wrote:

One thing I need to point out first is, currently, even for regular fixed size 
array in the structure,
We have this same issue, for example:
#define LENGTH 10
struct fix {
   size_t foo;
   int array[LENGTH];
};
…
int main ()
{
   struct fix *p;
   p = alloc_buf_more ();
   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
   expect(__builtin_object_size(p->array, 0), -1);
}
Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for it.
This is not a special issue for flexible array member.


That's fine for fixed arrays at the end of a struct because the "whole object" 
size could be anything; `p` could be pointing to the beginning of an array for all we 
know.  If however `array` is strictly a flex array, i.e.:

```
struct A
{
  size_t foo;
  int array[];
};
```

then there's no way in valid C to have an array of `struct fix`,


Yes!!   this is exactly the place that makes difference between structures with 
fixed arrays and the ones with flexible arrays.

With such difference, I guess that using the type of the structure with flexible 
array member for p->array to get the size of the whole object p point to might 
be reasonable?


Yes, that's what I'm thinking.


so `q` must be pointing to a single element.  So you could deduce:

1. the minimum size of the whole object that q points to.


You mean that the TYPE will determine the minimum size of the whole object?  
(Does this include the size of the flexible array member, or only the other 
part of the structure except the flexible array member?)


Only the constant sized part of the structure.


Actually for minimum size we'd also need a guarantee that `alloc_buf_more` 
returns a valid allocated object.


Why? Please explain a little bit here.


So `alloc_buf_more` could return NULL, a valid pointer or an invalid 
pointer.  So, we could end up returning a non-zero minimum size for an 
invalid or NULL pointer, which is incorrect, we don't know that.


We won't need the object validity guarantee for (2) beyond, e.g. 
guarding against a new NULL pointer dereference because it's a *maximum* 
estimate; an invalid or NULL pointer would have 0 size.  So for such 
cases, __bos(q, 0) could return


sizeof(*q) + (q ? q->foo:0)

and __bos(q->array, 0) could be

sizeof(*q) + q->foo - offsetof(q, array)

There's no need to guard against a dereference in the second case 
because the q->array dereference already assumes that q is valid.




and

2. if you're able to determine the size of the flex array (through 
__element_count__(foo) for example), you could even determine the maximum size 
of the whole object.

For (2) though, you'd break applications that overallocate and then expect to 
be able to use that overallocation despite the space not being reflected in the 
__element_count__.  I think it's a bug in the application and I can't see a way 
for an application to be able to do this in a valid way so I'm inclined towards 
breaking it.


Currently, we allow the situation when the allocation size for the whole object 
is larger than the value reflected in the “counted_by” attribute (the old name 
is __element_count__). But don’t allow the other way around (i.e, when the 
allocation size for the whole object is smaller than the value reflected in the 
“counted_by” attribute.


Right, that's going to be the "break".  For underallocation __bos will 
only end up overestimating the space available, which is not ideal, but 
won't end up breaking compatibility.




Of course, the fact that gcc allows flex arrays to be in the middle of structs 
breaks the base assumption but that's something we need to get rid of anyway 
since there's no way for valid C programs to use that safely.


Since GCC14, we started to deprecate this extension (allow flex array to be in 
the middle of structs).
https://gcc.gnu.org/pipermail/gcc-cvs/2023-June/385730.html


Yes, that's what I'm banking on.

Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Qing Zhao via Gcc-patches


> On Aug 4, 2023, at 10:42 AM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-04 10:40, Siddhesh Poyarekar wrote:
>> On 2023-08-03 13:34, Qing Zhao wrote:
>>> One thing I need to point out first is, currently, even for regular fixed 
>>> size array in the structure,
>>> We have this same issue, for example:
>>> 
>>> #define LENGTH 10
>>> 
>>> struct fix {
>>>size_t foo;
>>>int array[LENGTH];
>>> };
>>> 
>>> …
>>> int main ()
>>> {
>>>struct fix *p;
>>>p = alloc_buf_more ();
>>> 
>>>expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
>>>expect(__builtin_object_size(p->array, 0), -1);
>>> }
>>> 
>>> Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for 
>>> it.
>>> This is not a special issue for flexible array member.
>> That's fine for fixed arrays at the end of a struct because the "whole 
>> object" size could be anything; `p` could be pointing to the beginning of an 
>> array for all we know.  If however `array` is strictly a flex array, i.e.:
>> ```
>> struct A
>> {
>>   size_t foo;
>>   int array[];
>> };
>> ```
>> then there's no way in valid C to have an array of `struct fix`, so `q` must 
>> be pointing to a single element.  So you could deduce:
>> 1. the minimum size of the whole object that q points to.
> 
> Actually for minimum size we'd also need a guarantee that `alloc_buf_more` 
> returns a valid allocated object.

Why? Please explain a little bit here.

thanks.

Qing
> 
> Sid



Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Qing Zhao via Gcc-patches


> On Aug 4, 2023, at 10:40 AM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-03 13:34, Qing Zhao wrote:
>> One thing I need to point out first is, currently, even for regular fixed 
>> size array in the structure,
>> We have this same issue, for example:
>> #define LENGTH 10
>> struct fix {
>>   size_t foo;
>>   int array[LENGTH];
>> };
>> …
>> int main ()
>> {
>>   struct fix *p;
>>   p = alloc_buf_more ();
>>   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
>>   expect(__builtin_object_size(p->array, 0), -1);
>> }
>> Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for 
>> it.
>> This is not a special issue for flexible array member.
> 
> That's fine for fixed arrays at the end of a struct because the "whole 
> object" size could be anything; `p` could be pointing to the beginning of an 
> array for all we know.  If however `array` is strictly a flex array, i.e.:
> 
> ```
> struct A
> {
>  size_t foo;
>  int array[];
> };
> ```
> 
> then there's no way in valid C to have an array of `struct fix`,

Yes!!   this is exactly the place that makes difference between structures with 
fixed arrays and the ones with flexible arrays. 

With such difference, I guess that using the type of the structure with 
flexible array member for p->array to get the size of the whole object p point 
to might be reasonable? 

> so `q` must be pointing to a single element.  So you could deduce:
> 
> 1. the minimum size of the whole object that q points to.

You mean that the TYPE will determine the minimum size of the whole object?  
(Does this include the size of the flexible array member, or only the other 
part of the structure except the flexible array member?)

> 
> and
> 
> 2. if you're able to determine the size of the flex array (through 
> __element_count__(foo) for example), you could even determine the maximum 
> size of the whole object.
> 
> For (2) though, you'd break applications that overallocate and then expect to 
> be able to use that overallocation despite the space not being reflected in 
> the __element_count__.  I think it's a bug in the application and I can't see 
> a way for an application to be able to do this in a valid way so I'm inclined 
> towards breaking it.

Currently, we allow the situation when the allocation size for the whole object 
is larger than the value reflected in the “counted_by” attribute (the old name 
is __element_count__). But don’t allow the other way around (i.e, when the 
allocation size for the whole object is smaller than the value reflected in the 
“counted_by” attribute. 
> 
> Of course, the fact that gcc allows flex arrays to be in the middle of 
> structs breaks the base assumption but that's something we need to get rid of 
> anyway since there's no way for valid C programs to use that safely.

Since GCC14, we started to deprecate this extension (allow flex array to be in 
the middle of structs).
https://gcc.gnu.org/pipermail/gcc-cvs/2023-June/385730.html

Thanks.

Qing


> 
> Thanks,
> Sid



Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Siddhesh Poyarekar

On 2023-08-04 10:40, Siddhesh Poyarekar wrote:

On 2023-08-03 13:34, Qing Zhao wrote:
One thing I need to point out first is, currently, even for regular 
fixed size array in the structure,

We have this same issue, for example:

#define LENGTH 10

struct fix {
   size_t foo;
   int array[LENGTH];
};

…
int main ()
{
   struct fix *p;
   p = alloc_buf_more ();

   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
   expect(__builtin_object_size(p->array, 0), -1);
}

Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN 
for it.

This is not a special issue for flexible array member.


That's fine for fixed arrays at the end of a struct because the "whole 
object" size could be anything; `p` could be pointing to the beginning 
of an array for all we know.  If however `array` is strictly a flex 
array, i.e.:


```
struct A
{
   size_t foo;
   int array[];
};
```

then there's no way in valid C to have an array of `struct fix`, so `q` 
must be pointing to a single element.  So you could deduce:


1. the minimum size of the whole object that q points to.


Actually for minimum size we'd also need a guarantee that 
`alloc_buf_more` returns a valid allocated object.


Sid


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Siddhesh Poyarekar

On 2023-08-03 13:34, Qing Zhao wrote:

One thing I need to point out first is, currently, even for regular fixed size 
array in the structure,
We have this same issue, for example:

#define LENGTH 10

struct fix {
   size_t foo;
   int array[LENGTH];
};

…
int main ()
{
   struct fix *p;
   p = alloc_buf_more ();

   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
   expect(__builtin_object_size(p->array, 0), -1);
}

Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for it.
This is not a special issue for flexible array member.


That's fine for fixed arrays at the end of a struct because the "whole 
object" size could be anything; `p` could be pointing to the beginning 
of an array for all we know.  If however `array` is strictly a flex 
array, i.e.:


```
struct A
{
  size_t foo;
  int array[];
};
```

then there's no way in valid C to have an array of `struct fix`, so `q` 
must be pointing to a single element.  So you could deduce:


1. the minimum size of the whole object that q points to.

and

2. if you're able to determine the size of the flex array (through 
__element_count__(foo) for example), you could even determine the 
maximum size of the whole object.


For (2) though, you'd break applications that overallocate and then 
expect to be able to use that overallocation despite the space not being 
reflected in the __element_count__.  I think it's a bug in the 
application and I can't see a way for an application to be able to do 
this in a valid way so I'm inclined towards breaking it.


Of course, the fact that gcc allows flex arrays to be in the middle of 
structs breaks the base assumption but that's something we need to get 
rid of anyway since there's no way for valid C programs to use that safely.


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Qing Zhao via Gcc-patches


> On Aug 4, 2023, at 3:38 AM, Kees Cook  wrote:
> 
> On Thu, Aug 03, 2023 at 09:31:24PM +, Qing Zhao wrote:
>> So, the basic question is:
>> 
>> Given the following:
>> 
>> struct fix {
>>  int others;
>>  int array[10];
>> }
>> 
>> extern struct fix * alloc_buf ();
>> 
>> int main ()
>> {
>>  struct fix *p = alloc_buf ();
>>  __builtin_object_size(p->array,0) == ?
>> }
>> 
>> Given p->array, can the compiler determine that p points to an object that 
>> has TYPE struct fix?
>> 
>> If the answer is YES, then the current__builtin_object_size algorithm can be 
>> improved to determine __builtin_object_size(p->array, 0)  with the TYPE of 
>> the struct fix.
> 
> I think it is fine to leave __bos(..., 0) as-is. From the Linux kernel's
> use of __bos, we are almost exclusively only interesting the mode 1, not
> node 0. :)

Okay, that’s good to know.

Qing
> 
> -- 
> Kees Cook



Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Kees Cook via Gcc-patches
On Thu, Aug 03, 2023 at 09:31:24PM +, Qing Zhao wrote:
> So, the basic question is:
> 
> Given the following:
> 
> struct fix {
>   int others;
>   int array[10];
> }
> 
> extern struct fix * alloc_buf ();
> 
> int main ()
> {
>   struct fix *p = alloc_buf ();
>   __builtin_object_size(p->array,0) == ?
> }
> 
> Given p->array, can the compiler determine that p points to an object that 
> has TYPE struct fix?
> 
> If the answer is YES, then the current__builtin_object_size algorithm can be 
> improved to determine __builtin_object_size(p->array, 0)  with the TYPE of 
> the struct fix.

I think it is fine to leave __bos(..., 0) as-is. From the Linux kernel's
use of __bos, we are almost exclusively only interesting the mode 1, not
node 0. :)

-- 
Kees Cook


Re: One question on the source code of tree-object-size.cc

2023-08-04 Thread Kees Cook via Gcc-patches
On Thu, Aug 03, 2023 at 07:55:54PM +, Qing Zhao wrote:
> 
> 
> > On Aug 3, 2023, at 1:51 PM, Kees Cook  wrote:
> > 
> > On August 3, 2023 10:34:24 AM PDT, Qing Zhao  wrote:
> >> One thing I need to point out first is, currently, even for regular fixed 
> >> size array in the structure,
> >> We have this same issue, for example:
> >> 
> >> #define LENGTH 10
> >> 
> >> struct fix {
> >> size_t foo;
> >> int array[LENGTH];
> >> };
> >> 
> >> …
> >> int main ()
> >> {
> >> struct fix *p;
> >> p = alloc_buf_more ();
> >> 
> >> expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
> >> expect(__builtin_object_size(p->array, 0), -1);
> >> }
> >> 
> >> Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for 
> >> it.
> >> This is not a special issue for flexible array member.
> > 
> > Is this true with -fstrict-flex-arrays=3 ?
> 
> Yes. 

Okay, right, I understand now -- it doesn't see the allocation, therefore
max size is unknown. Sounds good.

-Kees

-- 
Kees Cook


Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Qing Zhao via Gcc-patches
So, the basic question is:

Given the following:

struct fix {
  int others;
  int array[10];
}

extern struct fix * alloc_buf ();

int main ()
{
  struct fix *p = alloc_buf ();
  __builtin_object_size(p->array,0) == ?
}

Given p->array, can the compiler determine that p points to an object that has 
TYPE struct fix?

If the answer is YES, then the current__builtin_object_size algorithm can be 
improved to determine __builtin_object_size(p->array, 0)  with the TYPE of the 
struct fix.

Qing


> On Aug 3, 2023, at 1:34 PM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> One thing I need to point out first is, currently, even for regular fixed 
> size array in the structure,
> We have this same issue, for example:
> 
> #define LENGTH 10
> 
> struct fix {
>  size_t foo;
>  int array[LENGTH];
> };
> 
> …
> int main ()
> {
>  struct fix *p;
>  p = alloc_buf_more ();
> 
>  expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
>  expect(__builtin_object_size(p->array, 0), -1);
> }
> 
> Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for it.
> This is not a special issue for flexible array member.
> 
> Qing
> 
> 
> On Aug 3, 2023, at 1:19 PM, Siddhesh Poyarekar  wrote:
>> 
>> On 2023-08-03 12:43, Qing Zhao wrote:
 Surely we could emit that for __bdos(q->array, 0) though, couldn't we?
>>> For __bdos(q->array, 0), we only have the access info for the sub-object 
>>> q->array, we can surely decide the size of the sub-object q->array, but we 
>>> still cannot
>>> decide the whole object that is pointed by q (the same reason as above), 
>>> right?
>> 
>> It's tricky, I mean we could assume p to be a valid object due to the 
>> dereference and hence assume that q->foo is also valid and that there's at 
>> least sizeof(*q) + q->foo * sizeof (q->array) bytes available.  The question 
>> then is whether q could be pointing to an element of an array of `struct 
>> annotated`.  Could we ever have a valid array of such structs that have a 
>> flex array at the end?  Wouldn't it always be a single object?
>> 
>> In fact for all pointers to such structs with a flex array at the end, could 
>> we always assume that it is a single object and never part of an array, and 
>> hence return sizeof()?
>> 
>> Thanks,
>> Sid
> 



Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Qing Zhao via Gcc-patches


> On Aug 3, 2023, at 1:51 PM, Kees Cook  wrote:
> 
> On August 3, 2023 10:34:24 AM PDT, Qing Zhao  wrote:
>> One thing I need to point out first is, currently, even for regular fixed 
>> size array in the structure,
>> We have this same issue, for example:
>> 
>> #define LENGTH 10
>> 
>> struct fix {
>> size_t foo;
>> int array[LENGTH];
>> };
>> 
>> …
>> int main ()
>> {
>> struct fix *p;
>> p = alloc_buf_more ();
>> 
>> expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
>> expect(__builtin_object_size(p->array, 0), -1);
>> }
>> 
>> Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for 
>> it.
>> This is not a special issue for flexible array member.
> 
> Is this true with -fstrict-flex-arrays=3 ?

Yes. 


Please see the following testing case:
#include 
#include 

#define LENGTH 10
#define SIZE_BUMP 5 
#define noinline __attribute__((__noinline__))

struct fix {
  size_t foo;
  int array[LENGTH]; 
};

#define expect(p, _v) do { \
size_t v = _v; \
if (p == v) \
__builtin_printf ("ok:  %s == %zd\n", #p, p); \
else \
{  \
  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
} \
} while (0);


/* in the following function, malloc allocated more space than size of the 
   struct fix.  Then what's the correct behavior we expect
   the __builtin_object_size should have for the following?
 */

static struct fix * noinline alloc_buf_more ()
{
  struct fix *p;
  p = malloc(sizeof (struct fix) + SIZE_BUMP * sizeof (int)); 

  /*when checking the observed access p->array, we have info on both
observered allocation and observed access, 
A. from observed allocation (alloc_size): (LENGTH + SIZE_BUMP) * sizeof 
(int)
B. from observed access (TYPE): LENGTH * sizeof (int)
   */
   
  /* for MAXIMUM size in the whole object: currently, GCC always used the A.  */
  expect(__builtin_object_size(p->array, 0), (LENGTH + SIZE_BUMP) * 
sizeof(int));

  /* for MAXIMUM size in the sub-object: currently, GCC chose the smaller
 one among these two: B.  */
  expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));

  return p;
}

/* in the following function, malloc allocated less space than size of the 
   struct fix.  Then what's the correct behavior we expect
   the __builtin_object_size should have for the following?
 */

static struct fix * noinline alloc_buf_less ()
{
  struct fix *p;
  p = malloc(sizeof (struct fix) - SIZE_BUMP * sizeof (int)); 

  /*when checking the observed access p->array, we have info on both
observered allocation and observed access, 
A. from observed allocation (alloc_size): (LENGTH - SIZE_BUMP) * sizeof 
(int)
B. from observed access (TYPE): LENGTH * sizeof (int)
   */
   
  /* for MAXIMUM size in the whole object: currently, GCC always used the A.  */
  expect(__builtin_object_size(p->array, 0), (LENGTH - SIZE_BUMP) * 
sizeof(int));

  /* for MAXIMUM size in the sub-object: currently, GCC chose the smaller
 one among these two: B.  */
  expect(__builtin_object_size(p->array, 1), (LENGTH - SIZE_BUMP) * 
sizeof(int));

  return p;
}


int main ()
{
  struct fix *p, *q; 
  p = alloc_buf_more ();
  q = alloc_buf_less ();

  expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
  expect(__builtin_object_size(p->array, 0), -1);
  /*when checking the pointer p, we have no observed allocation nor observed 
access.
therefore, we cannot determine the size info here.  */
  expect(__builtin_object_size(p, 1), -1);
  expect(__builtin_object_size(p, 0), -1);

  /*when checking the observed access p->array, we only have info on the
observed access, i.e, the TYPE_SIZE info from the access. We don't have
info on the whole object.  */
  expect(__builtin_object_size(q->array, 1), LENGTH * sizeof(int));
  expect(__builtin_object_size(q->array, 0), -1);
  /*when checking the pointer p, we have no observed allocation nor observed 
access.
therefore, we cannot determine the size info here.  */
  expect(__builtin_object_size(q, 1), -1);
  expect(__builtin_object_size(q, 0), -1);

  return 0;
}

[opc@qinzhao-ol8u3-x86 108896]$ sh t
/home/opc/Install/latest/bin/gcc -O -fstrict-flex-arrays=3 t28.c
ok:  __builtin_object_size(p->array, 0) == 60
ok:  __builtin_object_size(p->array, 1) == 40
ok:  __builtin_object_size(p->array, 0) == 20
ok:  __builtin_object_size(p->array, 1) == 20
ok:  __builtin_object_size(p->array, 1) == 40
ok:  __builtin_object_size(p->array, 0) == -1
ok:  __builtin_object_size(p, 1) == -1
ok:  __builtin_object_size(p, 0) == -1
ok:  __builtin_object_size(q->array, 1) == 40
ok:  __builtin_object_size(q->array, 0) == -1
ok:  __builtin_object_size(q, 1) == -1
ok:  __builtin_object_size(q, 0) == -1
[opc@qinzhao-ol8u3-x86 108896]$ 

> 
> -Kees
> 
>> 
>> Qing
>> 
>> 
>> On Aug 3, 2023, at 1:19 PM, Siddhesh Poyarekar  wrote:
>>> 
>>> On 2023-08-03 12:43, Qing Zhao wrote:
> Surely we could emit that for __bdos(q->array, 0) though, couldn't we?
 For 

Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Kees Cook via Gcc-patches
On August 3, 2023 10:34:24 AM PDT, Qing Zhao  wrote:
>One thing I need to point out first is, currently, even for regular fixed size 
>array in the structure,
>We have this same issue, for example:
>
>#define LENGTH 10
>
>struct fix {
>  size_t foo;
>  int array[LENGTH];
>};
>
>…
>int main ()
>{
>  struct fix *p;
>  p = alloc_buf_more ();
>
>  expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
>  expect(__builtin_object_size(p->array, 0), -1);
>}
>
>Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for it.
>This is not a special issue for flexible array member.

Is this true with -fstrict-flex-arrays=3 ?

-Kees

>
>Qing
>
>
>On Aug 3, 2023, at 1:19 PM, Siddhesh Poyarekar  wrote:
>> 
>> On 2023-08-03 12:43, Qing Zhao wrote:
  Surely we could emit that for __bdos(q->array, 0) though, couldn't we?
>>> For __bdos(q->array, 0), we only have the access info for the sub-object 
>>> q->array, we can surely decide the size of the sub-object q->array, but we 
>>> still cannot
>>> decide the whole object that is pointed by q (the same reason as above), 
>>> right?
>> 
>> It's tricky, I mean we could assume p to be a valid object due to the 
>> dereference and hence assume that q->foo is also valid and that there's at 
>> least sizeof(*q) + q->foo * sizeof (q->array) bytes available.  The question 
>> then is whether q could be pointing to an element of an array of `struct 
>> annotated`.  Could we ever have a valid array of such structs that have a 
>> flex array at the end?  Wouldn't it always be a single object?
>> 
>> In fact for all pointers to such structs with a flex array at the end, could 
>> we always assume that it is a single object and never part of an array, and 
>> hence return sizeof()?
>> 
>> Thanks,
>> Sid
>


-- 
Kees Cook


Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Qing Zhao via Gcc-patches
One thing I need to point out first is, currently, even for regular fixed size 
array in the structure,
We have this same issue, for example:

#define LENGTH 10

struct fix {
  size_t foo;
  int array[LENGTH];
};

…
int main ()
{
  struct fix *p;
  p = alloc_buf_more ();

  expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));
  expect(__builtin_object_size(p->array, 0), -1);
}

Currently, for __builtin_object_size(p->array, 0),  GCC return UNKNOWN for it.
This is not a special issue for flexible array member.

Qing


On Aug 3, 2023, at 1:19 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-03 12:43, Qing Zhao wrote:
>>>  Surely we could emit that for __bdos(q->array, 0) though, couldn't we?
>> For __bdos(q->array, 0), we only have the access info for the sub-object 
>> q->array, we can surely decide the size of the sub-object q->array, but we 
>> still cannot
>> decide the whole object that is pointed by q (the same reason as above), 
>> right?
> 
> It's tricky, I mean we could assume p to be a valid object due to the 
> dereference and hence assume that q->foo is also valid and that there's at 
> least sizeof(*q) + q->foo * sizeof (q->array) bytes available.  The question 
> then is whether q could be pointing to an element of an array of `struct 
> annotated`.  Could we ever have a valid array of such structs that have a 
> flex array at the end?  Wouldn't it always be a single object?
> 
> In fact for all pointers to such structs with a flex array at the end, could 
> we always assume that it is a single object and never part of an array, and 
> hence return sizeof()?
> 
> Thanks,
> Sid



Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Siddhesh Poyarekar

On 2023-08-03 12:43, Qing Zhao wrote:

  Surely we could emit that for __bdos(q->array, 0) though, couldn't we?


For __bdos(q->array, 0), we only have the access info for the sub-object q->array, 
we can surely decide the size of the sub-object q->array, but we still cannot
decide the whole object that is pointed by q (the same reason as above), right?


It's tricky, I mean we could assume p to be a valid object due to the 
dereference and hence assume that q->foo is also valid and that there's 
at least sizeof(*q) + q->foo * sizeof (q->array) bytes available.  The 
question then is whether q could be pointing to an element of an array 
of `struct annotated`.  Could we ever have a valid array of such structs 
that have a flex array at the end?  Wouldn't it always be a single object?


In fact for all pointers to such structs with a flex array at the end, 
could we always assume that it is a single object and never part of an 
array, and hence return sizeof()?


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Qing Zhao via Gcc-patches



> On Aug 3, 2023, at 12:15 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-08-02 10:02, Qing Zhao wrote:
>>   /*when checking the observed access p->array, we only have info on the
>> observed access, i.e, the TYPE_SIZE info from the access. We don't have
>> info on the whole object.  */
>>   expect(__builtin_dynamic_object_size(q->array, 1), q->foo * sizeof(int));
>>   expect(__builtin_dynamic_object_size(q->array, 0), -1);
>>   expect(__builtin_dynamic_object_size(q->array, 3), q->foo * sizeof(int));
>>   expect(__builtin_dynamic_object_size(q->array, 2), 0);
>>   /*when checking the pointer p, we have no observed allocation nor observed 
>> access.
>> therefore, we cannot determine the size info here.  */
>>   expect(__builtin_dynamic_object_size(q, 1), -1);
>>   expect(__builtin_dynamic_object_size(q, 0), -1);
>>   expect(__builtin_dynamic_object_size(q, 3), 0);
>>   expect(__builtin_dynamic_object_size(q, 2), 0);
> 
> I'm wondering if we could sizeof (*q) + q->foo for __bdos(q, 0), but I 
> suppose it could mean generating code that potentially dereferences an 
> invalid pointer.

I think for __bdos(q, 0), if there is no allocation information for q, we 
cannot decide the size for its pointee.

>  Surely we could emit that for __bdos(q->array, 0) though, couldn't we?

For __bdos(q->array, 0), we only have the access info for the sub-object 
q->array, we can surely decide the size of the sub-object q->array, but we 
still cannot
decide the whole object that is pointed by q (the same reason as above), right?

Qing
> 
> Thanks,
> Sid



Re: One question on the source code of tree-object-size.cc

2023-08-03 Thread Siddhesh Poyarekar

On 2023-08-02 10:02, Qing Zhao wrote:

   /*when checking the observed access p->array, we only have info on the
 observed access, i.e, the TYPE_SIZE info from the access. We don't have
 info on the whole object.  */
   expect(__builtin_dynamic_object_size(q->array, 1), q->foo * sizeof(int));
   expect(__builtin_dynamic_object_size(q->array, 0), -1);
   expect(__builtin_dynamic_object_size(q->array, 3), q->foo * sizeof(int));
   expect(__builtin_dynamic_object_size(q->array, 2), 0);
   /*when checking the pointer p, we have no observed allocation nor observed 
access.
 therefore, we cannot determine the size info here.  */
   expect(__builtin_dynamic_object_size(q, 1), -1);
   expect(__builtin_dynamic_object_size(q, 0), -1);
   expect(__builtin_dynamic_object_size(q, 3), 0);
   expect(__builtin_dynamic_object_size(q, 2), 0);


I'm wondering if we could sizeof (*q) + q->foo for __bdos(q, 0), but I 
suppose it could mean generating code that potentially dereferences an 
invalid pointer.  Surely we could emit that for __bdos(q->array, 0) 
though, couldn't we?


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-02 Thread Qing Zhao via Gcc-patches
Okay.  This previous small example was used to show the correct behavior of 
__bos 
for Fixed arrays when the allocation size and the TYPE_SIZE are mismatched. 

Now we agreed on the correct behavior for each of the cases for the fixed array.

Since the new “counted_by” attribute is mainly a complement to the TYPE_SIZE 
for the flexible array member.
So, GCC should just use it similarly as TYPE_SIZE. 

Based on the fixed array example, I came up a small example for the flexible 
array member with “counted_by” attribute,
And the expected correct behavior for each of the cases. 
I also put detailed comments into the example to explain why for each case. 
(Similar as the fixed array example)

Please take a look at this example and let me know any issue you see.
With my private GCC that support “counted_by” attribute, all the cases passed.

Thanks.

Qing.


#include 
#include 

struct annotated {
size_t foo;
int array[] __attribute__((counted_by (foo)));
};

#define expect(p, _v) do { \
size_t v = _v; \
if (p == v) \
__builtin_printf ("ok:  %s == %zd\n", #p, p); \
else \
{  \
  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
} \
} while (0);

#define noinline __attribute__((__noinline__))
#define SIZE_BUMP 5 

/* In general, Due to type casting, the type for the pointee of a pointer
   does not say anything about the object it points to,
   So, __builtin_object_size can not directly use the type of the pointee
   to decide the size of the object the pointer points to.

   there are only two reliable ways:
   A. observed allocations  (call to the allocation functions in the routine)
   B. observed accesses (read or write access to the location of the 
 pointer points to)

   that provide information about the type/existence of an object at
   the corresponding address.

   for A, we use the "alloc_size" attribute for the corresponding allocation
   functions to determine the object size;

   For B, we use the SIZE info of the TYPE attached to the corresponding access.
   (We treat counted_by attribute as a complement to the SIZE info of the TYPE
for FMA) 

   The only other way in C which ensures that a pointer actually points
   to an object of the correct type is 'static':

   void foo(struct P *p[static 1]);   

   See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
   for more details.  */

/* in the following function, malloc allocated more space than the value
   of counted_by attribute.  Then what's the correct behavior we expect
   the __builtin_dynamic_object_size should have for each of the cases?  */ 

static struct annotated * noinline alloc_buf_more (int index)
{
  struct annotated *p;
  p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int));
  p->foo = index;

  /*when checking the observed access p->array, we have info on both
observered allocation and observed access, 
A. from observed allocation: (index + SIZE_BUMP) * sizeof (int)
B. from observed access: p->foo * sizeof (int)

in the above, p->foo = index.
   */
   
  /* for size in the whole object: always uses A.  */
  /* for size in the sub-object: chose the smaller of A and B.
   * Please see https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625891.html
   * for details on why.  */

  /* for MAXIMUM size in the whole object: use the allocation size 
 for the whole object.  */
  expect(__builtin_dynamic_object_size(p->array, 0), (index + SIZE_BUMP) * 
sizeof(int));

  /* for MAXIMUM size in the sub-object. use the smaller of A and B.  */ 
  expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int));

  /* for MINIMUM size in the whole object: use the allocation size 
 for the whole object.  */
  expect(__builtin_dynamic_object_size(p->array, 2), (index + SIZE_BUMP) * 
sizeof(int));

  /* for MINIMUM size in the sub-object: use the smaller of A and B.  */
  expect(__builtin_dynamic_object_size(p->array, 3), p->foo * sizeof(int));

  /*when checking the pointer p, we only have info on the observed allocation.
So, the object size info can only been obtained from the call to malloc.
for both MAXIMUM and MINIMUM: A = (index + SIZE_BUMP) * sizeof (int)  */ 
  expect(__builtin_dynamic_object_size(p, 1), sizeof (*p) + (index + SIZE_BUMP) 
* sizeof(int));
  expect(__builtin_dynamic_object_size(p, 0), sizeof (*p) + (index + SIZE_BUMP) 
* sizeof(int));
  expect(__builtin_dynamic_object_size(p, 3), sizeof (*p) + (index + SIZE_BUMP) 
* sizeof(int));
  expect(__builtin_dynamic_object_size(p, 2), sizeof (*p) + (index + SIZE_BUMP) 
* sizeof(int));
  return p;
}

/* in the following function, malloc allocated less space than the value
   of counted_by attribute.  Then what's the correct behavior we expect
   the __builtin_dynamic_object_size should have for each of the cases?
   NOTE: this is an user error, GCC should 

Re: One question on the source code of tree-object-size.cc

2023-08-01 Thread Siddhesh Poyarekar

On 2023-08-01 18:57, Kees Cook wrote:


   return p;
}

/* in the following function, malloc allocated less space than size of the
struct fix.  Then what's the correct behavior we expect
the __builtin_object_size should have for the following?
  */

static struct fix * noinline alloc_buf_less ()
{
   struct fix *p;
   p = malloc(sizeof (struct fix) - SIZE_BUMP * sizeof (int));

   /*when checking the observed access p->array, we have info on both
 observered allocation and observed access,
 A. from observed allocation (alloc_size): (LENGTH - SIZE_BUMP) * sizeof 
(int)
 B. from observed access (TYPE): LENGTH * sizeof (int)
*/

   /* for MAXIMUM size in the whole object: currently, GCC always used the A.  */

   expect(__builtin_object_size(p->array, 0), (LENGTH - SIZE_BUMP) * 
sizeof(int));


ok:  __builtin_object_size(p->array, 0) == 20

My brain just melted a little, as this is now an under-sized instance of
"p", so we have an incomplete allocation. (I would expect -Warray-bounds
to yell very loudly for this.) But, technically, yes, this looks like
the right calculation.


AFAIK, -Warray-bounds will only yell in case of a dereference that the 
compiler may potentially see as being beyond that 20 byte bound; it 
won't actually see the undersized allocation.  An analyzer warning would 
be useful for just the undersized allocation regardless of whether the 
code actually ends up accessing the object beyond the allocation bounds.


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-01 Thread Siddhesh Poyarekar

On 2023-08-01 17:35, Qing Zhao wrote:

typedef struct
{
   int a;
} A;
size_t f()
{
   A *p = malloc (1);
   return __builtin_object_size (p, 0);


Correction, that should be __builtin_object_size (p->a, 0).


Actually, it should be __builtin_object_size(p->a, 1).
For __builtin_object_size(p->a,0), gcc always uses the allocation size for the 
whole object.


Right, sorry, I mistyped, twice in fact; it should have been 
__bos(>a, 1) :)




GCC’s current behavior is:

For the size of the whole object, GCC currently always uses the allocation size.
And for the size in the sub-object, GCC chose the smaller one among the 
allocation size and the TYPE_SIZE.

Is this correct behavior?


Yes, it's deliberate; it specifically checks on var != pt_var, which can 
only be true for subobjects.


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-08-01 Thread Kees Cook via Gcc-patches
On Tue, Aug 01, 2023 at 09:35:30PM +, Qing Zhao wrote:
> 
> 
> > On Jul 31, 2023, at 1:07 PM, Siddhesh Poyarekar  wrote:
> > 
> > On 2023-07-31 13:03, Siddhesh Poyarekar wrote:
> >> On 2023-07-31 12:47, Qing Zhao wrote:
> >>> Hi, Sid and Jakub,
> >>> 
> >>> I have a question in the following source portion of the routine 
> >>> “addr_object_size” of gcc/tree-object-size.cc:
> >>> 
> >>>   743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
> >>>   744   if (bytes != error_mark_node)
> >>>   745 {
> >>>   746   bytes = size_for_offset (var_size, bytes);
> >>>   747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
> >>> MEM_REF)
> >>>   748 {
> >>>   749   tree bytes2 = compute_object_offset (TREE_OPERAND 
> >>> (ptr, 0),
> >>>   750pt_var);
> >>>   751   if (bytes2 != error_mark_node)
> >>>   752 {
> >>>   753   bytes2 = size_for_offset (pt_var_size, bytes2);
> >>>   754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
> >>>   755 }
> >>>   756 }
> >>>   757 }
> >>> 
> >>> At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM 
> >>> or not?
> >>> Shall we use
> >>> 
> >>> (object_size_type & OST_MINIMUM
> >>>  ? MIN_EXPR : MAX_EXPR)
> >>> 
> >> That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like 
> >> this:
> >> typedef struct
> >> {
> >>   int a;
> >> } A;
> >> size_t f()
> >> {
> >>   A *p = malloc (1);
> >>   return __builtin_object_size (p, 0);
> > 
> > Correction, that should be __builtin_object_size (p->a, 0).
> 
> Actually, it should be __builtin_object_size(p->a, 1).
> For __builtin_object_size(p->a,0), gcc always uses the allocation size for 
> the whole object.
> 
> GCC’s current behavior is:
> 
> For the size of the whole object, GCC currently always uses the allocation 
> size. 
> And for the size in the sub-object, GCC chose the smaller one among the 
> allocation size and the TYPE_SIZE. 
> 
> Is this correct behavior?
> 
> thanks.
> 
> Qing
> 
> Please see the following small example to show the above behavior:
> 
> =
> 
> #include 
> #include 
> 
> #define LENGTH 10
> #define SIZE_BUMP 5 
> #define noinline __attribute__((__noinline__))
> 
> struct fix {
>   size_t foo;
>   int array[LENGTH]; 
> };
> 
> #define expect(p, _v) do { \
> size_t v = _v; \
> if (p == v) \
> __builtin_printf ("ok:  %s == %zd\n", #p, p); \
> else \
> {  \
>   __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
> } \
> } while (0);
> 
> 
> /* in the following function, malloc allocated more space than size of the 
>struct fix.  Then what's the correct behavior we expect
>the __builtin_object_size should have for the following?
>  */
> 
> static struct fix * noinline alloc_buf_more ()
> {
>   struct fix *p;
>   p = malloc(sizeof (struct fix) + SIZE_BUMP * sizeof (int)); 
> 
>   /*when checking the observed access p->array, we have info on both
> observered allocation and observed access, 
> A. from observed allocation (alloc_size): (LENGTH + SIZE_BUMP) * sizeof 
> (int)
> B. from observed access (TYPE): LENGTH * sizeof (int)
>*/
>
>   /* for MAXIMUM size in the whole object: currently, GCC always used the A.  
> */
>   expect(__builtin_object_size(p->array, 0), (LENGTH + SIZE_BUMP) * 
> sizeof(int));

ok:  __builtin_object_size(p->array, 0) == 60

This is what I'd expect, yes: all memory from "array" to end of
allocation, and that matches here: (LENGTH + SIZE_BUMP) * sizeof(int)

> 
>   /* for MAXIMUM size in the sub-object: currently, GCC chose the smaller
>  one among these two: B.  */
>   expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));

ok:  __builtin_object_size(p->array, 1) == 40

Also as I'd expect: just LENGTH * sizeof(int), the remaining bytes
starting at "array", based on type info, regardless of rest of allocation.

> 
>   return p;
> }
> 
> /* in the following function, malloc allocated less space than size of the 
>struct fix.  Then what's the correct behavior we expect
>the __builtin_object_size should have for the following?
>  */
> 
> static struct fix * noinline alloc_buf_less ()
> {
>   struct fix *p;
>   p = malloc(sizeof (struct fix) - SIZE_BUMP * sizeof (int)); 
> 
>   /*when checking the observed access p->array, we have info on both
> observered allocation and observed access, 
> A. from observed allocation (alloc_size): (LENGTH - SIZE_BUMP) * sizeof 
> (int)
> B. from observed access (TYPE): LENGTH * sizeof (int)
>*/
>
>   /* for MAXIMUM size in the whole object: currently, GCC always used the A.  
> */
>   expect(__builtin_object_size(p->array, 0), (LENGTH - SIZE_BUMP) * 
> sizeof(int));

ok:  __builtin_object_size(p->array, 0) == 20

My brain just melted a 

Re: One question on the source code of tree-object-size.cc

2023-08-01 Thread Qing Zhao via Gcc-patches


> On Jul 31, 2023, at 1:07 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-07-31 13:03, Siddhesh Poyarekar wrote:
>> On 2023-07-31 12:47, Qing Zhao wrote:
>>> Hi, Sid and Jakub,
>>> 
>>> I have a question in the following source portion of the routine 
>>> “addr_object_size” of gcc/tree-object-size.cc:
>>> 
>>>   743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
>>>   744   if (bytes != error_mark_node)
>>>   745 {
>>>   746   bytes = size_for_offset (var_size, bytes);
>>>   747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
>>> MEM_REF)
>>>   748 {
>>>   749   tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 
>>> 0),
>>>   750pt_var);
>>>   751   if (bytes2 != error_mark_node)
>>>   752 {
>>>   753   bytes2 = size_for_offset (pt_var_size, bytes2);
>>>   754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
>>>   755 }
>>>   756 }
>>>   757 }
>>> 
>>> At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or 
>>> not?
>>> Shall we use
>>> 
>>> (object_size_type & OST_MINIMUM
>>>  ? MIN_EXPR : MAX_EXPR)
>>> 
>> That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like 
>> this:
>> typedef struct
>> {
>>   int a;
>> } A;
>> size_t f()
>> {
>>   A *p = malloc (1);
>>   return __builtin_object_size (p, 0);
> 
> Correction, that should be __builtin_object_size (p->a, 0).

Actually, it should be __builtin_object_size(p->a, 1).
For __builtin_object_size(p->a,0), gcc always uses the allocation size for the 
whole object.

GCC’s current behavior is:

For the size of the whole object, GCC currently always uses the allocation 
size. 
And for the size in the sub-object, GCC chose the smaller one among the 
allocation size and the TYPE_SIZE. 

Is this correct behavior?

thanks.

Qing

Please see the following small example to show the above behavior:

=

#include 
#include 

#define LENGTH 10
#define SIZE_BUMP 5 
#define noinline __attribute__((__noinline__))

struct fix {
  size_t foo;
  int array[LENGTH]; 
};

#define expect(p, _v) do { \
size_t v = _v; \
if (p == v) \
__builtin_printf ("ok:  %s == %zd\n", #p, p); \
else \
{  \
  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
} \
} while (0);


/* in the following function, malloc allocated more space than size of the 
   struct fix.  Then what's the correct behavior we expect
   the __builtin_object_size should have for the following?
 */

static struct fix * noinline alloc_buf_more ()
{
  struct fix *p;
  p = malloc(sizeof (struct fix) + SIZE_BUMP * sizeof (int)); 

  /*when checking the observed access p->array, we have info on both
observered allocation and observed access, 
A. from observed allocation (alloc_size): (LENGTH + SIZE_BUMP) * sizeof 
(int)
B. from observed access (TYPE): LENGTH * sizeof (int)
   */
   
  /* for MAXIMUM size in the whole object: currently, GCC always used the A.  */
  expect(__builtin_object_size(p->array, 0), (LENGTH + SIZE_BUMP) * 
sizeof(int));

  /* for MAXIMUM size in the sub-object: currently, GCC chose the smaller
 one among these two: B.  */
  expect(__builtin_object_size(p->array, 1), LENGTH * sizeof(int));

  return p;
}

/* in the following function, malloc allocated less space than size of the 
   struct fix.  Then what's the correct behavior we expect
   the __builtin_object_size should have for the following?
 */

static struct fix * noinline alloc_buf_less ()
{
  struct fix *p;
  p = malloc(sizeof (struct fix) - SIZE_BUMP * sizeof (int)); 

  /*when checking the observed access p->array, we have info on both
observered allocation and observed access, 
A. from observed allocation (alloc_size): (LENGTH - SIZE_BUMP) * sizeof 
(int)
B. from observed access (TYPE): LENGTH * sizeof (int)
   */
   
  /* for MAXIMUM size in the whole object: currently, GCC always used the A.  */
  expect(__builtin_object_size(p->array, 0), (LENGTH - SIZE_BUMP) * 
sizeof(int));

  /* for MAXIMUM size in the sub-object: currently, GCC chose the smaller
 one among these two: B.  */
  expect(__builtin_object_size(p->array, 1), (LENGTH - SIZE_BUMP) * 
sizeof(int));

  return p;
}


int main ()
{
  struct fix *p, *q; 
  p = alloc_buf_more ();
  q = alloc_buf_less ();

  return 0;
}


When compile the above small testing case with upstream gcc with 
-fstrict-flex-array=1:

/home/opc/Install/latest/bin/gcc -O -fstrict-flex-arrays=1 t28.c
ok:  __builtin_object_size(p->array, 0) == 60
ok:  __builtin_object_size(p->array, 1) == 40
ok:  __builtin_object_size(p->array, 0) == 20
ok:  __builtin_object_size(p->array, 1) == 20


> 
>> }
>> where the returned size should be 1 and not sizeof (int).  The mode doesn't 
>> really matter in this case.
>> HTH.
>> Sid



Re: One question on the source code of tree-object-size.cc

2023-07-31 Thread Qing Zhao via Gcc-patches


> On Jul 31, 2023, at 2:23 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-07-31 14:13, Qing Zhao wrote:
>> Okay. I see.
>> Then if the size info from the TYPE is smaller than the size info from the 
>> malloc,
>>  then based on the current code, we use the smaller one between these two,
>>  i.e, the size info from the TYPE.  (Even for the OST_MAXIMUM).
>> Is such behavior correct?
> 
> Yes, it's correct even for OST_MAXIMUM.  The smaller one between the two is 
> the more precise estimate, which is why the mode doesn't matter.
> 
>> This is for the new “counted_by” attribute and how to use it in 
>> __builtin_dynamic_object_size.
>> for example:
>> ===
>> struct annotated {
>> size_t foo;
>> int array[] __attribute__((counted_by (foo)));
>> };
>> #define noinline __attribute__((__noinline__))
>> #define SIZE_BUMP 2
>> /* in the following function, malloc allocated more space than the value
>>of counted_by attribute.  Then what's the correct behavior we expect
>>the __builtin_dynamic_object_size should have?  */
>> static struct annotated * noinline alloc_buf (int index)
>> {
>>   struct annotated *p;
>>   p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int));
>>   p->foo = index;
>>   /*when checking the observed access p->array, we have info on both
>> observered allocation and observed access,
>> A. from observed allocation: (index + SIZE_BUMP) * sizeof (int)
>> B. from observed access: p->foo * sizeof (int)
>> in the above, p->foo = index.
>>*/
>>   /* for MAXIMUM size, based on the current code, we will use the size info 
>> from the TYPE,
>>  i.e, the “counted_by” attribute, which is the smaller one.   */
>>   expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int));
> 
> If the counted_by is less than what is allocated, it is the more correct 
> value to return because that's what the application asked for through the 
> attribute.  If the allocated size is less, we return the allocated size 
> because in that case, despite what the application said, the actual allocated 
> size is less and hence that's the safer value.

Thanks a lot for the clear explanation. This makes good sense.
> 
> In fact in the latter case it may even make sense to emit a warning because 
> it is more likely than not to be a bug.

Agreed. 

Here is a patch from Martin on a new similar warning (-Walloc-type):  
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625172.html. 

I guess that I will also need to issue warning for such cases for the new 
attribute “counted_by”.

Qing

> Thanks,
> Sid



Re: One question on the source code of tree-object-size.cc

2023-07-31 Thread Siddhesh Poyarekar

On 2023-07-31 14:13, Qing Zhao wrote:

Okay. I see.

Then if the size info from the TYPE is smaller than the size info from the 
malloc,
  then based on the current code, we use the smaller one between these two,
  i.e, the size info from the TYPE.  (Even for the OST_MAXIMUM).

Is such behavior correct?


Yes, it's correct even for OST_MAXIMUM.  The smaller one between the two 
is the more precise estimate, which is why the mode doesn't matter.




This is for the new “counted_by” attribute and how to use it in 
__builtin_dynamic_object_size.
for example:

===

struct annotated {
 size_t foo;
 int array[] __attribute__((counted_by (foo)));
};

#define noinline __attribute__((__noinline__))
#define SIZE_BUMP 2

/* in the following function, malloc allocated more space than the value
of counted_by attribute.  Then what's the correct behavior we expect
the __builtin_dynamic_object_size should have?  */

static struct annotated * noinline alloc_buf (int index)
{
   struct annotated *p;
   p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int));
   p->foo = index;

   /*when checking the observed access p->array, we have info on both
 observered allocation and observed access,
 A. from observed allocation: (index + SIZE_BUMP) * sizeof (int)
 B. from observed access: p->foo * sizeof (int)

 in the above, p->foo = index.
*/

   /* for MAXIMUM size, based on the current code, we will use the size info 
from the TYPE,
  i.e, the “counted_by” attribute, which is the smaller one.   */
   expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int));


If the counted_by is less than what is allocated, it is the more correct 
value to return because that's what the application asked for through 
the attribute.  If the allocated size is less, we return the allocated 
size because in that case, despite what the application said, the actual 
allocated size is less and hence that's the safer value.


In fact in the latter case it may even make sense to emit a warning 
because it is more likely than not to be a bug.


Thanks,
Sid


Re: One question on the source code of tree-object-size.cc

2023-07-31 Thread Qing Zhao via Gcc-patches
Hi, Sid,

Thanks a lot.

> On Jul 31, 2023, at 1:07 PM, Siddhesh Poyarekar  wrote:
> 
> On 2023-07-31 13:03, Siddhesh Poyarekar wrote:
>> On 2023-07-31 12:47, Qing Zhao wrote:
>>> Hi, Sid and Jakub,
>>> 
>>> I have a question in the following source portion of the routine 
>>> “addr_object_size” of gcc/tree-object-size.cc:
>>> 
>>>   743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
>>>   744   if (bytes != error_mark_node)
>>>   745 {
>>>   746   bytes = size_for_offset (var_size, bytes);
>>>   747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
>>> MEM_REF)
>>>   748 {
>>>   749   tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 
>>> 0),
>>>   750pt_var);
>>>   751   if (bytes2 != error_mark_node)
>>>   752 {
>>>   753   bytes2 = size_for_offset (pt_var_size, bytes2);
>>>   754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
>>>   755 }
>>>   756 }
>>>   757 }
>>> 
>>> At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or 
>>> not?
>>> Shall we use
>>> 
>>> (object_size_type & OST_MINIMUM
>>>  ? MIN_EXPR : MAX_EXPR)
>>> 
>> That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like 
>> this:
>> typedef struct
>> {
>>   int a;
>> } A;
>> size_t f()
>> {
>>   A *p = malloc (1);
>>   return __builtin_object_size (p, 0);
> 
> Correction, that should be __builtin_object_size (>a, 0)

Okay. I see.

Then if the size info from the TYPE is smaller than the size info from the 
malloc,
 then based on the current code, we use the smaller one between these two,
 i.e, the size info from the TYPE.  (Even for the OST_MAXIMUM). 

Is such behavior correct?

This is for the new “counted_by” attribute and how to use it in 
__builtin_dynamic_object_size. 
for example:

===

struct annotated {
size_t foo;
int array[] __attribute__((counted_by (foo)));
};

#define noinline __attribute__((__noinline__))
#define SIZE_BUMP 2 

/* in the following function, malloc allocated more space than the value
   of counted_by attribute.  Then what's the correct behavior we expect
   the __builtin_dynamic_object_size should have?  */

static struct annotated * noinline alloc_buf (int index)
{
  struct annotated *p;
  p = malloc(sizeof (*p) + (index + SIZE_BUMP) * sizeof (int));
  p->foo = index;

  /*when checking the observed access p->array, we have info on both
observered allocation and observed access, 
A. from observed allocation: (index + SIZE_BUMP) * sizeof (int)
B. from observed access: p->foo * sizeof (int)

in the above, p->foo = index.
   */

  /* for MAXIMUM size, based on the current code, we will use the size info 
from the TYPE, 
 i.e, the “counted_by” attribute, which is the smaller one.   */
  expect(__builtin_dynamic_object_size(p->array, 1), (p->foo) * sizeof(int));

  return p;
}


Is the above the correct behavior?

thanks.

Qing
> 
>> }
>> where the returned size should be 1 and not sizeof (int).  The mode doesn't 
>> really matter in this case.
>> HTH.
>> Sid



Re: One question on the source code of tree-object-size.cc

2023-07-31 Thread Siddhesh Poyarekar

On 2023-07-31 13:03, Siddhesh Poyarekar wrote:

On 2023-07-31 12:47, Qing Zhao wrote:

Hi, Sid and Jakub,

I have a question in the following source portion of the routine 
“addr_object_size” of gcc/tree-object-size.cc:


  743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
  744   if (bytes != error_mark_node)
  745 {
  746   bytes = size_for_offset (var_size, bytes);
  747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) 
== MEM_REF)

  748 {
  749   tree bytes2 = compute_object_offset (TREE_OPERAND 
(ptr, 0),

  750    pt_var);
  751   if (bytes2 != error_mark_node)
  752 {
  753   bytes2 = size_for_offset (pt_var_size, bytes2);
  754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
  755 }
  756 }
  757 }

At line 754, why we always use “MIN_EXPR” whenever it’s for 
OST_MINIMUM or not?

Shall we use

(object_size_type & OST_MINIMUM
 ? MIN_EXPR : MAX_EXPR)



That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations 
like this:


typedef struct
{
   int a;
} A;

size_t f()
{
   A *p = malloc (1);

   return __builtin_object_size (p, 0);


Correction, that should be __builtin_object_size (>a, 0)


}

where the returned size should be 1 and not sizeof (int).  The mode 
doesn't really matter in this case.


HTH.

Sid



Re: One question on the source code of tree-object-size.cc

2023-07-31 Thread Siddhesh Poyarekar

On 2023-07-31 12:47, Qing Zhao wrote:

Hi, Sid and Jakub,

I have a question in the following source portion of the routine 
“addr_object_size” of gcc/tree-object-size.cc:

  743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
  744   if (bytes != error_mark_node)
  745 {
  746   bytes = size_for_offset (var_size, bytes);
  747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
MEM_REF)
  748 {
  749   tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0),
  750pt_var);
  751   if (bytes2 != error_mark_node)
  752 {
  753   bytes2 = size_for_offset (pt_var_size, bytes2);
  754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
  755 }
  756 }
  757 }

At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or not?
Shall we use

(object_size_type & OST_MINIMUM
 ? MIN_EXPR : MAX_EXPR)



That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations 
like this:


typedef struct
{
  int a;
} A;

size_t f()
{
  A *p = malloc (1);

  return __builtin_object_size (p, 0);
}

where the returned size should be 1 and not sizeof (int).  The mode 
doesn't really matter in this case.


HTH.

Sid


One question on the source code of tree-object-size.cc

2023-07-31 Thread Qing Zhao via Gcc-patches
Hi, Sid and Jakub,

I have a question in the following source portion of the routine 
“addr_object_size” of gcc/tree-object-size.cc:

 743   bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
 744   if (bytes != error_mark_node)
 745 {
 746   bytes = size_for_offset (var_size, bytes);
 747   if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
MEM_REF)
 748 {
 749   tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0),
 750pt_var);
 751   if (bytes2 != error_mark_node)
 752 {
 753   bytes2 = size_for_offset (pt_var_size, bytes2);
 754   bytes = size_binop (MIN_EXPR, bytes, bytes2);
 755 }
 756 }
 757 }

At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or not? 
Shall we use 

(object_size_type & OST_MINIMUM
? MIN_EXPR : MAX_EXPR)

Instead?

Thanks a lot for the help.

Qing