> On Aug 17, 2023, at 3:59 PM, Siddhesh Poyarekar <siddh...@gotplt.org> wrote:
> 
> On 2023-08-17 15:27, Qing Zhao wrote:
>>> Yes, that's it.  Maybe it's more correct if instead of MAX_EXPR if for 
>>> OST_MINIMUM we stick with the early_objsz answer if it's non-zero.  I'm not 
>>> sure if that's the case for maximum size though, my gut says it isn't.
>> So, the major purpose for adding the early object size phase is for 
>> computing SUBobjects size more precisely before the subobject information 
>> lost?
> 
> I suppose it's more about being able to do it at all, rather than precision.

Without the subobject information in IR, our later object size phase uses the 
whole object size as an estimation as it currently does. -:)
> 
>> Then, I think whatever MIN or MAX, the early phase has more precise 
>> information than the later phase, we should use its result if it’s NOT 
>> UNKNOWN?
> 
> We can't be sure about that though, can we?  For example for something like 
> this:
> 
> struct S
> {
>  int a;
>  char b[10];
>  int c;
> };
> 
> size_t foo (struct S *s)
> {
>  return __builtin_object_size (s->b, 1);
> }
> 
> size_t bar ()
> {
>  struct S *in = malloc (8);
> 
>  return foo (in);
> }
> 
> returns 10 for __builtin_object_size in early_objsz but when it sees the 
> malloc in the later objsz pass, it returns 4:
> 
> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
> ...
> foo:
> .LFB0:
>       .cfi_startproc
>       movl    $10, %eax
>       ret
>       .cfi_endproc
> ...
> bar:
> .LFB1:
>       .cfi_startproc
>       movl    $4, %eax
>       ret
>       .cfi_endproc
> ...
> 
> In fact, this ends up returning the wrong result for OST_MINIMUM:
> 
> $ gcc/cc1 -fdump-tree-objsz-details -quiet -o - -O bug.c
> ...
> foo:
> .LFB0:
>       .cfi_startproc
>       movl    $10, %eax
>       ret
>       .cfi_endproc
> ...
> bar:
> .LFB1:
>       .cfi_startproc
>       movl    $10, %eax
>       ret
>       .cfi_endproc
> ...
> 
> bar ought to have returned 4 too (and I'm betting the later objsz must have 
> seen that) but it got overridden by the earlier estimate of 10.

Okay, I see. 

Then is this the similar issue we discussed previously?  (As following:)

"
> Hi, Sid and Jakub,
> I have a question in the following source portion of the routine 
> “addr_object_size” of gcc/tree-object-size.cc:
>  743       bytes = compute_object_offset (TREE_OPERAND (ptr, 0), var);
>  744       if (bytes != error_mark_node)
>  745         {
>  746           bytes = size_for_offset (var_size, bytes);
>  747           if (var != pt_var && pt_var_size && TREE_CODE (pt_var) == 
> MEM_REF)
>  748             {
>  749               tree bytes2 = compute_object_offset (TREE_OPERAND (ptr, 0),
>  750                                                    pt_var);
>  751               if (bytes2 != error_mark_node)
>  752                 {
>  753                   bytes2 = size_for_offset (pt_var_size, bytes2);
>  754                   bytes = size_binop (MIN_EXPR, bytes, bytes2);
>  755                 }
>  756             }
>  757         }
> At line 754, why we always use “MIN_EXPR” whenever it’s for OST_MINIMUM or 
> not?
> Shall we use
> (object_size_type & OST_MINIMUM
>                             ? MIN_EXPR : MAX_EXPR)

That MIN_EXPR is not for OST_MINIMUM.  It is to cater for allocations like this:

typedef struct
{
 int a;
} A;

size_t f()
{
 A *p = malloc (1);

 return __builtin_object_size (p, 0);
}

where the returned size should be 1 and not sizeof (int).  The mode doesn't 
really matter in this case.
“

If this is the same issue, I think we can use the same solution: always use 
MIN_EXPR, 
What do you think?

Qing

> 
> We probably need smarter heuristics on choosing between the estimate of the 
> early_objsz and late objsz passes each by itself isn't good enough for 
> subobjects.
> 
> Thanks,
> Sid

Reply via email to