> On Aug 18, 2023, at 12:00 PM, Qing Zhao via Gcc-patches 
> <gcc-patches@gcc.gnu.org> wrote:
> 
> 
> 
>> On Aug 17, 2023, at 5:32 PM, Siddhesh Poyarekar <siddh...@gotplt.org> wrote:
>> 
>> On 2023-08-17 17:25, Qing Zhao wrote:
>>>> It's not exactly the same issue, the earlier discussion was about choosing 
>>>> sizes in the same pass while the current one is about choosing between 
>>>> passes, but I agree it "rhymes".  This is what I was alluding to 
>>>> originally (for OST_MINIMUM use MIN_EXPR if both passes returned a pass) 
>>>> but I haven't thought about it hard enough to be 100% confident that it's 
>>>> the better solution, especially for OST_MAXIMUM.
>>> We have two different sources to get SIZE information for the subobject:
>>> 1. From the TYPESIZE information embedded in the IR;
>>> 2. From the initialization information propagated from data flow, this 
>>> includes both malloc call and the DECL_INIT.
>>> We need to choose between these two when both available, (these two 
>>> information could be
>>> in the same pass as we discussed before, or in different passes which is 
>>> shown in this discussion).
>>> I think that the MIN_EXPR might be the right choice (especially for 
>>> OST_MAXIMUM) -:)
>> 
>> It's worth a shot I guess.  We could emit something like the following in 
>> early_object_sizes_execute_one:
>> 
>> sz = (__bos(o->sub, ost) == unknown
>>       ? early_size
>>       : MIN_EXPR (__bos(o->sub, ost), early_size));
>> 
>> and see if it sticks.
> 
> I came up with the following change for tree-object-size.cc:
> 
> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
> index a62af0500563..e1b2008c6dcc 100644
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -2016,10 +2016,22 @@ do_valueize (tree t)
>   return t;
> }
> 
> -/* Process a __builtin_object_size or __builtin_dynamic_object_size call in
> -   CALL early for subobjects before any object information is lost due to
> -   optimization.  Insert a MIN or MAX expression of the result and
> -   __builtin_object_size at I so that it may be processed in the second pass.
> +/* Process a __builtin_object_size or __builtin_dynamic_object_size call
> +   early for subobjects before any object information is lost due to
> +   optimization.
> +
> +   We have two different sources to get the size information for subobjects:
> +   A. The TYPE information of the subobject in the IR;
> +   B. The initialization information propagated through data flow.
> +   In the early pass, only A is available.
> +   B might be available in the second pass.
> +
> +   If both A and B are available, we should choose the minimum one between
> +   these two.
> +
> +   Insert a MIN expression of the result from the early pass and the original
> +   __builtin_object_size call at I so that it may be processed in the second 
> pass.
> +
>    __builtin_dynamic_object_size is treated like __builtin_object_size here
>    since we're only looking for constant bounds.  */
> 
> @@ -2036,7 +2048,7 @@ early_object_sizes_execute_one (gimple_stmt_iterator 
> *i, gimple *call)
>   unsigned HOST_WIDE_INT object_size_type = tree_to_uhwi (ost);
>   tree ptr = gimple_call_arg (call, 0);
> 
> -  if (object_size_type != 1 && object_size_type != 3)
> +  if (object_size_type & OST_SUBOBJECT == 0)
>     return;
> 
>   if (TREE_CODE (ptr) != ADDR_EXPR && TREE_CODE (ptr) != SSA_NAME)
> @@ -2050,9 +2062,8 @@ early_object_sizes_execute_one (gimple_stmt_iterator 
> *i, gimple *call)
> 
>   tree tem = make_ssa_name (type);
>   gimple_call_set_lhs (call, tem);
> -  enum tree_code code = object_size_type & OST_MINIMUM ? MAX_EXPR : MIN_EXPR;
>   tree cst = fold_convert (type, bytes);
> -  gimple *g = gimple_build_assign (lhs, code, tem, cst);
> +  gimple *g = gimple_build_assign (lhs, MIN_EXPR, tem, cst);
>   gsi_insert_after (i, g, GSI_NEW_STMT);
>   update_stmt (call);
> }
> 
> Let me know if you see any issue with the change.

I tested the above change, everything is fine except one testing case in 
gcc.dg/builtin-object-size-4.c

I reduced the failed case to the following small one:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Tests for strdup/strndup.  */
size_t
__attribute__ ((noinline))
test9 (void)
{
  const char *ptr = "abcdefghijklmnopqrstuvwxyz";
  char *res = strndup (ptr, 21);
  int n = 0;
  if ((n = __builtin_object_size (res, 3)) != 22)
    printf("FAIL is %d\n", n);

  free (res);
}

int
main (void)
{
  test9 ();
}
[opc@qinzhao-ol8u3-x86 gcc]$ sh t
FAIL is 1

I debugged into tree-object-size.cc, the routine “strdup_object_size”, and have 
two questions on two places:

1. For the following:

 844   /* In all other cases, return the size of SRC since the object size 
cannot
 845      exceed that.  We cannot do this for OST_MINIMUM unless SRC points 
into a
 846      string constant since otherwise the object size could go all the way 
down
 847      to zero.  */
…
 864       /* For maximum estimate, our next best guess is the object size of 
the
 865          source.  */
 866       if (size_unknown_p (sz, object_size_type)
 867           && !(object_size_type & OST_MINIMUM))
 868         compute_builtin_object_size (src, object_size_type, &sz);

I still don’t understand why for OST_MINIMUM, we cannot call 
“compute_builtin_object_size” at line 868?

2. For the following:

 871   /* String duplication allocates at least one byte, so we should never 
fail
 872      for OST_MINIMUM.  */
 873   if ((!size_valid_p (sz, object_size_type)
 874        || size_unknown_p (sz, object_size_type))
 875       && (object_size_type & OST_MINIMUM))
 876     sz = size_one_node;

I checked the doc for strdup/strndup, cannot find anyplace mentioning the 
routine returns at least one byte.

Where the above come from?

thanks.

Qing


> thanks.
> 
> Qing
> 
>> 
>> Thanks,
>> Sid

Reply via email to