+josmy...@redhat.com

On Sun, Jul 20, 2025 at 4:20 AM <mo...@google.com> wrote:
>
> From: Bill Wendling <mo...@google.com>
>
> Also, this code doesn't go further than parsing. I.e., it doesn't generate the
> internal gimple code that accesses the struct fields. The code is meant to 
> show
> that it *is* possible to perform a delayed parsing with no "double parsing" 
> and
> still be performant.
>
> Minor Nomenclature Change
> -------------------------
>
> I (try to) use "bounds safety" instead of "counted_by" in the code. This is
> because Clang has many more bounds safety attributes in the pipeline and
> "bounds safety" seems like a better overall name for the set of attributes.
>
> Token-Based Delayed Parsing
> ---------------------------
>
> The solution introduces a token capture and delayed parsing mechanism that
> stores tokens during initial parsing and performs the actual parse later when
> the struct context is available. This isn't a novel idea, as the OMP code does
> something similar for directives.
>
> After storing the tokens, the parsing continues as if the expression was 
> parsed
> (but it hasn't been yet). The parsing is delayed until the
> 'verify_counted_by_attribute ()' call, which performs the actual parse and
> reports any semantic errors. (The actual parse is done simply by creating a 
> new
> 'c_parser' object and filling it with the delayed tokens.)
>
> On a successful parse, the 'C_TOKEN_VEC' tree node is converted into a tree
> node holding an expression. The token vector is freed afterwards.
>
> If this approach is deemed worthy, I would like to incorporate the "single
> identifier" parsing into the delayed parsing mechanism (a single identifier is
> just a simple expression).
>
> RFC
> ---
>
> - Is this something that GCC is interested in to get us past this hurdle
>   that's delaying the upstreaming of more bounds safety checks?
> - Will this solve the issues at hand? (For this implementation, I'm
>   specifically focusing struct field attributes. I haven't thought much about
>   parameter list attributes.)
> - I think there's a layering violation. The 'c-decl.cc' "verifies" the
>   counted_by argument, but it's in essence performing an identifier 
> resolution,
>   which is something the parser should be doing. Perhaps that code should be
>   moved into "finish_struct ()", or somewhere else in "c/c-parser.cc"?
> - My GCC-fu isn't strong and I may have abused some parts of the parser in
>   unholy ways. (I blame beer.)
> - Anything else you'd like to say or recommend.
>
> Disclaimer
> ----------
>
> No AI's were harmed in the making of this RFC, though one did have its 
> feelings
> hurt.
>
> Share and enjoy!
> -bw
>
>
> [1] 
> https://discourse.llvm.org/t/rfc-bounds-safety-in-c-syntax-compatibility-with-gcc/85885
>
> ---
> To: gcc-patches@gcc.gnu.org
> Cc: Kees Cook <k...@kernel.org>
> Cc: Nick Desaulniers <nick.desaulni...@gmail.com>
> Cc: Martin Uecker <uec...@tugraz.at>
> Cc: Qing Zhao <qing.z...@oracle.com>
> Cc: Jakub Jelinek <ja...@redhat.com>
> Cc: Richard Biener <richard.guent...@gmail.com>
> Cc: Yeoul Na <yeoul...@apple.com>
> Cc: John McCall <rjmcc...@apple.com>
> Cc: Aaron Ballman <aa...@aaronballman.com>
> Cc: Justin Stitt <justinst...@google.com>
>
> v1: - Initial RFC submitted.
>     - Kees Cook's Linus asbestos suit borrowed.
> ---
>  gcc/attribs.cc            |   8 ++-
>  gcc/c-family/c-attribs.cc |  37 +++++++++---
>  gcc/c/c-decl.cc           |  97 ++++++++++++++++++++++--------
>  gcc/c/c-parser.cc         | 123 ++++++++++++++++++++++++++++++++++++--
>  gcc/c/c-parser.h          |   1 +
>  gcc/c/c-typeck.cc         |  21 +++++++
>  6 files changed, 246 insertions(+), 41 deletions(-)
>
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index 3fce9d625256..74001889c72a 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -752,8 +752,10 @@ decl_attributes (tree *node, tree attributes, int flags,
>         }
>        else
>         {
> -         int nargs = list_length (args);
> -         if (nargs < spec->min_length
> +          int nargs = args && TREE_CODE (args) == C_TOKEN_VEC
> +                          ? 1
> +                          : list_length (args);
> +          if (nargs < spec->min_length
>               || (spec->max_length >= 0
>                   && nargs > spec->max_length))
>             {
> @@ -771,7 +773,7 @@ decl_attributes (tree *node, tree attributes, int flags,
>                         spec->min_length, spec->max_length, nargs);
>               continue;
>             }
> -       }
> +        }
>        gcc_assert (is_attribute_p (spec->name, name));
>
>        if (spec->decl_required && !DECL_P (*anode))
> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> index 1f4a0df12051..0e8c17440b9a 100644
> --- a/gcc/c-family/c-attribs.cc
> +++ b/gcc/c-family/c-attribs.cc
> @@ -2887,7 +2887,14 @@ handle_counted_by_attribute (tree *node, tree name,
>                              bool *no_add_attrs)
>  {
>    tree decl = *node;
> -  tree argval = TREE_VALUE (args);
> +  tree argval;
> +
> +  /* Handle both C_TOKEN_VEC and regular argument formats.  */
> +  if (TREE_CODE (args) == C_TOKEN_VEC)
> +    argval = args;
> +  else
> +    argval = TREE_VALUE (args);
> +
>    tree old_counted_by = lookup_attribute ("counted_by", DECL_ATTRIBUTES 
> (decl));
>
>    /* This attribute is not supported in C++.  */
> @@ -2922,11 +2929,13 @@ handle_counted_by_attribute (tree *node, tree name,
>                 " array member field", name);
>        *no_add_attrs = true;
>      }
> -  /* The argument should be an identifier.  */
> -  else if (TREE_CODE (argval) != IDENTIFIER_NODE)
> +  /* The argument should be an identifier or a C_TOKEN_VEC for complex
> +     expressions.  */
> +  else if (TREE_CODE (argval) != IDENTIFIER_NODE
> +           && TREE_CODE (argval) != C_TOKEN_VEC)
>      {
>        error_at (DECL_SOURCE_LOCATION (decl),
> -               "%<counted_by%> argument is not an identifier");
> +                "%<counted_by%> argument is not an identifier or 
> expression");
>        *no_add_attrs = true;
>      }
>    /* Issue error when there is a counted_by attribute with a different
> @@ -2934,14 +2943,28 @@ handle_counted_by_attribute (tree *node, tree name,
>    else if (old_counted_by != NULL_TREE)
>      {
>        tree old_fieldname = TREE_VALUE (TREE_VALUE (old_counted_by));
> -      if (strcmp (IDENTIFIER_POINTER (old_fieldname),
> -                 IDENTIFIER_POINTER (argval)) != 0)
> -       {
> +      /* Only check conflicts for simple identifiers; complex expressions
> +         will be validated during struct completion.  */
> +      if (TREE_CODE (argval) == IDENTIFIER_NODE
> +          && TREE_CODE (old_fieldname) == IDENTIFIER_NODE
> +          && strcmp (IDENTIFIER_POINTER (old_fieldname),
> +                     IDENTIFIER_POINTER (argval))
> +                 != 0)
> +        {
>           error_at (DECL_SOURCE_LOCATION (decl),
>                     "%<counted_by%> argument %qE conflicts with"
>                     " previous declaration %qE", argval, old_fieldname);
>           *no_add_attrs = true;
>         }
> +      else if (TREE_CODE (argval) == C_TOKEN_VEC
> +               || TREE_CODE (old_fieldname) == C_TOKEN_VEC)
> +        {
> +          /* For complex expressions, defer conflict checking to struct
> +             completion.  */
> +          warning_at (DECL_SOURCE_LOCATION (decl), OPT_Wattributes,
> +                      "multiple %<counted_by%> attributes on the same field; 
> "
> +                      "only the last one will be used");
> +        }
>      }
>
>    return NULL_TREE;
> diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
> index acbe2b88e674..615d0f5a1e88 100644
> --- a/gcc/c/c-decl.cc
> +++ b/gcc/c/c-decl.cc
> @@ -9445,43 +9445,90 @@ verify_counted_by_attribute (tree struct_type, tree 
> field_decl)
>    if (!attr_counted_by)
>      return;
>
> -  /* If there is an counted_by attribute attached to the field,
> +  /* If there is a counted_by attribute attached to the field,
>       verify it.  */
>
> -  tree fieldname = TREE_VALUE (TREE_VALUE (attr_counted_by));
> +  tree argument;
> +  tree parsed_expr = NULL_TREE;
>
> -  /* Verify the argument of the attrbute is a valid field of the
> -     containing structure.  */
> -
> -  tree counted_by_field = lookup_field (struct_type, fieldname);
> -
> -  /* Error when the field is not found in the containing structure and
> -     remove the corresponding counted_by attribute from the field_decl.  */
> -  if (!counted_by_field)
> -    {
> -      error_at (DECL_SOURCE_LOCATION (field_decl),
> -               "argument %qE to the %<counted_by%> attribute"
> -               " is not a field declaration in the same structure"
> -               " as %qD", fieldname, field_decl);
> -      DECL_ATTRIBUTES (field_decl)
> -       = remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
> -    }
> +  /* Handle both C_TOKEN_VEC and regular argument formats */
> +  tree args = TREE_VALUE (attr_counted_by);
> +  if (TREE_CODE (args) == C_TOKEN_VEC)
> +    argument = args; /* Direct C_TOKEN_VEC */
>    else
> -  /* Error when the field is not with an integer type.  */
> +    argument = TREE_VALUE (args); /* Standard tree list format */
> +
> +  /* Handle both simple identifiers and complex expressions.  */
> +  if (TREE_CODE (argument) == IDENTIFIER_NODE)
>      {
> +      /* Simple identifier case - existing logic.  */
> +      tree counted_by_field = lookup_field (struct_type, argument);
> +
> +      if (!counted_by_field)
> +        {
> +          error_at (DECL_SOURCE_LOCATION (field_decl),
> +                    "argument %qE to the %<counted_by%> attribute"
> +                    " is not a field declaration in the same structure"
> +                    " as %qD",
> +                    argument, field_decl);
> +          DECL_ATTRIBUTES (field_decl)
> +              = remove_attribute ("counted_by", DECL_ATTRIBUTES 
> (field_decl));
> +          return;
> +        }
> +
> +      /* Find the actual field declaration.  */
>        while (TREE_CHAIN (counted_by_field))
>         counted_by_field = TREE_CHAIN (counted_by_field);
>        tree real_field = TREE_VALUE (counted_by_field);
>
>        if (!INTEGRAL_TYPE_P (TREE_TYPE (real_field)))
>         {
> -         error_at (DECL_SOURCE_LOCATION (field_decl),
> -                   "argument %qE to the %<counted_by%> attribute"
> -                   " is not a field declaration with an integer type",
> -                   fieldname);
> -         DECL_ATTRIBUTES (field_decl)
> +          error_at (DECL_SOURCE_LOCATION (field_decl),
> +                    "argument %qE to the %<counted_by%> attribute"
> +                    " is not a field declaration with an integer type",
> +                    argument);
> +          DECL_ATTRIBUTES (field_decl)
> +              = remove_attribute ("counted_by", DECL_ATTRIBUTES 
> (field_decl));
> +          return;
> +        }
> +    }
> +  else if (TREE_CODE (argument) == C_TOKEN_VEC)
> +    {
> +      /* Complex expression case - parse the stored tokens.  */
> +      parsed_expr
> +          = c_parse_token_vec_expression_with_struct (argument, struct_type);
> +
> +      if (!parsed_expr)
> +        {
> +          error_at (DECL_SOURCE_LOCATION (field_decl),
> +                    "invalid expression in %<counted_by%> attribute");
> +          DECL_ATTRIBUTES (field_decl)
>             = remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
> -       }
> +          return;
> +        }
> +
> +      /* Validate that the expression result is integer type.  */
> +      if (!INTEGRAL_TYPE_P (TREE_TYPE (parsed_expr)))
> +        {
> +          error_at (DECL_SOURCE_LOCATION (field_decl),
> +                    "expression in %<counted_by%> attribute does not"
> +                    " have integer type");
> +          DECL_ATTRIBUTES (field_decl)
> +              = remove_attribute ("counted_by", DECL_ATTRIBUTES 
> (field_decl));
> +          return;
> +        }
> +
> +      /* If we successfully parsed the expression, replace the C_TOKEN_VEC
> +         with the parsed expression for later use.  */
> +      TREE_VALUE (attr_counted_by) = build_tree_list (NULL_TREE, 
> parsed_expr);
> +    }
> +  else
> +    {
> +      error_at (DECL_SOURCE_LOCATION (field_decl),
> +                "invalid argument to %<counted_by%> attribute");
> +      DECL_ATTRIBUTES (field_decl)
> +          = remove_attribute ("counted_by", DECL_ATTRIBUTES (field_decl));
> +      return;
>      }
>  }
>
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index 5119841a5895..a9c641151d39 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -5623,6 +5623,30 @@ c_parser_attribute_arguments (c_parser *parser, bool 
> takes_identifier,
>    return attr_args;
>  }
>
> +static bool
> +c_parser_bounds_safety_attr (tree attr_name)
> +{
> +  const char *name = IDENTIFIER_POINTER (attr_name);
> +  return !strcmp (name, "counted_by");
> +}
> +
> +/* Check if the bounds safety attribute needs complex expression parsing.  */
> +static bool
> +c_parser_needs_complex_bounds_parsing (c_parser *parser)
> +{
> +  /* Look ahead to see if this is a simple identifier or a complex
> +     expression.  */
> +  c_token *tok = c_parser_peek_token (parser);
> +  if (tok->type != CPP_NAME)
> +    return true;
> +
> +  c_token *next = c_parser_peek_2nd_token (parser);
> +  if (next->type == CPP_CLOSE_PAREN)
> +    return false;
> +
> +  return true;
> +}
> +
>  /* Parse (possibly empty) gnu-attributes.  This is a GNU extension.
>
>     gnu-attributes:
> @@ -5692,12 +5716,56 @@ c_parser_gnu_attribute (c_parser *parser, tree attrs,
>      }
>    c_parser_consume_token (parser);
>
> -  tree attr_args
> -    = c_parser_attribute_arguments (parser,
> -                                   attribute_takes_identifier_p (attr_name),
> -                                   false,
> -                                   is_attribute_p ("assume", attr_name),
> -                                   true);
> +  tree attr_args;
> +  if (!c_parser_bounds_safety_attr (attr_name)
> +      || !c_parser_needs_complex_bounds_parsing (parser))
> +    attr_args = c_parser_attribute_arguments (
> +        parser, attribute_takes_identifier_p (attr_name), false,
> +        is_attribute_p ("assume", attr_name), true);
> +  else
> +    {
> +      unsigned int n = 1;
> +      unsigned int paren_count = 0;
> +
> +      for (; true; ++n)
> +        {
> +          c_token *tok = c_parser_peek_nth_token (parser, n);
> +
> +          /* Safety check to avoid infinite loops.  */
> +          if (tok->type == CPP_EOF)
> +            {
> +              c_parser_error (parser, "expected identifier or expression");
> +              return error_mark_node;
> +            }
> +
> +          if (tok->type == CPP_CLOSE_PAREN)
> +            {
> +              if (!paren_count)
> +                break;
> +              paren_count--;
> +            }
> +          else if (tok->type == CPP_OPEN_PAREN)
> +            paren_count++;
> +          else if (tok->type == CPP_SEMICOLON)
> +            {
> +              c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> +                                         "expected identifier or 
> expression");
> +              return error_mark_node;
> +            }
> +        }
> +
> +      vec<c_token, va_gc> *v;
> +      vec_alloc (v, n - 1);
> +      for (--n; n; --n)
> +        {
> +          c_token *tok = c_parser_peek_token (parser);
> +          v->quick_push (*tok);
> +          c_parser_consume_token (parser);
> +        }
> +
> +      attr_args = make_node (C_TOKEN_VEC);
> +      C_TOKEN_VEC_TOKENS (attr_args) = v;
> +    }
>
>    attr = build_tree_list (attr_name, attr_args);
>    if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
> @@ -27746,6 +27814,49 @@ c_maybe_parse_omp_decl (tree decl, tree d)
>    return true;
>  }
>
> +/* Context for parsing counted_by expressions within struct scope.  */
> +tree bounds_safety_attr_struct_type = NULL_TREE;
> +
> +/* Parse an expression from a C_TOKEN_VEC.  This is used for delayed
> +   parsing of counted_by attribute expressions.  */
> +
> +tree
> +c_parse_token_vec_expression (tree token_vec)
> +{
> +  vec<c_token, va_gc> *tokens = C_TOKEN_VEC_TOKENS (token_vec);
> +
> +  if (!tokens || vec_safe_is_empty (tokens))
> +    return NULL_TREE;
> +
> +  /* Create a parser for just the list of tokens.  */
> +  c_parser delay_parser = {};
> +  delay_parser.tokens = tokens->address ();
> +  delay_parser.tokens_avail = tokens->length ();
> +  delay_parser.raw_tokens = tokens;
> +
> +  /* Parse the expression.  */
> +  c_expr expr = c_parser_expr_no_commas (&delay_parser, NULL);
> +
> +  if (expr.value == error_mark_node)
> +    return NULL_TREE;
> +
> +  return expr.value;
> +}
> +
> +/* Parse an expression from a C_TOKEN_VEC with struct context.  */
> +
> +tree
> +c_parse_token_vec_expression_with_struct (tree token_vec, tree struct_type)
> +{
> +  tree saved_struct_type = bounds_safety_attr_struct_type;
> +  bounds_safety_attr_struct_type = struct_type;
> +
> +  tree result = c_parse_token_vec_expression (token_vec);
> +
> +  bounds_safety_attr_struct_type = saved_struct_type;
> +  return result;
> +}
> +
>  /* OpenMP 4.0:
>     # pragma omp declare target new-line
>     declarations and definitions
> diff --git a/gcc/c/c-parser.h b/gcc/c/c-parser.h
> index a84779bcbf83..577644cef13a 100644
> --- a/gcc/c/c-parser.h
> +++ b/gcc/c/c-parser.h
> @@ -205,5 +205,6 @@ extern void c_parser_declspecs (c_parser *, struct 
> c_declspecs *, bool, bool,
>                                 enum c_lookahead_kind);
>  extern struct c_type_name *c_parser_type_name (c_parser *, bool = false);
>  extern bool c_maybe_parse_omp_decl (tree, tree);
> +extern tree c_parse_token_vec_expression_with_struct (tree, tree);
>
>  #endif
> diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
> index f161bd9d0e74..788c0fe3763f 100644
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -56,6 +56,9 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-pretty-print-markup.h"
>  #include "gcc-urlifier.h"
>
> +/* Forward declaration for bounds safety attribute context.  */
> +extern tree bounds_safety_attr_struct_type;
> +
>  /* Possible cases of implicit conversions.  Used to select diagnostic 
> messages
>     and control folding initializers in convert_for_assignment.  */
>  enum impl_conv {
> @@ -3528,6 +3531,24 @@ build_external_ref (location_t loc, tree id, bool fun, 
> tree *type)
>      return error_mark_node;
>    else
>      {
> +      /* Check if we're parsing a counted_by expression and can resolve
> +         the identifier as a struct field.  */
> +      if (bounds_safety_attr_struct_type)
> +        {
> +          tree field = lookup_field (bounds_safety_attr_struct_type, id);
> +          if (field)
> +            {
> +              /* We found a field - create a reference to it */
> +              while (TREE_CHAIN (field))
> +                field = TREE_CHAIN (field);
> +              tree real_field = TREE_VALUE (field);
> +
> +              ref = real_field;
> +              *type = TREE_TYPE (ref);
> +              return ref;
> +            }
> +        }
> +
>        undeclared_variable (loc, id);
>        return error_mark_node;
>      }

Reply via email to