[PING][PATCH] improve validation of attribute arguments (PR c/78666)

Martin Sebor via Gcc-patches Wed, 09 Sep 2020 14:45:23 -0700

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552500.html


Aldy provided a bunch of comments on this patch but I'm still looking
for a formal approval.

Martin

On 8/24/20 10:45 AM, Martin Sebor wrote:

On 8/24/20 4:59 AM, Aldy Hernandez wrote:

On 8/21/20 1:37 AM, Martin Sebor wrote:
On 8/20/20 3:00 PM, Aldy Hernandez wrote:
Regardless, here are some random comments.
Thanks for the careful review!
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 37214831538..bc4f409e346 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -720,6 +725,124 @@ positional_argument (const_tree fntype,const_tree atname, tree pos,
   return pos;
 }
+/* Given a pair of NODEs for arbitrary DECLs or TYPEs, validateone or+ two integral or string attribute arguments NEWARGS to beapplied to+ NODE[0] for the absence of conflicts with the same attributearguments+ already applied to NODE[1]. Issue a warning for conflicts andreturn
+   false.  Otherwise, when no conflicts are found, return true.  */
+
+static bool
+validate_attr_args (tree node[2], tree name, tree newargs[2])
I think you're doing too much work in one function. Also, I*really* dislike sending pairs of objects in arrays, especially whenthey're called something so abstract as "node" and "newargs".
Would it be possible to make this function only validate one singleargument and call it twice? Or do we gain something by having it dotwo things at once?
I agree about the name "node."  The argument comes from the attribute
handlers: they all take something called a node as their first argument.
It's an array of three elements:
   [0] the current declaration or type
   [1] the previous declaration or type or null
   [2] the current declaration if [0] is a type
Ah, the rest of the functions are taking a node pointer. Your patchthrew me off because you use a node[2] instead of a node pointer likethe rest of the functions. Perhaps you should keep to the currentstyle and pass a node *.


It takes tree node[3] and the -Warray-parameter option (being
reviewed) uses the bound to check for out-of-bounds accesses, both
callers and the callee itself.  (C, but not C++, has special syntax
for this: tree node[static 3].)

validate_attr_args() is called with the same node as the handlers
and uses both node[0] and node[1] to recursively validate the first
one against itself and then against the second.  It could be changed
to take two arguments instead of an array (the new "node" and
the original "node," perhaps even under some better name).  That
would make it different from every handler but maybe that wouldn't
be a problem.

The newargs argument is also an array, with the second element
being optional.  Both elements are used and validated against
the attribute arguments on the same declaration first and again
on the previous one.  The array could be split up into two
distinct arguments, say newarg1 and newarg2, or passed in as
std::pair<tree, tree>.  I'm not sure I see much of a difference
between the approaches.

It looks like node[] carries all the information for the currentattribute and arguments, as well the same information for the previousattribute. Could your validate function just take:


validate_attr_args (tree *node, tree name)

That way you can save passing a pair of arguments, plus you can saveaccumulating said arguments in the handle_* functions.


Or is there something I'm missing here that makes this unfeasible?


If the function didn't the newargs array it would have to extract
the argument(s) from node, duplicating the work already done in
the callers.  I.e., figuring out how many arguments the attribute
expects (one or two, depending on the specific attribute), and for
handle_alloc_size_attribute, calling positional_argument (or at
a minimum default_conversion) to convert it to the expected value.
So it's feasible but doesn't seem like a good design.

/* Extract the same attribute from the previous declaration ortype. */
  tree prevattr = NULL_TREE;
  if (DECL_P (node[1]))
    {
      prevattr = DECL_ATTRIBUTES (node[1]);
      if (!prevattr)
    {
      tree type = TREE_TYPE (node[1]);
      prevattr = TYPE_ATTRIBUTES (type);
    }
    }
  else if (TYPE_P (node[1]))
    prevattr = TYPE_ATTRIBUTES (node[1]);
If all this section does is extract the attribute from a decl, itwould look cleaner if you abstracted out this functionality into itsown function. I'm a big fan of one function to do one thing.
  const char* const namestr = IDENTIFIER_POINTER (name);
  prevattr = lookup_attribute (namestr, prevattr);
  if (!prevattr)
    return true;
Perhaps a better name would be attribute_name_str?


Thanks for the suggestion but I think NAMESTR is good enough: it
should be clear enough from the function argument NAME that it
refers to the string representation of the NAME.  There also is
already a pre-existing use of NAMESTR elsewhere and so a precedent
for this choice.

  /* Extract one or both attribute arguments.  */
  tree prevargs[2];
  prevargs[0] = TREE_VALUE (TREE_VALUE (prevattr));
  prevargs[1] = TREE_CHAIN (TREE_VALUE (prevattr));
  if (prevargs[1])
    prevargs[1] = TREE_VALUE (prevargs[1]);
Does this only work with 2 arguments? What if the attribute takes 3or 4 arguments? We should make this generic enough to handle anynumber of arguments.


It only works with two arguments because one- and two-argument
attribute handlers are the only callers it's suitable for today.
If/when there is a use case for validating more than two arguments
in a general way the function will need to be extended.

At the moment, there are only a handful of attributes that take
three or more arguments: access, format, nonnull and optimize.
Most do their own extensive validation and the simple checking
done by this function isn't really suitable to them, in part
because they can be meaningfully applied to add attributes to
other arguments between one declaration and another.  I'm not
sure that generalizing validate_attr_args to try to also validate
these complex attributes would be an improvement over the status
quo.  Even if it could be done, it's more ambitious of a project
than what this simple change is trying to do.

  /* Both arguments must be equal or, for the second pair, neither must
     be provided to succeed.  */
  bool arg1eq, arg2eq;
  if (TREE_CODE (newargs[0]) == INTEGER_CST)
    {
      arg1eq = tree_int_cst_equal (newargs[0], prevargs[0]);
      if (newargs[1] && prevargs[1])
    arg2eq = tree_int_cst_equal (newargs[1], prevargs[1]);
      else
    arg2eq = newargs[1] == prevargs[1];
    }
  else if (TREE_CODE (newargs[0]) == STRING_CST)
    {
      const char *s0 = TREE_STRING_POINTER (newargs[0]);
      const char *s1 = TREE_STRING_POINTER (prevargs[0]);
      arg1eq = strcmp (s0, s1) == 0;
      if (newargs[1] && prevargs[1])
    {
      s0 = TREE_STRING_POINTER (newargs[1]);
      s1 = TREE_STRING_POINTER (prevargs[1]);
      arg2eq = strcmp (s0, s1) == 0;
    }
      else
    arg2eq = newargs[1] == prevargs[1];
    }


It looks like you're re-inventing operand_equal_p.


operand_equal_p does a lot more than just compare integer and string
constants for equality, so it has lots more overhead.  That said, I'm
very much in favor of code reuse even if comes with a (small) cost so
I tried this:

   /* Both arguments must be equal or, for the second pair, neither must
      be provided to succeed.  */
   bool arg1eq = newargs[0] == prevargs[0];
   if (!arg1eq)
     arg1eq = operand_equal_p (newargs[0], prevargs[0]);
   bool arg2eq = newargs[1] == prevargs[1];
   if (!arg2eq && newargs[1] && prevargs[1])
     arg2eq = operand_equal_p (newargs[1], prevargs[1]);
   if (arg1eq && arg2eq)
     return true;

Some testing revealed that the code has different semantics for
strings: it compares them including all nuls embedded in them,
while I think the right semantics for attributes is to only consider
strings up and including the first nul (i.e., apply the usual string
semantics).  A test case for this corner case is as follows:

   __attribute__ ((section ("foo\0bar"))) void f (void);
   __attribute__ ((section ("foo"))) void f (void) { }

Without operand_equal_p() GCC accepts this and puts f in section
foo.  With the operand_equal_p() change above, it complains:

ignoring attribute ‘section ("foo\x00bar")’ because it conflicts withprevious ‘section ("foor")’ [-Wattributes]


I would rather not change the behavior in this corner case just to
save a few lines of code.  If it's thought that string arguments
to attributes (some or all) should be interpreted differently than
in other contexts it seems that such a change should be made
separately and documented in the manual.

  /* If the two locations are different print a note pointing to
     the previous one.  */
  const location_t curloc = input_location;
  const location_t prevloc =
    DECL_P (node[1]) ? DECL_SOURCE_LOCATION (node[1]) : curloc;

  /* Format the attribute specification for convenience.  */
  char newspec[80], prevspec[80];
  if (newargs[1])
    snprintf (newspec, sizeof newspec, "%s (%s, %s)", namestr,
          print_generic_expr_to_str (newargs[0]),
          print_generic_expr_to_str (newargs[1]));
  else
    snprintf (newspec, sizeof newspec, "%s (%s)", namestr,
          print_generic_expr_to_str (newargs[0]));

  if (prevargs[1])
    snprintf (prevspec, sizeof prevspec, "%s (%s, %s)", namestr,
          print_generic_expr_to_str (prevargs[0]),
          print_generic_expr_to_str (prevargs[1]));
  else
    snprintf (prevspec, sizeof prevspec, "%s (%s)", namestr,
          print_generic_expr_to_str (prevargs[0]));

  if (warning_at (curloc, OPT_Wattributes,
          "ignoring attribute %qs because it conflicts "
          "with previous %qs",
          newspec, prevspec)
      && curloc != prevloc)
    inform (prevloc, "previous declaration here");

I think it would look cleaner if you separated the analysis from thediagnosing. So perhaps one function that checks if the attribute isvalid (returns true or false), and one function to display all thesewarnings.


It's common practice in GCC to do both in the same function:
validate operands and issue warnings for warnings for problems.
In fact, the vast majority of attribute handlers do three things
in the same function: parse, validate, and apply attributes.  It
can be helpful to split up big functions the way you suggest but
I think this one is fine the way it is.

Speak of which, I don't know what the consensus was, but if weabsolutely know that a re-declaration of an attribute for a decl isinvalid, I think it should be a hard error.
Is there any case where having conflicting section attributes for adecl shouldn't merit an error?


I can't think of one.  The manual seems clear that an error should
be expected "for incorrect use of supported attributes."  Despite
that, there is no consistency in how these things are handled.
Some attribute handlers issue -Wattributes, others give hard errors,
some one or the other under different conditions.

My opinion is that there should be at least two distinct kinds of
diagnostics for attributes, perhaps three: one for the use of
unknown attributes, another for invalid values of arguments to
known attributes, and may also one for invalid syntax (forms,
types, or numbers of arguments) of known attributes.  At some
point I'd like to implement and propose this approach but I don't
think this change is the right time for it.

Attached is another revision of this patch with the new helper
function you suggested.  Is this version okay to commit?

Martin

[PING][PATCH] improve validation of attribute arguments (PR c/78666)

Reply via email to