On Wed, 2022-06-29 at 17:39 +0200, Tim Lange wrote:

> Hi,

Thanks for the updated patch.

Overall, looks nearly ready; various nits inline below, throughout...

> 
> I've addressed most of the points from the review.
> * The allocation size warning warns whenever region_model::get_capacity 
> returns
> something useful, i.e. also on statically-allocated regions.

Thanks.  Looks like you added test coverage for this in allocation-
size-5.c

> * I added a new virtual function to the pending-diagnostic class, so
that it
> is possible to emit a custom region creation description.
> * The test cases should have a better coverage now.
> * Conservative struct handling
> 
> The warning now looks like this:
> /path/to/main.c:9:8: warning: allocated buffer size is not a multiple of the 
> pointee's size [CWE-131] [-Wanalyzer-allocation-size]
>     9 |   int *iptr = ptr;
>       |        ^~~~
>   ‘main’: events 1-2
>     |
>     |    8 |   void *ptr = malloc((long unsigned int)n * sizeof(short));
>     |      |               ^~~~~~~~~~~~~~~~~~~~~~~~~
>     |      |               |
>     |      |               (1) allocated ‘(long unsigned int)n * 2’ bytes here
>     |    9 |   int *iptr = ptr;
>     |      |        ~~~~    
>     |      |        |
>     |      |        (2) assigned to ‘int *’ here; ‘sizeof(int)’ is ‘4’
>     |

Looks great.

> 
> /path/to/main.c:15:15: warning: allocated buffer size is not a multiple of 
> the pointee's size [CWE-131] [-Wanalyzer-allocation-size]
>    15 |   int *ptrw = malloc (sizeof (short));
>       |               ^~~~~~~~~~~~~~~~~~~~~~~
>   ‘main’: events 1-2
>     |
>     |   15 |   int *ptrw = malloc (sizeof (short));
>     |      |               ^~~~~~~~~~~~~~~~~~~~~~~
>     |      |               |
>     |      |               (1) allocated ‘2’ bytes here

Looks a bit weird to be quoting a number here; maybe whenever the
expression is just a constant, print it unquoted?  (though that could
be fiddly to implement, so can be ignored if it turns out to be) .


>     |      |               (2) assigned to ‘int *’ here; ‘sizeof (int)’ is ‘4’
>     |
> The only thing I couldn't address was moving the second event toward the lhs 
> or
> assign token here. I tracked it down till get_stmt_location where it seems 
> that
> the rhs is actually the location of the statement. Is there any way to get (2)
> to be focused on the lhs?

Annoyingly, we've lost a lot of location information by the time the
analyzer runs.

In theory we could special-case for when we have the def-stmt of the
SSA_NAME that's that default (i.e. initial) value of a VAR_DECL, and if
we see the write is there, we could use the DECL_SOUCE_LOCATION of the
VAR_DECL for the write, so that we'd get:

    |   15 |   int *ptrw = malloc (sizeof (short));
    |      |        ^~~~   ^~~~~~~~~~~~~~~~~~~~~~~
    |      |        |      |
    |      |        |      (1) allocated ‘2’ bytes here
    |      |        (2) assigned to ‘int *’ here; ‘sizeof (int)’ is ‘4’
    |

which is perhaps slightly more readable.  I'm not sure it's worth it
though.

> 
> Otherwise, the patch compiled coreutils, openssh, curl and httpd without any
> false-positive (but none of them contained a bug found by the checker either).

Great.

> `make check-gcc RUNTESTFLAGS="analyzer.exp"` tests pass and as I just worked 
> on
> the event splitting, the regression tests are yet to run.
> 
> - Tim
> 
> 
> This patch adds an checker that warns about code paths in which a buffer is
> assigned to a incompatible type, i.e. when the allocated buffer size is not a
> multiple of the pointee's size.
> 
> gcc/analyzer/ChangeLog:

You should add a reference to the RFE bug to the top of the ChangeLog entries:
          PR analyzer/105900

Please also add it to the commit message, in the form " [PR105900]";
see the examples section twoards the end of
https://gcc.gnu.org/contribute.html#patches


> 
>         * analyzer.opt: Added Wanalyzer-allocation-size.

[...snip...]

> 
> gcc/ChangeLog:

...and here

> 
>         * doc/invoke.texi: Added Wanalyzer-allocation-size.
> 
> gcc/testsuite/ChangeLog:

...and here

> 
>         * gcc.dg/analyzer/pr96639.c: Changed buffer size to omit warning.
>         * gcc.dg/analyzer/allocation-size-1.c: New test.
>         * gcc.dg/analyzer/allocation-size-2.c: New test.
>         * gcc.dg/analyzer/allocation-size-3.c: New test.
>         * gcc.dg/analyzer/allocation-size-4.c: New test.
>         * gcc.dg/analyzer/allocation-size-5.c: New test.
> 
> Signed-off-by: Tim Lange <m...@tim-lange.me>

[...snip...]

> diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
> index 4aea52d3a87..912def2faf2 100644
> --- a/gcc/analyzer/analyzer.opt
> +++ b/gcc/analyzer/analyzer.opt
> @@ -54,6 +54,10 @@ The minimum number of supernodes within a function for the 
> analyzer to consider
>  Common Joined UInteger Var(param_analyzer_max_enodes_for_full_dump) 
> Init(200) Param
>  The maximum depth of exploded nodes that should appear in a dot dump before 
> switching to a less verbose format.
>  
> +Wanalyzer-allocation-size
> +Common Var(warn_analyzer_allocation_size) Init(1) Warning
> +Warn about code paths in which a buffer is assigned to a incompatible type.

Reword "buffer" to "pointer to a buffer", I think.

"a incompatible" -> "an incompatible"

[...snip...]

> +/* Concrete subclass for casts of pointers that lead to trailing bytes.  */
> +
> +class dubious_allocation_size
> +: public pending_diagnostic_subclass<dubious_allocation_size>
> +{
> +public:
> +  dubious_allocation_size (const region *lhs, const region *rhs)
> +  : m_lhs (lhs), m_rhs (rhs), m_expr (NULL_TREE)
> +  {}

[...snip...]

> +  bool operator== (const dubious_allocation_size &other) const
> +  {
> +    return m_lhs == other.m_lhs && m_rhs == other.m_rhs;

Probably should also check that:
  same_tree_p (m_expr, other.m_expr);

[...snip...]

> +/* Return true on dubious allocation sizes for constant sizes.  */
> +
> +static bool
> +capacity_compatible_with_type (tree cst, tree pointee_size_tree,
> +                            bool is_struct)
> +{
> +  unsigned HOST_WIDE_INT pointee_size = TREE_INT_CST_LOW (pointee_size_tree);
> +  if (pointee_size == 0)
> +    return 0;

"false" rather than 0, given that this is bool.

[...snip...]

> +/* Return true if the lhs and rhs of an assignment have different types.  */
> +
> +static bool
> +is_any_cast_p (const gimple *stmt)
> +{
> +  if (const gassign *assign = dyn_cast<const gassign *>(stmt))
> +    return gimple_assign_cast_p (assign)
> +       || (gimple_num_ops (assign) == 2
> +           && !pending_diagnostic::same_tree_p (
> +                                 TREE_TYPE (gimple_assign_lhs (assign)),
> +                                 TREE_TYPE (gimple_assign_rhs1 (assign))));

The "== 2" subclause in the above condition doesn't look quite right to
me; what statements did you encounter that needed this?

[...snip...]

> @@ -9758,6 +9759,18 @@ By default, the analysis silently stops if the code is 
> too
>  complicated for the analyzer to fully explore and it reaches an internal
>  limit.  The @option{-Wanalyzer-too-complex} option warns if this occurs.
>  
> +@item -Wno-analyzer-allocation-size
> +@opindex Wanalyzer-allocation-size
> +@opindex Wno-analyzer-allocation-size
> +This warning requires @option{-fanalyzer}, which enables it; use
> +@option{-Wno-analyzer-allocation-size}
> +to disable it.
> +
> +This diagnostic warns for paths through the code in which a buffer is casted
> +to a type where the buffer size is not a multiple of the pointee size.

At the risk of bikeshedding, how about:

This diagnostic warns for paths through the code in which a pointer
is assigned to point at a buffer with a size that is not a multiple
of sizeof(*pointer).

See @url{https://cwe.mitre.org/data/definitions/131.html, CWE-131: Incorrect 
Calculation of Buffer Size}.


[...snip...]

> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c 
> b/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c
> new file mode 100644
> index 00000000000..02634ae883b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c
> @@ -0,0 +1,102 @@
> +#include <stdlib.h>
> +#include <stdio.h>
> +
> +/* Tests with constant buffer sizes.  */
> +
> +void test_1 (void)
> +{
> +  short *ptr = malloc (21 * sizeof (short));
> +  free (ptr);
> +}
> +
> +void test_2 (void)
> +{
> +  int *ptr = malloc (21 * sizeof (short)); /* { dg-line malloc2 } */
> +  free (ptr);
> +
> +  /* { dg-warning "" "" { target *-*-* } malloc2 } */
> +  /* { dg-message "" "" { target *-*-* } malloc2 } */

The various dg-warning and dg-message directives here (and throughout
the rest of the patch) shouldn't have just "" "" for their first two
args.

The first arg should be a regexp that matches some (nonempty) subset of
the expected text.  There's a balance to be struck between:
(a) terseness to avoid "gold plating" the test output (where making any
change to the wording would involve lots of tedious updates to test
directives)
versus
(b) giving us test coverage that the message is sane, so that if we
accidentally break the wording due to future changes to the analyzer,
then at least one test starts failing

Probably best for most of these regexps to be terse, but an empty
regexp is too terse.

The 2nd arg helps us disambiguate with directive we're talking about,
so can be "warning" and "note" for the two above.

[...snip...]

> +void test_5 (void)
> +{
> +  int user_input;
> +  scanf("%i", &user_input);
> +  int n;
> +  if (user_input == 0)
> +    n = 21 * sizeof (short);
> +  else
> +    n = 42 * sizeof (short);

I see you've used scanf, presumably to get a symbolic value for the
variable.  If so, a simpler way to do this is to simply use a parameter
to the test function.  But there's no need to change these test cases.

Perhaps scanf should taint its arguments, which is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106021 but obviously that
would be a different patch.

[...snip...]

> +void test_9(void) 
> +{
> +  // FIXME

Please make this comment more descriptive about what the issue here is.

> +  int *buf = create_buffer(42); /* { dg-warning "" "" { xfail *-*-*
} } */
> +  free (buf);
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c
b/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c
> new file mode 100644
> index 00000000000..cb35a9d717b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c

[...snip...]

> +void test_8(void) 
> +{
> +  int n;
> +  scanf("%i", &n);
> +  // FIXME

Make the comment more descriptive please.

> +  int *buf = create_buffer(n * sizeof(short)); /* { dg-warning "" ""
{ xfail *-*-* } } */
> +  free (buf);
> +}
> +
> +void test_9 (void)
> +{
> +  int n;
> +  scanf("%i", &n);
> +  /* n is a conjured_svalue.  */
> +  void *ptr = malloc (n); /* { dg-message } */
> +  int *iptr = (int *)ptr; /* { dg-line assign9 } */
> +  free (iptr);
> +
> +  /* { dg-warning "" "" { target *-*-* } assign9 } */
> +  /* { dg-message "" "" { target *-*-* } assign9 } */
> +}
> +
> +void test_11 (void)
> +{
> +  int n;
> +  scanf("%i", &n);
> +  void *ptr = malloc (n);
> +  if (n == 4)

Presumably this should be a test against sizeof (int), rather than 4?

Please add a testcase where the comparison is against the wrong
constant.

> +    {
> +      /* n is a conjured_svalue but guarded.  */
> +      int *iptr = (int *)ptr;
> +      free (iptr);
> +    }
> +  else
> +    free (ptr);
> +}

[...snip...]

> diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-5.c
b/gcc/testsuite/gcc.dg/analyzer/allocation-size-5.c
> new file mode 100644
> index 00000000000..afb1782e0cd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-5.c
> @@ -0,0 +1,40 @@
> +#include <stdlib.h>
> +#include <stdio.h>
> +
> +/* Tests related to structs.  */

Looks like this was copied-and-pasted, and should be updated to "Tests
of statically-sized buffers" or somesuch.

> +
> +typedef struct a {
> +  short s;
> +} a;
> +
> +int *test_1 (void)
> +{
> +  a A;
> +  A.s = 1;
> +  int *ptr = (int *) &A; /* { dg-line assign1 } */
> +  return ptr;
> +
> +  /* { dg-warning "" "" { target *-*-* } assign1 } */
> +  /* { dg-message "" "" { target *-*-* } assign1 } */
> +}
> +
> +int *test2 (void)
> +{
> +  char arr[4];

I think this needs to be sizeof(int), rather than 4.

> +  int *ptr = (int *)arr;
> +  return ptr;
> +}
> +
> +int *test3 (void)
> +{
> +  char arr[2];
> +  int *ptr = (int *)arr; /* { dg-line assign3 } */
> +  return ptr;
> +
> +  /* { dg-warning "" "" { target *-*-* } assign3 } */
> +  /* { dg-message "" "" { target *-*-* } assign3 } */
> +}
> +
> +int main() {
> +  return 0;
> +}

[...snip...]

Thanks again for the v2 patch; hope the above makes sense
Dave

Reply via email to