Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

Martin Sebor Wed, 18 Jan 2017 09:46:22 -0800

On 01/18/2017 01:10 AM, Jakub Jelinek wrote:

On Tue, Jan 17, 2017 at 10:59:43PM -0700, Jeff Law wrote:

I agree that breaking those applications would be bad.  It could
be dealt with by adding an option to let them disable the insertion
of the trap.  With the warning, programmers would get a heads up
that their (already dubious) code won't work otherwise.  I don't
think it's a necessary or wise to have the default mode be the most
permissive (and most dangerous) and expect users to tweak options
to make it safe.  Rather, I would argue that it should be the other
way around.  Make the default safe and strict and let the advanced
users who know how deal with the risks tweak those options.

I still come back to the assertion that changing this loop to a mem* is
fundamentally the wrong thing to do as it changes something that has well
defined semantics to something that is invalid.


Thus the transformation into a mem* call is invalid.


The mem* call is as valid as the loop, it will work exactly the same.


The problem here isn't the loop itself or the memset call but the
bounds computed from the inputs.

In the test case submitted with the bug the inputs supplied by
the program are well within a valid range, yet the bounds of
the loop that are computed by GCC from those inputs are excessive
because they are based on impossible ranges (or an incomplete view
of the program).

There is no unbounded loop in

  void f(std::vector<int> &v)
  {
    size_t n = v.size ();

    if (n > 1 && n < 5)
      v.resize (n - 1);
  }

yet GCC synthesizes one:

  void f(std::vector<int>&) (struct vector & v)
  {
    ...
    <bb 2> [100.00%]:
    _7 = MEM[(int * *)v_5(D)];             // v._M_impl._M_start
    _8 = MEM[(int * *)v_5(D) + 8B];        // v._M_impl._M_finish
    _9 = (long int) _8;
    _10 = (long int) _7;
    _11 = _9 - _10;
    _12 = _11 /[ex] 4;
    _13 = (long unsigned int) _12;         // v.size()
    _1 = _13 + 18446744073709551614;       // v.size() - 4
    if (_1 <= 2)                           // if (v.size() <= 6)
      goto <bb 3>; [36.64%]
    else
      goto <bb 9>; [63.36%]

    <bb 3> [36.64%]:
    _2 = _13 + 18446744073709551615;       // v.size() - 1
    if (_2 > _13)                          // if (v.size() == 0)
      goto <bb 4>; [29.56%]                //   this can't happen
    else
      goto <bb 7>; [70.44%]

    <bb 4> [5.85%]:
    _24 = v_5(D)->D.15828._M_impl._M_end_of_storage;
    _25 = (long int) _24;
    _28 = _25 - _9;                        // bytes left
    if (_28 == -4)                         // this can't happen
      goto <bb 5>; [67.00%]
    else
      goto <bb 6>; [33.00%]

    <bb 5> [3.92%]:
    __builtin_memset (_8, 0, 18446744073709551612);   // (size_t)-4
    ...

    <bb 6> [1.93%]:
    std::__throw_length_error ("vector::_M_default_append");

If you have say on 32-bit target
#include <stdlib.h>

int
main ()
{
  char *p = malloc (3U * 1024 * 1024 * 1024);
  if (p == NULL)
    return 0;
  size_t i;
  for (i = 0; i < 3U * 1024 * 1024 * 1024; i++)
    p[i] = 6;
  use (p);
  return 0;
}


Unlike in the test case submitted with the bug, here the loop is in
the source and warning on it would be justified for most programs.

Here's a test case that's closer to the one from the bug.  It also
ends up with the out of bounds memset even at -O1, during PRE.

typedef __SIZE_TYPE__ size_t;

struct S
  int *p0, *p1, *p2;

  size_t size () const { return p1 - p0; }

  void f (size_t n) {
    if (n > size ())       // can't happen because
      foo (n - size ());   //   n is in [1, MIN(size() - 1, 3)]
    else if (n < size ())
      bar (p0 + n);
  }

  void foo (size_t n)
  {
    size_t left = (size_t)(p2 - p1);
    if (left >= n)
      __builtin_memset (p2, 0, n * sizeof *p2);
  }

  void bar (int*);
};

void f (S &s)
{
  size_t n = s.size ();
  if (n > 1 && n < 5)
    s.f (n - 1);
}

Martin

Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

Reply via email to