Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-20 Thread Jeff Law

On 01/20/2017 04:34 PM, Jakub Jelinek wrote:

On Fri, Jan 20, 2017 at 04:32:19PM -0700, Jeff Law wrote:

then the loop does the same thing as will memset (p, 6, 3U * 1024 * 1024 * 
1024);
do.  On such large objects some operations may not work properly, e.g.
[i] - [0] might be negative etc., but that is not something the above
loop does or memset will do internally.  If the loop doesn't use just 3/4 of
the address space, but much more, e.g. more than whole address space minus
one page, which is what happens in the testcase, it is indeed quite sure it
will crash if invoked, but the problem with the warning is the same with
many other late warnings or warnings excessively using VRP etc.

Not in my mind, it's different.  It's not triggered by path isolation. It's
standard const propagation + simplification.


So where does the constant -1 length appear there?  The test clearly just
attempts to clear some variable length - 1.  I admit I haven't looked at the
dumps in detail, I should...
At least in Martin's simplified test it's just a series of standard 
constant propagations and obvious simplifications.  No threading, no 
path isolation.


;;   basic block 2, loop depth 0, count 0, freq 1, maybe hot
;;prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)
;;pred:   ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
  _7 = MEM[(int * *)s_5(D)];
  _8 = MEM[(int * *)s_5(D) + 8B];
  _9 = (long int) _8;
  _10 = (long int) _7;
  _11 = _9 - _10;
  _12 = _11 /[ex] 4;
  _13 = (long unsigned int) _12;
  _1 = _13 + 18446744073709551614;
  if (_1 <= 2)
goto ; [36.64%]
  else
goto ; [63.36%]
;;succ:   3 [36.6%]  (TRUE_VALUE,EXECUTABLE)
;;8 [63.4%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 3, loop depth 0, count 0, freq 3664, maybe hot
;;prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)
;;pred:   2 [36.6%]  (TRUE_VALUE,EXECUTABLE)
  _2 = _13 + 18446744073709551615;
  _14 = MEM[(int * *)s_5(D)];
  _15 = MEM[(int * *)s_5(D) + 8B];
  _16 = (long int) _15;
  _17 = (long int) _14;
  _18 = _16 - _17;
  _19 = _18 /[ex] 4;
  _20 = (long unsigned int) _19;
  if (_2 > _20)
goto ; [50.00%]
  else
goto ; [50.00%]
;;succ:   4 [50.0%]  (TRUE_VALUE,EXECUTABLE)
;;6 [50.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 4, loop depth 0, count 0, freq 1832, maybe hot
;;prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
;;pred:   3 [50.0%]  (TRUE_VALUE,EXECUTABLE)
  _21 = _2 - _20;
  _22 = MEM[(int * *)s_5(D) + 16B];
  _23 = (long int) _22;
  _24 = _23 - _16;
  _25 = _24 /[ex] 4;
  left_26 = (size_t) _25;
  if (_21 <= left_26)
goto ; [33.00%]
  else
goto ; [67.00%]
;;succ:   5 [33.0%]  (TRUE_VALUE,EXECUTABLE)
;;8 [67.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 5, loop depth 0, count 0, freq 605, maybe hot
;;prev block 4, next block 6, flags: (NEW, REACHABLE, VISITED)
;;pred:   4 [33.0%]  (TRUE_VALUE,EXECUTABLE)
  _27 = _21 * 4;
  __builtin_memset (_22, 0, _27);
  goto ; [100.00%]
;;succ:   8 [100.0%]  (FALLTHRU,EXECUTABLE)


In particular look at _27, which is _21 * 4.

_21 is _2 - _20

If you follow things though the use-def chains and simplify you'll see 
that _2 - 20 is always -1.



Jeff



Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-20 Thread Jakub Jelinek
On Fri, Jan 20, 2017 at 04:32:19PM -0700, Jeff Law wrote:
> > then the loop does the same thing as will memset (p, 6, 3U * 1024 * 1024 * 
> > 1024);
> > do.  On such large objects some operations may not work properly, e.g.
> > [i] - [0] might be negative etc., but that is not something the above
> > loop does or memset will do internally.  If the loop doesn't use just 3/4 of
> > the address space, but much more, e.g. more than whole address space minus
> > one page, which is what happens in the testcase, it is indeed quite sure it
> > will crash if invoked, but the problem with the warning is the same with
> > many other late warnings or warnings excessively using VRP etc.
> Not in my mind, it's different.  It's not triggered by path isolation. It's
> standard const propagation + simplification.

So where does the constant -1 length appear there?  The test clearly just
attempts to clear some variable length - 1.  I admit I haven't looked at the
dumps in detail, I should...

Jakub


Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-20 Thread Jeff Law

On 01/18/2017 01:10 AM, Jakub Jelinek wrote:

On Tue, Jan 17, 2017 at 10:59:43PM -0700, Jeff Law wrote:

I agree that breaking those applications would be bad.  It could
be dealt with by adding an option to let them disable the insertion
of the trap.  With the warning, programmers would get a heads up
that their (already dubious) code won't work otherwise.  I don't
think it's a necessary or wise to have the default mode be the most
permissive (and most dangerous) and expect users to tweak options
to make it safe.  Rather, I would argue that it should be the other
way around.  Make the default safe and strict and let the advanced
users who know how deal with the risks tweak those options.

I still come back to the assertion that changing this loop to a mem* is
fundamentally the wrong thing to do as it changes something that has well
defined semantics to something that is invalid.

Thus the transformation into a mem* call is invalid.


The mem* call is as valid as the loop, it will work exactly the same.
If you have say on 32-bit target
#include 

int
main ()
{
  char *p = malloc (3U * 1024 * 1024 * 1024);
  if (p == NULL)
return 0;
  size_t i;
  for (i = 0; i < 3U * 1024 * 1024 * 1024; i++)
p[i] = 6;
  use (p);
  return 0;
}

then the loop does the same thing as will memset (p, 6, 3U * 1024 * 1024 * 
1024);
do.  On such large objects some operations may not work properly, e.g.
[i] - [0] might be negative etc., but that is not something the above
loop does or memset will do internally.  If the loop doesn't use just 3/4 of
the address space, but much more, e.g. more than whole address space minus
one page, which is what happens in the testcase, it is indeed quite sure it
will crash if invoked, but the problem with the warning is the same with
many other late warnings or warnings excessively using VRP etc.
Not in my mind, it's different.  It's not triggered by path isolation. 
It's standard const propagation + simplification.


jeff



Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-20 Thread Jeff Law

On 01/18/2017 10:45 AM, Martin Sebor wrote:

On 01/18/2017 01:10 AM, Jakub Jelinek wrote:

On Tue, Jan 17, 2017 at 10:59:43PM -0700, Jeff Law wrote:

I agree that breaking those applications would be bad.  It could
be dealt with by adding an option to let them disable the insertion
of the trap.  With the warning, programmers would get a heads up
that their (already dubious) code won't work otherwise.  I don't
think it's a necessary or wise to have the default mode be the most
permissive (and most dangerous) and expect users to tweak options
to make it safe.  Rather, I would argue that it should be the other
way around.  Make the default safe and strict and let the advanced
users who know how deal with the risks tweak those options.

I still come back to the assertion that changing this loop to a mem* is
fundamentally the wrong thing to do as it changes something that has
well
defined semantics to something that is invalid.

Thus the transformation into a mem* call is invalid.


The mem* call is as valid as the loop, it will work exactly the same.


The problem here isn't the loop itself or the memset call but the
bounds computed from the inputs.

Sorry.  My bad.  Only memcpy has the SIZE_MAX / 2 restriction.

[ ... ]
Going to work from the self-contained test...




Here's a test case that's closer to the one from the bug.  It also
ends up with the out of bounds memset even at -O1, during PRE.

typedef __SIZE_TYPE__ size_t;

struct S
  int *p0, *p1, *p2;

  size_t size () const { return p1 - p0; }

  void f (size_t n) {
if (n > size ())   // can't happen because
  foo (n - size ());   //   n is in [1, MIN(size() - 1, 3)]
else if (n < size ())
  bar (p0 + n);
  }

  void foo (size_t n)
  {
size_t left = (size_t)(p2 - p1);
if (left >= n)
  __builtin_memset (p2, 0, n * sizeof *p2);
  }

  void bar (int*);
};

void f (S )
{
  size_t n = s.size ();
  if (n > 1 && n < 5)
s.f (n - 1);
}



Looking at the .vrp1 dump for this test we find:

;;   basic block 3, loop depth 0, count 0, freq 3664, maybe hot
;;prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)
;;pred:   2 [36.6%]  (TRUE_VALUE,EXECUTABLE)
  _18 = ASSERT_EXPR <_13, _13 + 18446744073709551614 <= 2>;
  _2 = _18 + 18446744073709551615;
  if (_2 > _18)
goto ; [50.00%]
  else
goto ; [50.00%]

_2 > _18 is the same as

(_18 - 1) > _18

Which is only true if _18 is zero.  And since _18 has a computed range 
of [2..4], that can never happen.


VRP doesn't currently handle this case, though we do have some code to 
turn relational comparisons into equality comparisons.  If I manually 
turn the relational into _18 == 0 at that point, then the next VRP pass 
is able to optimize the conditional away which in turn eliminates the 
problematical memset call & warning.


So one possible approach would be to treat simplify_cond_using_ranges to 
simplify that kind of condition.  I don't know if that would fix the 
reported testcase as I haven't looked at it directly under the debugger yet.




Jeff




Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-18 Thread Martin Sebor

On 01/18/2017 01:10 AM, Jakub Jelinek wrote:

On Tue, Jan 17, 2017 at 10:59:43PM -0700, Jeff Law wrote:

I agree that breaking those applications would be bad.  It could
be dealt with by adding an option to let them disable the insertion
of the trap.  With the warning, programmers would get a heads up
that their (already dubious) code won't work otherwise.  I don't
think it's a necessary or wise to have the default mode be the most
permissive (and most dangerous) and expect users to tweak options
to make it safe.  Rather, I would argue that it should be the other
way around.  Make the default safe and strict and let the advanced
users who know how deal with the risks tweak those options.

I still come back to the assertion that changing this loop to a mem* is
fundamentally the wrong thing to do as it changes something that has well
defined semantics to something that is invalid.

Thus the transformation into a mem* call is invalid.


The mem* call is as valid as the loop, it will work exactly the same.


The problem here isn't the loop itself or the memset call but the
bounds computed from the inputs.

In the test case submitted with the bug the inputs supplied by
the program are well within a valid range, yet the bounds of
the loop that are computed by GCC from those inputs are excessive
because they are based on impossible ranges (or an incomplete view
of the program).

There is no unbounded loop in

  void f(std::vector )
  {
size_t n = v.size ();

if (n > 1 && n < 5)
  v.resize (n - 1);
  }

yet GCC synthesizes one:

  void f(std::vector&) (struct vector & v)
  {
...
 [100.00%]:
_7 = MEM[(int * *)v_5(D)]; // v._M_impl._M_start
_8 = MEM[(int * *)v_5(D) + 8B];// v._M_impl._M_finish
_9 = (long int) _8;
_10 = (long int) _7;
_11 = _9 - _10;
_12 = _11 /[ex] 4;
_13 = (long unsigned int) _12; // v.size()
_1 = _13 + 18446744073709551614;   // v.size() - 4
if (_1 <= 2)   // if (v.size() <= 6)
  goto ; [36.64%]
else
  goto ; [63.36%]

 [36.64%]:
_2 = _13 + 18446744073709551615;   // v.size() - 1
if (_2 > _13)  // if (v.size() == 0)
  goto ; [29.56%]//   this can't happen
else
  goto ; [70.44%]

 [5.85%]:
_24 = v_5(D)->D.15828._M_impl._M_end_of_storage;
_25 = (long int) _24;
_28 = _25 - _9;// bytes left
if (_28 == -4) // this can't happen
  goto ; [67.00%]
else
  goto ; [33.00%]

 [3.92%]:
__builtin_memset (_8, 0, 18446744073709551612);   // (size_t)-4
...

 [1.93%]:
std::__throw_length_error ("vector::_M_default_append");


If you have say on 32-bit target
#include 

int
main ()
{
  char *p = malloc (3U * 1024 * 1024 * 1024);
  if (p == NULL)
return 0;
  size_t i;
  for (i = 0; i < 3U * 1024 * 1024 * 1024; i++)
p[i] = 6;
  use (p);
  return 0;
}


Unlike in the test case submitted with the bug, here the loop is in
the source and warning on it would be justified for most programs.

Here's a test case that's closer to the one from the bug.  It also
ends up with the out of bounds memset even at -O1, during PRE.

typedef __SIZE_TYPE__ size_t;

struct S
  int *p0, *p1, *p2;

  size_t size () const { return p1 - p0; }

  void f (size_t n) {
if (n > size ())   // can't happen because
  foo (n - size ());   //   n is in [1, MIN(size() - 1, 3)]
else if (n < size ())
  bar (p0 + n);
  }

  void foo (size_t n)
  {
size_t left = (size_t)(p2 - p1);
if (left >= n)
  __builtin_memset (p2, 0, n * sizeof *p2);
  }

  void bar (int*);
};

void f (S )
{
  size_t n = s.size ();
  if (n > 1 && n < 5)
s.f (n - 1);
}

Martin



Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-18 Thread Jakub Jelinek
On Tue, Jan 17, 2017 at 10:59:43PM -0700, Jeff Law wrote:
> > I agree that breaking those applications would be bad.  It could
> > be dealt with by adding an option to let them disable the insertion
> > of the trap.  With the warning, programmers would get a heads up
> > that their (already dubious) code won't work otherwise.  I don't
> > think it's a necessary or wise to have the default mode be the most
> > permissive (and most dangerous) and expect users to tweak options
> > to make it safe.  Rather, I would argue that it should be the other
> > way around.  Make the default safe and strict and let the advanced
> > users who know how deal with the risks tweak those options.
> I still come back to the assertion that changing this loop to a mem* is
> fundamentally the wrong thing to do as it changes something that has well
> defined semantics to something that is invalid.
> 
> Thus the transformation into a mem* call is invalid.

The mem* call is as valid as the loop, it will work exactly the same.
If you have say on 32-bit target
#include 

int
main ()
{
  char *p = malloc (3U * 1024 * 1024 * 1024);
  if (p == NULL)
return 0;
  size_t i;
  for (i = 0; i < 3U * 1024 * 1024 * 1024; i++)
p[i] = 6;
  use (p);
  return 0;
}

then the loop does the same thing as will memset (p, 6, 3U * 1024 * 1024 * 
1024);
do.  On such large objects some operations may not work properly, e.g.
[i] - [0] might be negative etc., but that is not something the above
loop does or memset will do internally.  If the loop doesn't use just 3/4 of
the address space, but much more, e.g. more than whole address space minus
one page, which is what happens in the testcase, it is indeed quite sure it
will crash if invoked, but the problem with the warning is the same with
many other late warnings or warnings excessively using VRP etc.
If we are given
void
foo (char *p, size_t len)
{
  memset (p, '\0', len);
}
then it also can invoke UB if somebody calls it with a huge len
(~(size_t)0), but we rightly don't want to warn because we can't prove it
will be called with that.  If some unrelated or related test somewhere else
makes the value range smaller on some path or turns it constant on some
path, the warning does warn, even when there is a sometimes low, sometimes
very high possibility the warning is on dead code.
Which is why IMHO it is a very bad idea to enable this warning by default
even without any -W* option, it just has too many false positives.
Turning extra large memcpy (as memcpy can't have overlap in between the
objects, already starting from half of address space is fine; for memset
I'd think only for something like -4096 or something similar that is just
unlikely in hosted environment to be a single object) into a trap or
unreachable might be a nice optimization and allow optimizing away code that
would follow the spot that would always crash.  But with warning above it
the problem is that it is hard to figure out if it is reachable or not.

Jakub


Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-17 Thread Jeff Law

On 01/17/2017 08:16 PM, Martin Sebor wrote:

On 01/17/2017 12:38 AM, Jakub Jelinek wrote:

On Mon, Jan 16, 2017 at 05:06:40PM -0700, Martin Sebor wrote:

The test case submitted in bug 79095 - [7 regression] spurious
stringop-overflow warning shows that GCC optimizes some loops
into calls to memset with size arguments in excess of the object
size limit.  Since such calls will unavoidably lead to a buffer
overflow and memory corruption the attached patch detects them
and replaces them with a trap.  That both prevents the buffer
overflow and eliminates the warning.


I fear this is going to break various 32-bit database programs and
similar
that mmap say 3GB of RAM and then work on that memory chunk as
contiguous.
Some things don't work too well in that case (pointer differences),
but it
is unlikely they would be using those, while your patch actively
breaks it
even for loops that can be transformed into memset (memcpy of course
isn't a
problem, because you need some virtual address space to copy it from).


I agree that breaking those applications would be bad.  It could
be dealt with by adding an option to let them disable the insertion
of the trap.  With the warning, programmers would get a heads up
that their (already dubious) code won't work otherwise.  I don't
think it's a necessary or wise to have the default mode be the most
permissive (and most dangerous) and expect users to tweak options
to make it safe.  Rather, I would argue that it should be the other
way around.  Make the default safe and strict and let the advanced
users who know how deal with the risks tweak those options.
I still come back to the assertion that changing this loop to a mem* is 
fundamentally the wrong thing to do as it changes something that has 
well defined semantics to something that is invalid.


Thus the transformation into a mem* call is invalid.
jeff




Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-17 Thread Martin Sebor

On 01/17/2017 12:38 AM, Jakub Jelinek wrote:

On Mon, Jan 16, 2017 at 05:06:40PM -0700, Martin Sebor wrote:

The test case submitted in bug 79095 - [7 regression] spurious
stringop-overflow warning shows that GCC optimizes some loops
into calls to memset with size arguments in excess of the object
size limit.  Since such calls will unavoidably lead to a buffer
overflow and memory corruption the attached patch detects them
and replaces them with a trap.  That both prevents the buffer
overflow and eliminates the warning.


I fear this is going to break various 32-bit database programs and similar
that mmap say 3GB of RAM and then work on that memory chunk as contiguous.
Some things don't work too well in that case (pointer differences), but it
is unlikely they would be using those, while your patch actively breaks it
even for loops that can be transformed into memset (memcpy of course isn't a
problem, because you need some virtual address space to copy it from).


I agree that breaking those applications would be bad.  It could
be dealt with by adding an option to let them disable the insertion
of the trap.  With the warning, programmers would get a heads up
that their (already dubious) code won't work otherwise.  I don't
think it's a necessary or wise to have the default mode be the most
permissive (and most dangerous) and expect users to tweak options
to make it safe.  Rather, I would argue that it should be the other
way around.  Make the default safe and strict and let the advanced
users who know how deal with the risks tweak those options.

Martin


Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-17 Thread Martin Sebor

On 01/17/2017 10:57 AM, Jeff Law wrote:

On 01/17/2017 09:12 AM, Martin Sebor wrote:

On 01/17/2017 08:26 AM, Jeff Law wrote:

On 01/16/2017 05:06 PM, Martin Sebor wrote:

The test case submitted in bug 79095 - [7 regression] spurious
stringop-overflow warning shows that GCC optimizes some loops
into calls to memset with size arguments in excess of the object
size limit.  Since such calls will unavoidably lead to a buffer
overflow and memory corruption the attached patch detects them
and replaces them with a trap.  That both prevents the buffer
overflow and eliminates the warning.

But doesn't the creation of the bogus memset signal an invalid
transformation in the loop optimizer?  ie, if we're going to convert a
loop into a memset, then we'd damn well better be sure the loop bounds
are reasonable.


I'm not sure that emitting the memset call is necessarily a bug in
the loop optimizer (which in all likelihood wasn't written with
the goal of preventing or detecting possible buffer overflows).
The loop with the excessive bound is in the source code and can
be reached given the right inputs (calling v.resize(v.size() - 1)
on an empty vector.  It's a lurking bug in the program that, if
triggered, will overflow the vector and crash the program (or worse)
with or without the optimization.

Right, but that doesn't mean that the loop optimizer can turn it into a
memset.  If the bounds are such that we're going to invoke undefined
behaviour from memset, then the loop optimizer must leave the loop alone.



What else could the loop optimizer could do in this instance?
I suppose it could just leave the loop alone and avoid emitting
the memset call.  That would avoid the warning but mask the
problem with the overflow.  In my mind, preventing the overflow
given that we have the opportunity is the right thing to do.
That is, after all, the goal of the warning.

The right warning in this case is WRT the loop iteration space
independent of mem*.


I agree that warning for the user code would be appropriate if
the loop with the excessive bound were unavoidable or at least
reachable.  But in the submitted test case it isn't because
the call to vector::resize() is guarded.  In fact, the out of-
bounds memset is also emitted with this modified test case:

  void f (std::vector )
  {
size_t n = v.size ();

if (n > 1 && n < 5)
  v.resize (n - 1);
  }

I believe the root cause of the the out-of-bounds memset is the
lack of support for pointer ranges or at least some notion of
their relationships and constraints.  A std::vector is defined
by three pointers that satisfy the following relation:

  start <= finish <= end_of_storage

Queries about the capacity and size of a vector are done in terms
of expressions involving these three pointers:

  size_type capacity () {
return end_of_storage - start;
  }

  size_type size () {
return finish - start;
  }

Space remaining is hand-coded as

  end_of_storage - finish

The trouble is that GCC has no idea about the constraints on
the three pointers and so it treats them essentially as unrelated
integers with no ranges (except for their scale that depends on
the type the pointer points to).

Absent support for pointer ranges, GCC would be able to generate
much better code with just some help from annotations telling it
about at least some of the constraints.  In this case, for example,
adding the following optimistic invariant:

   size_type space_left = end_of_storage - finish;

   if (space_left > SIZE_MAX / 2 / sizeof (T))
 __builtin_unreachable ();

to vector::_M_default_append() lets GCC avoid the memset and emit
significantly more efficient code.  On x86_64 it reduces the number
instructions for the test case by 40%.

Martin


Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-17 Thread Jeff Law

On 01/17/2017 09:12 AM, Martin Sebor wrote:

On 01/17/2017 08:26 AM, Jeff Law wrote:

On 01/16/2017 05:06 PM, Martin Sebor wrote:

The test case submitted in bug 79095 - [7 regression] spurious
stringop-overflow warning shows that GCC optimizes some loops
into calls to memset with size arguments in excess of the object
size limit.  Since such calls will unavoidably lead to a buffer
overflow and memory corruption the attached patch detects them
and replaces them with a trap.  That both prevents the buffer
overflow and eliminates the warning.

But doesn't the creation of the bogus memset signal an invalid
transformation in the loop optimizer?  ie, if we're going to convert a
loop into a memset, then we'd damn well better be sure the loop bounds
are reasonable.


I'm not sure that emitting the memset call is necessarily a bug in
the loop optimizer (which in all likelihood wasn't written with
the goal of preventing or detecting possible buffer overflows).
The loop with the excessive bound is in the source code and can
be reached given the right inputs (calling v.resize(v.size() - 1)
on an empty vector.  It's a lurking bug in the program that, if
triggered, will overflow the vector and crash the program (or worse)
with or without the optimization.
Right, but that doesn't mean that the loop optimizer can turn it into a 
memset.  If the bounds are such that we're going to invoke undefined 
behaviour from memset, then the loop optimizer must leave the loop alone.




What else could the loop optimizer could do in this instance?
I suppose it could just leave the loop alone and avoid emitting
the memset call.  That would avoid the warning but mask the
problem with the overflow.  In my mind, preventing the overflow
given that we have the opportunity is the right thing to do.
That is, after all, the goal of the warning.
The right warning in this case is WRT the loop iteration space 
independent of mem*.





As I mentioned privately yesterday, I'm actually pleasantly
surprised that it's helped identify this opportunity in GCC itself.
My hope was to eventually go and find the places where GCC emits
potentially out of bounds calls (based on user inputs) and fix them
to emit better code on the assumption that they can't be valid or
replace them with traps if they could happen in a running program.
It didn't occur to me that the warning itself would help find them.

Martin





Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-17 Thread Martin Sebor

On 01/17/2017 08:26 AM, Jeff Law wrote:

On 01/16/2017 05:06 PM, Martin Sebor wrote:

The test case submitted in bug 79095 - [7 regression] spurious
stringop-overflow warning shows that GCC optimizes some loops
into calls to memset with size arguments in excess of the object
size limit.  Since such calls will unavoidably lead to a buffer
overflow and memory corruption the attached patch detects them
and replaces them with a trap.  That both prevents the buffer
overflow and eliminates the warning.

But doesn't the creation of the bogus memset signal an invalid
transformation in the loop optimizer?  ie, if we're going to convert a
loop into a memset, then we'd damn well better be sure the loop bounds
are reasonable.


I'm not sure that emitting the memset call is necessarily a bug in
the loop optimizer (which in all likelihood wasn't written with
the goal of preventing or detecting possible buffer overflows).
The loop with the excessive bound is in the source code and can
be reached given the right inputs (calling v.resize(v.size() - 1)
on an empty vector.  It's a lurking bug in the program that, if
triggered, will overflow the vector and crash the program (or worse)
with or without the optimization.

What else could the loop optimizer could do in this instance?
I suppose it could just leave the loop alone and avoid emitting
the memset call.  That would avoid the warning but mask the
problem with the overflow.  In my mind, preventing the overflow
given that we have the opportunity is the right thing to do.
That is, after all, the goal of the warning.

As I mentioned privately yesterday, I'm actually pleasantly
surprised that it's helped identify this opportunity in GCC itself.
My hope was to eventually go and find the places where GCC emits
potentially out of bounds calls (based on user inputs) and fix them
to emit better code on the assumption that they can't be valid or
replace them with traps if they could happen in a running program.
It didn't occur to me that the warning itself would help find them.

Martin



Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-17 Thread Jeff Law

On 01/16/2017 05:06 PM, Martin Sebor wrote:

The test case submitted in bug 79095 - [7 regression] spurious
stringop-overflow warning shows that GCC optimizes some loops
into calls to memset with size arguments in excess of the object
size limit.  Since such calls will unavoidably lead to a buffer
overflow and memory corruption the attached patch detects them
and replaces them with a trap.  That both prevents the buffer
overflow and eliminates the warning.
But doesn't the creation of the bogus memset signal an invalid 
transformation in the loop optimizer?  ie, if we're going to convert a 
loop into a memset, then we'd damn well better be sure the loop bounds 
are reasonable.


Jeff


Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

2017-01-16 Thread Jakub Jelinek
On Mon, Jan 16, 2017 at 05:06:40PM -0700, Martin Sebor wrote:
> The test case submitted in bug 79095 - [7 regression] spurious
> stringop-overflow warning shows that GCC optimizes some loops
> into calls to memset with size arguments in excess of the object
> size limit.  Since such calls will unavoidably lead to a buffer
> overflow and memory corruption the attached patch detects them
> and replaces them with a trap.  That both prevents the buffer
> overflow and eliminates the warning.

I fear this is going to break various 32-bit database programs and similar
that mmap say 3GB of RAM and then work on that memory chunk as contiguous.
Some things don't work too well in that case (pointer differences), but it
is unlikely they would be using those, while your patch actively breaks it
even for loops that can be transformed into memset (memcpy of course isn't a
problem, because you need some virtual address space to copy it from).

And as written in the PR, IMNSHO the warning should not be enabled by
default at its current verboseness and false positive rate.

Jakub