I've spent a ton of time trying to implement the suggested
changes (far too much in large part because of my own mistakes)
but I don't think they will work.  I'll try to clean up what
I have and post it for review.  In the meantime, I wanted to
respond now in case you have suggestions or concerns about
the direction I'm taking.

But if even a few MB seems too strict, I would find having even
an exceedingly liberal limit (say 1GB) much preferable to none
at all as it makes it possible to exercise boundary conditions
such as the size overflow problem you noted below.

That sounds reasonable, as long as users with unusual needs can adjust
it with a flag, but even so I'm nervous about doing this in stage 4.  It
certainly isn't a regression.

I'm not comfortable adding a new option at this stage.  I'm also
not sure that an option to impose a static limit is the best
solution.  It seems that if we go to the trouble of making the limit
customizable it should be possible to change it without recompiling
everything (e.g., on ELF, we could check for a weak function and
call it to get the most up-to-date limit).
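To illustrate the idea (this is only a sketch; the symbol name and
the default are hypothetical, not an existing interface): on ELF
targets the weak reference resolves to a null address unless some
translation unit in the program defines it, so the limit can be
changed without recompiling anything else:

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical hook: a program wanting a different limit defines this
// function; otherwise the weak reference resolves to a null address.
extern "C" size_t __vla_size_limit () __attribute__ ((weak));

// Returns the limit to enforce, falling back to a built-in default
// when the program doesn't provide an override.
static size_t get_vla_limit ()
{
  return __vla_size_limit ? __vla_size_limit () : SIZE_MAX / 2;
}
```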

Let me restore the 4.9.3 behavior by setting the VLA size limit to
SIZE_MAX / 2 (for the record, that also fixes the other regression
I just raised in c++/70588).

I don't think modifying build_vec_init() alone would be sufficient.
For example, the function isn't called for a VLA with a constant
bound like this one:

  int A [2][N] = { 1, 2, 3, 4 };

That seems like a bug, due to array_of_runtime_bound_p returning false
for that array.

It seems that a complete fix would involve (among other things)
replacing calls to array_of_runtime_bound_p with
variably_modified_type_p or similar since the N3639 arrays are
just a subset of those accepted by G++.  Unfortunately, that has
other repercussions (e.g., c++/70555).

I replaced the call to array_of_runtime_bound_p in build_vec_init
with one to variably_modified_type_p to get around the above.
That works, but it's only good for checking for excess
initializers in build_vec_init.  It's too late to check for
overflow in the VLA bounds because by that time the code to
allocate the stack has already been emitted.

Also, I think we should check for invalid bounds in
compute_array_index_type, next to the UBsan code.  Checking bounds only
from cp_finish_decl means that we don't check uses of VLA types other
than variable declarations.

I don't see how to make this work.  compute_array_index_type
doesn't have access to the CONSTRUCTOR for the initializer of
the VLA (the initializer hasn't been parsed yet).  Without it
it's not possible to detect VLA size overflow in cases such
as in:

    T a [][N] = { { ... }, { ... } };

where the number of top-level elements determines whether or
not the size of the whole VLA would overflow or exceed the
maximum.

Given this, I believe the check does need to be implemented
somewhere in cp_finish_decl or one of the functions it calls
(such as check_initializer) and emitted before build_vec_init
is called or the initializer code it creates is emitted.
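Stripped of the front-end plumbing, the check I have in mind looks
roughly like this (a standalone sketch; the helper and its parameters
are hypothetical, not GCC internals): given the number of top-level
elements the initializer supplies and the size of each element, detect
both wraparound and exceeding the limit before any allocation code
is emitted:

```cpp
#include <stddef.h>
#include <stdint.h>

// Would an array of 'num_elts' elements of 'elt_size' bytes each
// fit under 'limit' without the multiplication wrapping size_t?
static bool vla_init_size_ok (size_t num_elts, size_t elt_size,
                              size_t limit)
{
  size_t total;
  if (__builtin_mul_overflow (num_elts, elt_size, &total))
    return false;   // the size computation itself wrapped
  return total <= limit;
}
```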


You mean VLA typedefs?  That's true, though I have consciously
avoided dealing with those.  They're outlawed in N3639 and so
I've been focusing just on variables.  But since GCC accepts
VLA typedefs too I was thinking I would bring them up at some
point in the future to decide what to do about them.

And cast to pointer to VLAs.  But for non-variable cases we don't care
about available stack, so we wouldn't want your allocation limit to apply.

I don't want to implement it now, but I think the same limit
should apply in all cases, otherwise code remains susceptible
to unsigned integer wrapping.  For example:

  extern size_t N;
  typedef int A [N];
  int *a = (int*)malloc (sizeof (A));   // possible wraparound
  a [N - 1] = 0;                        // out-of-bounds write

It seems that the typedef will need to be accepted (in case it's
unused) but the runtime sizeof would need to do the checking and
potentially throw.  I haven't thought through the ramifications
yet.


As for where to add the bounds checking code, I also at first
thought of checking the bounds parallel to the UBSan code in
compute_array_index_type() and even tried that approach. The
problem with it is that it only considers one array dimension
at a time, without knowledge of the others.  As a result,
as noted in sanitizer/70051, it doesn't correctly detect
overflows in the bounds of multidimensional VLAs.
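A concrete case of what goes wrong (the numbers and the helper are
made up for illustration): each bound of a two-dimensional VLA fits
comfortably in size_t on its own, so a check that sees one dimension
at a time is satisfied, yet the full size wraps around:

```cpp
#include <stddef.h>

// Each bound individually looks harmless to a one-dimension check:
// 2^33 fits easily in a 64-bit size_t.  But the full size of
// 'int a[n1][n2]' wraps: 2^33 * 2^33 * 4 == 2^68 == 0 modulo 2^64,
// so per-dimension checking detects nothing.
static size_t full_vla_size (size_t n1, size_t n2)
{
  return n1 * n2 * sizeof (int);   // wraps silently
}
```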

It doesn't, but I don't see why it couldn't.  It should be fine to check
each dimension for overflow separately; if an inner dimension doesn't
overflow, we can go on and consider the outer dimension.

As I explained above, I don't see how to make this work.


Incidentally, I was wondering if it would make sense to use the
overflowing calculation for both TYPE_SIZE and the sanity check when
we're doing both.

I'm not sure what you mean here.  Can you elaborate?


+          /* Avoid instrumenting constexpr functions.  Those must
+             be checked statically, and the (non-constexpr) dynamic
+             instrumentation would cause them to be rejected.  */

Hmm, this sounds wrong; constexpr functions can also be called with
non-constant arguments, and the instrumentation should be folded away
when evaluating a call with constant arguments.

You're right that constexpr functions should be checked as
well.  Unfortunately, at present, due to c++/70507 the check
(or rather the call to __builtin_mul_overflow) isn't folded
away and we end up with "error: call to internal function".
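For reference, the shape of code that trips c++/70507 is roughly the
following (a reduced sketch, not the actual patch): a constexpr
function containing a call to __builtin_mul_overflow, which at the
time isn't folded during constant-expression evaluation:

```cpp
#include <stddef.h>
#include <stdint.h>

// Reduced sketch of the pattern: the overflow check the
// instrumentation inserts, wrapped in a constexpr function.  With
// the c++/70507 bug, using checked_mul in a constant expression
// fails with "error: call to internal function" instead of the
// built-in being folded away.  Runtime calls are unaffected.
constexpr size_t checked_mul (size_t a, size_t b)
{
  size_t r = 0;
  return __builtin_mul_overflow (a, b, &r) ? 0 : r;
}
```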

Ah, sure.  It should be pretty simple to teach the constexpr code how to
handle that built-in.

I'd be glad to do this work but I don't believe I can get it done
in time for 6.0.

Martin
