Re: [PATCH] set range for strlen(array) to avoid spurious -Wstringop-overflow (PR 83373 , PR 78450)

Martin Sebor Fri, 15 Dec 2017 07:59:04 -0800

On 12/15/2017 01:48 AM, Richard Biener wrote:

On Thu, Dec 14, 2017 at 5:01 PM, Martin Sebor <mse...@gmail.com> wrote:

On 12/14/2017 03:43 AM, Richard Biener wrote:


On Wed, Dec 13, 2017 at 4:47 AM, Martin Sebor <mse...@gmail.com> wrote:


On 12/12/2017 05:35 PM, Jeff Law wrote:



On 12/12/2017 01:15 PM, Martin Sebor wrote:



Bug 83373 - False positive reported by -Wstringop-overflow, is
another example of a warning triggered by a missed optimization
opportunity, this time in the strlen pass.  The optimization
is discussed in pr78450 - strlen(s) return value can be assumed
to be less than the size of s.  The gist of it is that the result
of strlen(array) can be assumed to be less than the size of
the array (except in the corner case of last struct members).

To avoid the false positive the attached patch adds this
optimization to the strlen pass.  Although the patch passes
bootstrap and regression tests for all front-ends I'm not sure
the way it determines the upper bound of the range is 100%
correct for languages with arrays with a non-zero lower bound.
Maybe it's just not as tight as it could be.
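
For illustration only, here is a minimal sketch of the assumption
behind the optimization.  It is not the test case from either PR, and
the names struct msg, name_len and copy_name are made up.  It shows
that once strlen of a non-trailing member array is known to be less
than the size of that array, a copy into a buffer of the same size
cannot overflow.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type for illustration; not taken from PR 83373 or PR 78450.  */
struct msg {
  char name[8];
  int id;   /* keeps name from being the last member */
};

/* Return the length of the string in m->name.  For a valid string the
   result is at most sizeof m->name - 1, which is the range the strlen
   pass is meant to assume.  */
size_t name_len (const struct msg *m)
{
  size_t n = strlen (m->name);
  assert (n < sizeof m->name);
  return n;
}

/* Copy m->name into a destination of the same size as the member.
   Given the bound above, n + 1 <= sizeof m->name, so the copy fits.  */
void copy_name (char (*dst)[8], const struct msg *m)
{
  size_t n = name_len (m);
  memcpy (*dst, m->name, n + 1);
}
```

Without the range, the compiler has to treat n as unbounded, which is
the kind of situation where a bounds warning can fire even though the
code is fine.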



What about something hideous like

struct fu {
  char x1[10];
  char x2[10];
  int avoid_trailing_array;
};

Where objects stored in x1 are not null terminated.  Are we in the realm
of undefined behavior at that point (I hope so)?




Yes, this is undefined.  Pointer arithmetic (either direct or
via standard library functions) is only defined for pointers
to the same object or subobject.  So even something like

 memcpy (pfu->x1, pfu->x1 + 10, 10);

is undefined.



There's nothing undefined here - computing the pointer pointing
to one-after-the-last element of an array is valid (you are just
not allowed to dereference it).



Right, and memcpy dereferences it, so it's undefined.


That's an interpretation of the standard that I don't share.


It's not an interpretation.  It's a basic rule of the languages
that the standards are explicit about.  In C11 you will find
this specified in detail in 6.5.6, paragraphs 7 and 8 (of
particular relevance to your question below is p7: "a pointer
to an object that is not an element of an array behaves the same
as a pointer to the first element of an array of length one.")

Also, if I have struct f { int i; int j; };  and an int * that points
to the j member you say I have no standard conforming way
to get at a pointer to the i member from this, right?


Correct.  See above.

Because
the pointer points to an 'int' object.  But it also points within
a struct f object!  So at least maybe (int *)((char *)p - offsetof
(struct f, j))
should be valid?


No, not really.  It works in practice but it's not well-defined.
It doesn't matter how you get at the result.  What matters is
what you start with.  As Jeff said, to derive a pointer to
distinct subobjects of a larger object you need to start with
a pointer to the larger object and treat it as an array of
chars.
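
A minimal sketch of that approach, using the struct f from the
question.  The helper names member_i, member_j and check_members are
made up for the example.  Each pointer is derived from a pointer to
the enclosing object viewed as an array of char, rather than from
a pointer to a sibling member.

```c
#include <assert.h>
#include <stddef.h>

struct f { int i; int j; };

/* Start from the enclosing object, view it as an array of char, and
   step to the member by its offset.  */
int *member_i (struct f *pf)
{
  char *base = (char *) pf;
  return (int *) (base + offsetof (struct f, i));
}

int *member_j (struct f *pf)
{
  char *base = (char *) pf;
  return (int *) (base + offsetof (struct f, j));
}

/* Small self-check: both members are reachable from the struct pointer.  */
void check_members (void)
{
  struct f x = { 1, 2 };
  assert (member_i (&x) == &x.i);
  assert (*member_i (&x) == 1);
  assert (*member_j (&x) == 2);
}
```

The arithmetic stays within the bounds of the struct f object the
char pointer was derived from, which is what makes the result
well-defined.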

This means that pfu->x1 + 10 is a valid pointer
into *pfu no matter what you say and you can dereference it.


No.

As another, hopefully more convincing, example consider a multi-
dimensional array A[2][2].  The offset of A[i][j] is
i * sizeof A[i] + j * sizeof A[i][j].  With that, the offset of
A[1][0] is sizeof A[1] + 0, and so would be the offset of A[0][2].
But that doesn't make A[0][2] a valid reference to an element of
A (because A[0] has only two elements, A[0][0] and A[0][1]),
or A[0] + 2 a dereferenceable pointer.  It's a pointer that
points just past the last element of the array A[0].  That
there's another array right after A[0] (namely A[1]) is
immaterial, same as in the struct f example above.
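
The offsets in the two-dimensional array example can be checked with
a short sketch.  The element type char and the helper names
elem_offset and row0_end_offset are assumptions made for the example.
It only forms the one-past-the-end pointer of A[0] and never
dereferences it.

```c
#include <assert.h>
#include <stddef.h>

char A[2][2];

/* Byte offset of A[i][j] from the start of A.  Only meaningful for
   indices that name an actual element, hence the assertion.  */
size_t elem_offset (size_t i, size_t j)
{
  assert (i < 2 && j < 2);
  return i * sizeof A[0] + j * sizeof A[0][0];
}

/* Byte offset of the pointer just past the last element of A[0].
   Forming A[0] + 2 is valid; dereferencing it is not.  */
size_t row0_end_offset (void)
{
  char *end = A[0] + 2;
  return (size_t) (end - A[0]);
}
```

Both elem_offset (1, 0) and row0_end_offset () evaluate to
sizeof A[0], yet only the former corresponds to an element that may
be accessed.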

Martin
