On 12/15/2017 09:17 AM, Richard Biener wrote:
On December 15, 2017 4:58:14 PM GMT+01:00, Martin Sebor <mse...@gmail.com> wrote:
On 12/15/2017 01:48 AM, Richard Biener wrote:
On Thu, Dec 14, 2017 at 5:01 PM, Martin Sebor <mse...@gmail.com> wrote:
On 12/14/2017 03:43 AM, Richard Biener wrote:

On Wed, Dec 13, 2017 at 4:47 AM, Martin Sebor <mse...@gmail.com> wrote:

On 12/12/2017 05:35 PM, Jeff Law wrote:


On 12/12/2017 01:15 PM, Martin Sebor wrote:


Bug 83373 - False positive reported by -Wstringop-overflow, is
another example of a warning triggered by a missed optimization
opportunity, this time in the strlen pass.  The optimization
is discussed in pr78450 - strlen(s) return value can be assumed
to be less than the size of s.  The gist of it is that the result
of strlen(array) can be assumed to be less than the size of
the array (except in the corner case of last struct members).

To avoid the false positive the attached patch adds this
optimization to the strlen pass.  Although the patch passes
bootstrap and regression tests for all front-ends I'm not sure
the way it determines the upper bound of the range is 100%
correct for languages with arrays with a non-zero lower bound.
Maybe it's just not as tight as it could be.


What about something hideous like

struct fu {
  char x1[10];
  char x2[10];
  int avoid_trailing_array;
};

Where objects stored in x1 are not null terminated.  Are we in the
realm of undefined behavior at that point (I hope so)?



Yes, this is undefined.  Pointer arithmetic (either direct or
via standard library functions) is only defined for pointers
to the same object or subobject.  So even something like

 memcpy (pfu->x1, pfu->x1 + 10, 10);

is undefined.


There's nothing undefined here - computing the pointer pointing
to one-after-the-last element of an array is valid (you are just
not allowed to dereference it).


Right, and memcpy dereferences it, so it's undefined.

That's an interpretation of the standard that I don't share.

It's not an interpretation.  It's a basic rule of the languages
that the standards are explicit about.  In C11 you will find
this specified in detail in 6.5.6, paragraphs 7 and 8 (of
particular relevance to your question below is p7: "a pointer
to an object that is not an element of an array behaves the same
as a pointer to the first element of an array of length one.")

I know.

Also, if I have struct f { int i; int j; };  and an int * that points
to the j member you say I have no standard conforming way
to get at a pointer to the i member from this, right?

Correct.  See above.

Because
the pointer points to an 'int' object.  But it also points within
a struct f object!  So at least maybe
(int *)((char *)p - offsetof (struct f, j))
should be valid?

No, not really.  It works in practice but it's not well-defined.
It doesn't matter how you get at the result.  What matters is
what you start with.  As Jeff said, to derive a pointer to
distinct subobjects of a larger object you need to start with
a pointer to the larger object and treat it as an array of
chars.

Those are obviously not the constraints people use C and C++ under, so I
see no way to enforce this within GIMPLE.

There's code out there that relies on all sorts of undefined
behavior.  It's a judgment call in each instance as to how much
of such code exists and how important it is.  In this case, I'd
expect it to be confined to low-level software like OS kernels
and such whose authors use C as a more convenient assembly
language to talk directly to the hardware.  Programmers in other
domains are usually more conscious of the requirements and limited
guarantees of the language and less willing to make assumptions
based on what this or that processor lets them get away with.

That being said, it certainly is possible to enforce this
constraint within GIMPLE.  My -Wrestrict patch does it to
an extent for the memory and string built-ins.  The -Warray-bounds
patch I submitted for offsets does it for all other expressions.
Neither patch exposed any such code in the Linux kernel, so it
doesn't look like abuses of this sort are common even in low-level
code.

Martin
