On 2/11/21 1:09 AM, Richard Biener wrote:
On Wed, Feb 10, 2021 at 7:03 PM Martin Sebor <mse...@gmail.com> wrote:

On 2/10/21 3:39 AM, Richard Biener wrote:
On Tue, Feb 9, 2021 at 4:37 PM Martin Sebor <mse...@gmail.com> wrote:

On 2/9/21 12:41 AM, Richard Biener wrote:
On Tue, Feb 9, 2021 at 1:04 AM Martin Sebor via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:

On 2/8/21 12:09 PM, Jeff Law wrote:


On 2/3/21 3:45 PM, Martin Sebor wrote:
On 2/3/21 2:57 PM, Jeff Law wrote:


On 1/28/21 4:03 PM, Martin Sebor wrote:
The GCC 11 -Warray-bounds enhancement to diagnose accesses whose
leading offset is in bounds but whose trailing offset is not has
been causing some confusion.  When the warning is issued for
an access to an in-bounds member via a pointer to a struct that's
larger than the pointed-to object, phrasing this strictly invalid
access in terms of array subscripts can be misleading, especially
when the source code doesn't involve any arrays or indexing.

Since the problem boils down to an aliasing violation much more
so than an actual out-of-bounds access, the attached patch adjusts
the code to issue a -Wstrict-aliasing warning in these cases instead
of -Warray-bounds.  In addition, since the aliasing assumptions in
GCC can be disabled by -fno-strict-aliasing, the patch also makes
these instances of the warning conditional on -fstrict-aliasing
being in effect.

Martin

gcc-98503.diff

PR middle-end/98503 -Warray-bounds when -Wstrict-aliasing would be
more appropriate

gcc/ChangeLog:

        PR middle-end/98503
        * gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
        Issue -Wstrict-aliasing for a subset of violations.
        (array_bounds_checker::check_array_bounds):  Set new member.
        * gimple-array-bounds.h (array_bounds_checker::cref_of_mref): New
        data member.

gcc/testsuite/ChangeLog:

        PR middle-end/98503
        * g++.dg/warn/Warray-bounds-10.C: Adjust text of expected warnings.
        * g++.dg/warn/Warray-bounds-11.C: Same.
        * g++.dg/warn/Warray-bounds-12.C: Same.
        * g++.dg/warn/Warray-bounds-13.C: Same.
        * gcc.dg/Warray-bounds-63.c: Avoid -Wstrict-aliasing.  Adjust text
        of expected warnings.
        * gcc.dg/Warray-bounds-66.c: Adjust text of expected warnings.
        * gcc.dg/Wstrict-aliasing-2.c: New test.
        * gcc.dg/Wstrict-aliasing-3.c: New test.
What I don't like here is the strict-aliasing warnings inside the file
and analysis phase for array bounds checking.

ISTM that catching this at cast time would be better.  So perhaps in
build_c_cast and the C++ equivalent?

It would mean we're warning at the cast site rather than the
dereference, but that may ultimately be better for the warning anyway.
I'm not sure.

I had actually experimented with a this (in the middle end, and only
for accesses) but even that caused so many warnings that I abandoned
it, at least for now.  -Warray-bounds has the benefit of flow analysis
(and dead code elimination).  In the front end it would have neither
and be both excessively noisy and ineffective at the same time.  For
example:
I think we agree that this really is an aliasing issue and I would go
further and claim that it has nothing to do with array bounds checking.
Therefore warning for it within gimple-array-bounds just seems wrong.

I realize that you like having DCE run and the ability to look at
offsets and such, but it really feels like the wrong place to do this.
Aliasing issues are also more of a front-end thing since the language
defines what is and what is not valid aliasing -- one might even argue
that any aliasing warning needs to be identified by the front-ends
exclusively (though possibly pruned away later).

The aliasing restrictions are on accesses, which are [defined in
C as] execution-time actions on objects.  Analyzing execution-time
actions unavoidably depends on flow analysis which the front ends
have only very limited support for (simple expressions only).
I gave one example showing how the current -Wstrict-aliasing in
the front end is ineffective against all but the most simplistic
bugs, but there are many more.  For instance:

      int i;
      void *p = &i;    // valid
      float *q = p;    // valid
      *q = 0;          // aliasing violation

This bug is easily detectable in the middle end but impossible
to do in the front end (same as all other invalid accesses).

But the code is valid in GIMPLE which allows to re-use the 'int i' storage
with storing using a different type.

Presumably you're referring to using placement new?

No, I'm refering to the rules the GIMPLE IL follows.  GIMPLE is no
longer C or C++ and thus what is invalid in C or C++ doesn't
necessarily have to be invalid in GIMPLE.

   The warning
would have to be aware of it and either run before placement new
is inlined.  Alternatively, the inlining could add some annotation
into the IL that the warning would then use to differentiate valid
code from invalid.

At least in old versions of the C++ standard a simple "re-use" of
storage starts lifetime of a new object (for certain classes of types,
'int' for example), so no placement new is needed here.

I'm not familiar with this rule.  Can you point me to the section
of the C++ that describes it?

For example C++14 3.8 Object lifetime.  I can read 1) as

   int i; // lifetime starts - storage is obtained
   i = 1;
   foo (i);
   float *p = (float *)&i; // lifetime of *p starts - storage is obtained

At this point p is just a pointer that points to the int i.
The only new object here is the pointer p.

   *p = 1.; // lifetime of i ends, storage is re-used

I don't think that's the intended reading.  The storage of an object
can be reused by constructing another object in it, and the only way
to do that in C++ is by placement new.

Even if the store to *p was valid, it doesn't change i's type (only
placement new can do that), so there is no way to access that value.
GCC relies on that by assuming that a store through a pointer to one
type doesn't change the value stored in declared objects of another
type.  It's a common bug to disregard this rule and that's I'm
interested in detecting.  I.e., aliasing violations.

the lifetime start of an object upon obtaining storage is a bit vague
which is why GIMPLE starts the lifetime only upon the first _store_.

Now I didn't find any restriction on how "storage with the proper
alignment and size for type T is obtained".  But of course restrictions
to the above may be scattered throughout the 1000+ pages of
the standard.

Storage for an object is obtained either by its declaration or by
an allocation call.


For C similar behavior is required if you ever implement a
memory allocator that re-uses storage.  (yeah, I know that
there's the argument that C doesn't allow to implement a
memory allocator ... but that's not practical, not for GIMPLE
at least)

Allocated storage is subject to slightly different rules that those
that apply to declared objects and I'm focusing only on the latter.

(The argument that C doesn't make it possible to implement a memory
allocator is about the strictly conforming subset of C, i.e.,
the subset that excludes unspecified and implementation-defined
behavior.)

But I'm still interested in:

  What sort of a markup would you suggest to use and on what trees?
  Would a bit on INDIRECT_REF do?

and

  ...unless there is some more straightforward way to change the type
  of a declared object than placement new (I don't know of one), would
  INDIRECT_REF alone, with no markup, be a sufficient indication that
  the access doesn't modify the type of the accessed object?

Thanks
Martin


Richard.

AFAIK, C and C++ share the same aliasing restrictions with
the exception of placement operator new.  The intent, made explicit
in Footnote 37 in C++ 98, is for the memory models of the two
languages to be the same.  (The same footnote is in all published
revisions of C++).


Likewise if there are other such constructs (are there?) they would
need be marked up somehow by the front end.

If the frontend requires that a store does not change the memorys
dynamic type (for diagnostic purposes) then it would need to mark
it in a special way.  By default any store in the GIMPLE IL alters
the dynamic type of the destination.

What sort of a markup would you suggest to use and on what trees?
Would a bit on INDIRECT_REF do?

But unless there is some more straightforward way to change the type
of a declared object than placement new, why would INDIRECT_REF alone,
with no markup, not be a sufficient indication that the access doesn't
modify the type of the accessed object?

Martin


I speculate that's what Jeff was suggesting by having the FE mark
up the code.

Martin


Whether this is done in gimple-array-bounds or some other pass seems
to me like a minor implementation detail.  It naturally came out of
an enhancement I implemented there (which would detect the above
with float replaced by any larger type as an out-of-bounds access)
but I have no problem with moving this subset to some other pass
(or duplicating it there).  In fact, as I said, I'd like to enhance
-Wstrict-aliasing to detect more bugs at some point, so that might
be a good time to move this instance of the warning there as well.
But the enhancement I'm thinking of is in the middle end, not in
the front end.

In any event, the warning is valid, just the phrasing is misleading
since there in the case of the struct member there isn't necessarily
any subscripting involved or even an access to members beyond the end
of the accessed object.  Issuing it under -Warray-bounds and with
-fno-strict-aliasing compounds the problem.  I put together this
patch in response to the feedback I got from you and from the reporter
in PR 98503 where you both made this point, so I'm not sure why
improving it as both of you suggested is an issue.

Martin



Reply via email to