[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-05-02 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #63 from rguenther at suse dot de  ---
On Wed, 2 May 2018, aph at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> 
> --- Comment #62 from Andrew Haley  ---
> Just a bit of clarification:
> 
> (In reply to James Kuyper Jr. from comment #59)
> > 
> > > 1) all type-based alias analysis is effectively impossible
> > 
> > Alias analysis is only affected by the special guarantee if
> > a) the types involved are both struct types
> 
> > b) both struct types are members of the same union
> > c) the struct types share a common initial sequence
> 
> OK to all of those.
> 
> > d) the code in question inspects the value of one of the members of the
> > common initial sequence.
> 
> While this is a reasonable inference from what the text of the
> standard says, type-based alias analysis, by definition, does not pay
> any attention to what any piece of code does.  The analysis is purely
> type-based: that is to say, it only uses the types, and the only
> question it answers is "Do these types alias?"
> 
> > e) a completed declaration of the union type that they are members
> > of is visible at the point in the code where the inspection occurs.
> 
> As explained elsewhere, TBAA doesn't use visibility as a criterion.
> 
> > It seems to me that the overwhelming majority of cases will fail to
> > meet at least one of those requirements, so type-based alias
> > analysis is still possible, it's just made more complicated by the
> > need to check for those things.
> 
> That's not quite right, as explained above.  If you use information
> other than types in alias analysis, it's no longer TBAA.  It is a
> fundamental principle of TBAA that the result of an aliasing query
> never changes for any pair of types.
> 
> We are extremely unlikely to redesign a big part of the optimizer for
> this dusty corner case.

Just something I noticed.  The standard says
"it is permitted to inspect the common initial part of any of them"
and GCC already allows that.  But the testcase in this PR accesses
this common initial part via the actual structure types containing
the common initial sequence.  GCC has maintained the interpretation
of the standard that for struct S *p; an access like p->x is an
access of *p with respect to TBAA analysis.  But the standard doesn't
say you may access both structures containing the initial sequence;
it only says you may inspect the common initial part.  So if you
do

int f (struct t1 *p1, struct t2 *p2)
{
// union U visible here, p1->m and p2->m may alias
int *x = &p1->m;
int *y = &p2->m;
if (*x < 0)
*y = -*y;

return *x;
}

then it will work just fine.

I guess we all agree that the standard's wording isn't 100% clear
and that it should be improved.

It may of course be that GCC's interpretation that p->x is an
access of *p isn't correct.  But then I don't need the union
clause because if p1->m is an access of 'int' only with
respect to TBAA then of course 'int' aliases 'int'.
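
A minimal, self-contained sketch of the variant described above. The declarations of struct t1, struct t2 and union U are assumed here (they follow the shape of the testcase under discussion), and demo() is a hypothetical driver added only for illustration. Because both accesses go through int lvalues, the aliasing question becomes 'int' versus 'int':

```c
struct t1 { int m; };
struct t2 { int m; };

/* Assumed declaration: both struct types are members of one union. */
union U {
    struct t1 s1;
    struct t2 s2;
};

/* The member accesses are made through int lvalues, not through *p1 or *p2. */
int f(struct t1 *p1, struct t2 *p2)
{
    int *x = &p1->m;
    int *y = &p2->m;
    if (*x < 0)
        *y = -*y;

    return *x;
}

/* Hypothetical driver: both pointers refer to the same storage. */
int demo(void)
{
    union U u;
    u.s1.m = -1;
    return f(&u.s1, &u.s2);
}
```

With this form the store through y has to be assumed to affect the later load through x, so demo() should see the negated value.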

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-05-02 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #62 from Andrew Haley  ---
Just a bit of clarification:

(In reply to James Kuyper Jr. from comment #59)
> 
> > 1) all type-based alias analysis is effectively impossible
> 
> Alias analysis is only affected by the special guarantee if
> a) the types involved are both struct types

> b) both struct types are members of the same union
> c) the struct types share a common initial sequence

OK to all of those.

> d) the code in question inspects the value of one of the members of the
> common initial sequence.

While this is a reasonable inference from what the text of the
standard says, type-based alias analysis, by definition, does not pay
any attention to what any piece of code does.  The analysis is purely
type-based: that is to say, it only uses the types, and the only
question it answers is "Do these types alias?"

> e) a completed declaration of the union type that they are members
> of is visible at the point in the code where the inspection occurs.

As explained elsewhere, TBAA doesn't use visibility as a criterion.

> It seems to me that the overwhelming majority of cases will fail to
> meet at least one of those requirements, so type-based alias
> analysis is still possible, it's just made more complicated by the
> need to check for those things.

That's not quite right, as explained above.  If you use information
other than types in alias analysis, it's no longer TBAA.  It is a
fundamental principle of TBAA that the result of an aliasing query
never changes for any pair of types.

We are extremely unlikely to redesign a big part of the optimizer for
this dusty corner case.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-05-01 Thread davmac at davmac dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #61 from Davin McCall  ---
(In reply to James Kuyper Jr. from comment #59)
> (In reply to Davin McCall from comment #56)
> > (In reply to James Kuyper Jr. from comment #55)
> > > The problem is, you're using a statement that the access must occur via a
> > > union, with the implication that the code in question does not access the
> > > member through the union.
> > 
> > If "via a union" allows that at some point that the address of a union
> > member was taken and that pointer is then dereferenced, and type punning via
> > a union is allowed (as is implied by another footnote in the same section),
> > then:
> 
> Footnote 95 is the only one I can find which allows type punning via a union
> - is that the one you're referring to? Footnote 95 makes absolutely no use
> of the word "via". It says, quite explicitly, "the member used to read the
> contents of a union", and therefore can't apply when not directly using an
> actual member to read it.

Footnote 95 is a footnote, and therefore non-normative, so any text
that would limit type punning to only direct union member access would have to
appear in the normative text, but it doesn't. In an expression "u.a" where u is a
union object, the result "is an lvalue if" (u) "is an lvalue" (assume for this
example that it is) and also the value is "that of the named member" (a). If I
apply the "&" operator to that (as in "&u.a"), then "the result is a pointer to
the object or function designated by its operand". When I then de-reference
that pointer, "the result is an lvalue designating the object", exactly as the
result of the member access operator. When the lvalue is used as per 6.3.2.1 it
is "converted to the value stored in the designated object" which, if it is
referring to a member object, seems to me to be the same as saying that it is
converted to the value "of the named member", as with direct member access.
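
The two access paths compared in the paragraph above can be sketched as follows. This is only an illustration with no punning involved: union V and both_agree() are hypothetical names, and the same member is written and read throughout.

```c
/* Hypothetical union used only for this illustration. */
union V {
    int a;
    float b;
};

/* Returns 1 if reading u.a directly and reading it through a pointer
   obtained with &u.a yield the same value. */
int both_agree(int value)
{
    union V u;
    u.a = value;

    int direct = u.a;      /* member access operator */

    int *p = &u.a;         /* pointer to the object designated by u.a */
    int indirect = *p;     /* lvalue designating the same object */

    return direct == indirect;
}
```

The sketch does not settle whether the two forms are equivalent when a different member was last stored, which is the point in dispute.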

Since the rest of my points hinged on the idea that it is inconsistent to believe
that the clause regarding the common initial sequence applies universally while
type punning requires direct/immediate use of the union, and you disagree with
that tenet, then I don't see a need to go through the other points one by one
and I will let the matter rest unless you would like to discuss it privately
via email (to avoid making even more noise here).

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-05-01 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #60 from Andrew Haley  ---
(In reply to James Kuyper Jr. from comment #51)
> (In reply to Andrew Haley from comment #49)
> > (In reply to James Kuyper Jr. from comment #46)
> > 
> > The principle of type-based alias analysis is that all you know about
> > two types is their types, not the location of any code that uses them.
> > There are no scopes.  The oracle, given only the types, has to say
> > whether they alias or not, regardless of where those types are used in
> > a program.  The location isn't an input to the oracle.
> > 
> > Bear in mind that inlining and other kinds of code motion happen, and
> > code is often evaluated "outside" the scopes in which it was written
> > and in a completely different order.  That's all perfectly normal
> > optimization.
> > 
> > Besides, when the alias oracle is consulted, all that scope stuff has
> > gone.  It's only relevant to the front end.
> 
> I was only pointing out that implementing this special guarantee where it
> applies, and only where it applies, requires keeping information that must
> already have been collected. If the current design discards that information
> before performing the relevant optimizations, I can understand that this
> would require a significant re-design - but the re-design takes the form of
> saving information already collected, not of collecting additional
> information.

Well, yes, but this is all stuff we already know.  A total redesign of
alias analysis is not going to happen just for this rule.

> > > > So, if any union types with a common initial sequence are declared
> > > > anywhere in a program, then their member types alias.
> > > 
> > > As I understand it, the visibility rule was added specifically for
> > > the purpose of NOT requiring that the entire program be covered by
> > > this exception.
> > 
> > I don't think so.  As I read it, it was a way of declaring to the
> > compiler that the types are intended to alias.
>
> By "the visibility rule", I mean, very specifically, the phrase
> "anywhere that a declaration of the completed type of the union is
> visible". If the intent had been to disable aliasing throughout the
> entire program, that intent could have been expressed by simply
> removing those words entirely; if there was any doubt that people
> would understand the absence of those words correctly, then they
> could have been replaced with the phrase "anywhere, regardless of
> whether or not the completed type of the union was visible". I don't
> see any plausible reason for the committee to write "anywhere that a
> declaration of the completed type of the union is visible", unless
> that phrase was intended to restrict applicability of the special
> guarantee.

And in 1990s compiler technology it might well have been possible to
restrict the effect of this to a single function.  Back then it was
commonplace to parse a function, generate code, and then throw
everything except the code away.  But compiler technology has moved a
long way since then and it is inevitable that if we are to honour N685
we must coarsen the effect of the visibility of the union.

> > > Knowledgeable people writing code intended to take advantage of this
> > > feature of C are likely to carefully place completed declarations of
> > > the union's type so they disable those optimizations only where they
> > > need to be disabled, and to minimize the amount of code where this
> > > exception would unnecessarily disable useful optimizations.
> > 
> > Perhaps so, yes, but in practice it'd be pretty hard to do that.
> > Functions can only be defined in the outer scope, and there's no way
> > to undefine a union type. 
> 
> True, but failing to define the union type is quite trivial. If I were
> writing code that used both struct types, but not the union type, and did
> nothing that relied upon the fact that they can alias each other, I would
> simply not #include the header that defines the completed union type,
> #including only the header that defines the struct types.

That'll be fine if you're only compiling a single translation unit at
a time.  If you're using link-time optimization, however, then the
effect of declaring structs in a union will inevitably result in those
structs being treated as aliases for the entire program being linked,
for the simple reason that the alias oracle always returns the same
answer when asked if two types are aliases.  Therefore, if they're
aliases anywhere they must be aliases everywhere.
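
A toy model of the principle described above, to make the consequence concrete. This is not GCC's implementation: the type identifiers, the table and both function names are hypothetical. The only point it illustrates is that a query which takes two types and nothing else cannot give different answers in different scopes or translation units.

```c
#include <stdbool.h>

/* Hypothetical type identifiers for this toy model. */
enum type_id { TY_INT, TY_FLOAT, TY_STRUCT_T1, TY_STRUCT_T2, TY_COUNT };

/* One program-wide table: no scope or source location is stored. */
static bool alias_table[TY_COUNT][TY_COUNT];

/* Called when some union containing both types is seen anywhere. */
void record_may_alias(enum type_id a, enum type_id b)
{
    alias_table[a][b] = true;
    alias_table[b][a] = true;
}

/* The query takes two types and nothing else. */
bool types_may_alias(enum type_id a, enum type_id b)
{
    return a == b || alias_table[a][b];
}
```

In this model, once record_may_alias() has been called for a pair of struct types, every later query for that pair returns true, regardless of where the querying code was originally written.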

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-05-01 Thread jameskuyper at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #59 from James Kuyper Jr. ---
(In reply to Davin McCall from comment #56)
> (In reply to James Kuyper Jr. from comment #55)
> > The problem is, you're using a statement that the access must occur via a
> > union, with the implication that the code in question does not access the
> > member through the union.
> 
> If "via a union" allows that at some point that the address of a union
> member was taken and that pointer is then dereferenced, and type punning via
> a union is allowed (as is implied by another footnote in the same section),
> then:

Footnote 95 is the only one I can find which allows type punning via a union -
is that the one you're referring to? Footnote 95 makes absolutely no use of the
word "via". It says, quite explicitly, "the member used to read the contents of
a union", and therefore can't apply when not directly using an actual member to
read it. What I said about allowing indirect access was specific to the special
guarantee from 6.5.2.3p6, and was not in any way intended to imply that
indirect  type punning using a union is allowed when that guarantee doesn't
apply.

> 1) all type-based alias analysis is effectively impossible

Alias analysis is only affected by the special guarantee if
a) the types involved are both struct types
b) both struct types are members of the same union
c) the struct types share a common initial sequence
d) the code in question inspects the value of one of the members of the common
initial sequence.
e) a completed declaration of the union type that they are members of is
visible at the point in the code where the inspection occurs.

It seems to me that the overwhelming majority of cases will fail to meet at
least one of those requirements, so type-based alias analysis is still
possible, it's just made more complicated by the need to check for those
things.


> The real problem is that "it is permitted to inspect" doesn't say how one
> should perform an "inspection" nor what the result should be. You want it to
> mean "access (read) the structure member in the normal way and have its
> value match that of the corresponding structure member from the common
> initial sequence of the active member". But the "special guarantee" grants a
> permission, which is most easily read as not doing anything other than
> specifying that a certain action (reading a struct member) doesn't have
> undefined behaviour in certain circumstances.

Well, that's sufficiently vague that I can agree with it. It's the fact that,
in other circumstances, the behavior is undefined, that allows optimizations
that would fail if the pointers alias each other. Such optimizations are
therefore not allowed in the circumstances where 6.5.2.3p6 applies.

> It's not even actually explicated that the value read should match that of
> the corresponding common-initial-sequence member of the struct object that
> is the active member of the union object in question; in thinking that it
> should be, we're already making the assumption that this clause is intended
> to permit a certain case of type-punning. But, as I noted above, if
> type-punning is generally allowed,

Which is NOT what I claimed.

> ... and if accessing via the union
> "immediately" has the same  semantics as taking the address of the union
> member and accessing via the resulting pointer

Which I only claimed to be true when the special guarantee applies.

> ... - then the clause isn't
> necessary anyway,

Since I didn't make the general claims you're asserting that I made, the clause
is necessary.

> ... except to mandate that the common-initial-sequence layout
> is identical between distinct structs which are punned in this way, and in
> that case what is the point of requiring that the union declaration be
> visible?

It's a pre-condition for the indirect access to be valid, which would otherwise
not be allowed.

> ... (Unless you want to argue that the point is to mandate the common
> initial sequence layout is necessarily identical only if the union
> declaration is visible;

No, I think the primary point is to disallow optimizations based upon the
normal assumption that the two struct types can't alias each other. However,
the standard does not otherwise constrain the layout of any member of a struct
type other than the first, so in any context where 6.5.2.3p6 doesn't interfere,
the common initial sequence is allowed to have different layouts in the
different struct types. I can't imagine any good reason for an implementation
to do so - I'd expect that for any given value of n, for any given
implementation of C, the location within a struct of the nth member is
determined uniquely by the types of the preceding members - but the standard
doesn't require that to be the case.
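
The expectation stated above can be checked on a given implementation with offsetof. The struct names and members below are hypothetical, and the result is an observation about the implementation being used, not something this sketch claims the standard requires.

```c
#include <stddef.h>

/* Two hypothetical structs sharing the common initial sequence (tag, len). */
struct a { char tag; int len; double x; };
struct b { char tag; int len; char c; };

/* Returns 1 if the members of the common initial sequence sit at the
   same offsets in both struct types on this implementation. */
int common_prefix_matches(void)
{
    return offsetof(struct a, tag) == offsetof(struct b, tag)
        && offsetof(struct a, len) == offsetof(struct b, len);
}
```

On mainstream ABIs the offset of a member is determined by the types of the members that precede it, so the function would be expected to return 1 there.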

> So for your interpretation I believe you need that either:
> 
> 1) type punning via a union is not normally permissible, despite the
> footnote claiming it is, and

It is permitted to use a 

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-05-01 Thread davmac at davmac dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #58 from Davin McCall  ---
(In reply to Andrew Haley from comment #57)
> (In reply to Davin McCall from comment #52)
> > (In reply to Andrew Haley from comment #45)
> > > (In reply to Davin McCall from comment #44)
> > > > The "one special guarantee" clause appears in the section describing union
> > > > member access via the "." or "->" operators, implying that it only applies
> > > > to the access of union members via the union.
> > > 
> > > I don't believe that's what is intended, or that you can make such a
> > > conclusion based on the section in which the rule appears.  It applies
> > > to other accesses too, as is (somewhat) made clear by the rationale in
> > > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n685.htm:
> > 
> > It certainly may not be what is intended by N685, but I think it's normally
> > reasonable to conclude that a statement in a particular section of a
> > document applies to that section and not more universally than that; in this
> > case, the "universal" interpretation flatly contradicts the strict aliasing
> > rule and any other rule which would otherwise disallow access, which seems
> > extremely problematic to me.
> > 
> > In general it appears the committee have asserted that the "universal"
> > interpretation (which since N685 requires visibility of the union
> > declaration to be effective) is the correct one, but my argument
> 
> ... doesn't really matter from a practical point of view, does it?
> That ship has sailed.

Well, if the amendment doesn't make sense, I'd say it matters from a practical
point of view, yes. It can always be amended again.

> > is that the actual text of the standard strongly implies something
> > different, and that the interpretation being pushed instead turns
> > another portion of the standard text into nonsense.
> 
> I don't think it really does, but I think we're done.

I've laid it out as best as I can in comment #56, and certainly don't have more
to add.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-05-01 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #57 from Andrew Haley  ---
(In reply to Davin McCall from comment #52)
> (In reply to Andrew Haley from comment #45)
> > (In reply to Davin McCall from comment #44)
> > > The "one special guarantee" clause appears in the section describing union
> > > member access via the "." or "->" operators, implying that it only applies
> > > to the access of union members via the union.
> > 
> > I don't believe that's what is intended, or that you can make such a
> > conclusion based on the section in which the rule appears.  It applies
> > to other accesses too, as is (somewhat) made clear by the rationale in
> > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n685.htm:
> 
> It certainly may not be what is intended by N685, but I think it's normally
> reasonable to conclude that a statement in a particular section of a
> document applies to that section and not more universally than that; in this
> case, the "universal" interpretation flatly contradicts the strict aliasing
> rule and any other rule which would otherwise disallow access, which seems
> extremely problematic to me.
> 
> In general it appears the committee have asserted that the "universal"
> interpretation (which since N685 requires visibility of the union
> declaration to be effective) is the correct one, but my argument

... doesn't really matter from a practical point of view, does it?
That ship has sailed.

> is that the actual text of the standard strongly implies something
> different, and that the interpretation being pushed instead turns
> another portion of the standard text into nonsense.

I don't think it really does, but I think we're done.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-05-01 Thread davmac at davmac dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #56 from Davin McCall  ---
(In reply to James Kuyper Jr. from comment #55)
> The problem is, you're using a statement that the access must occur via a
> union, with the implication that the code in question does not access the
> member through the union.

If "via a union" allows that at some point that the address of a union member
was taken and that pointer is then dereferenced, and type punning via a union
is allowed (as is implied by another footnote in the same section), then:

1) all type-based alias analysis is effectively impossible
2) the "special guarantee" clause is completely redundant, and the requirement
for visibility of the union declaration doubly so.

The real problem is that "it is permitted to inspect" doesn't say how one
should perform an "inspection" nor what the result should be. You want it to
mean "access (read) the structure member in the normal way and have its value
match that of the corresponding structure member from the common initial
sequence of the active member". But the "special guarantee" grants a
permission, which is most easily read as not doing anything other than
specifying that a certain action (reading a struct member) doesn't have
undefined behaviour in certain circumstances.

It's not even actually explicated that the value read should match that of the
corresponding common-initial-sequence member of the struct object that is the
active member of the union object in question; in thinking that it should be,
we're already making the assumption that this clause is intended to permit a
certain case of type-punning. But, as I noted above, if type-punning is
generally allowed, and if accessing via the union "immediately" has the same 
semantics as taking the address of the union member and accessing via the
resulting pointer - then the clause isn't necessary anyway, except to mandate
that the common-initial-sequence layout is identical between distinct structs
which are punned in this way, and in that case what is the point of requiring
that the union declaration be visible? (Unless you want to argue that the point
is to mandate the common initial sequence layout is necessarily identical only
if the union declaration is visible; however, since the layout necessarily
applies to the rest of the program also, why should it matter where the union
declaration is?).

So for your interpretation I believe you need that either:

1) type punning via a union is not normally permissible, despite the footnote
claiming it is, and
2) a lot of production code is broken.

or

1) type punning via a union is permissible and the "special guarantee" clause
serves only to enforce common layout of structs, and the union declaration
amendment is not sensible, and
2) TBAA is impossible and most current compilers are broken.

or

1) type punning via a union is permissible, but the semantics of accessing a
member of the union "immediately" do differ to those of taking the address of
the member and later dereferencing it, despite the fact that the text does not
explicate this, and
2) the "special guarantee" clause changes the semantics of "indirect" union
member access to match those of "direct" member access, in specified cases,
despite that the present wording only dances around this topic without ever
touching it.

> The standard explicitly says, referring to the same example mentioned in DR
> 257, that the second code fragment is not valid, but only "because the union
> type is not visible within function f", implying that it would be valid if
> the declaration of the union type were moved so that it would be visible
> inside f(). If it were so moved, it would be essentially equivalent to the
> code which was the original defect report. While examples are non-normative,
> that example implies that the visibility clause was intended to actually
> serve a purpose (and it seems obvious to me that it actually does so).

I'm not arguing that N685 wasn't intended to do exactly as you suggest, but I'm
not sure the wording pre-amendment really suffered from the problem that N685
supposedly addresses, and I certainly don't agree that the amended wording is
clear in meaning.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread jameskuyper at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #55 from James Kuyper Jr. ---
(In reply to Davin McCall from comment #54)
> (In reply to James Kuyper Jr. from comment #53)
> > [...] However, because those
> > pointers are passed to f(), which does dereference them, f() does access
> > those members, and it does so via the use of the '.' operator in the calling
> > routine. Therefore, you need, at a minimum, to modify "accesses via" to
> > "accesses directly via", in order to convey your intended meaning.
> 
> I thought it was clear that I was referring to access via the union. That is
> certainly what I did mean.

The problem is, you're using a statement that the access must occur via a
union, with the implication that the code in question does not access the
member through the union. The code in the original bug report does in fact
access the members through the union - indirectly, but through the union. It's
not possible to bypass the u.s1 step; the fact that u.s1 is the operand of an &
operator and the resulting pointer value is an argument to a function call, and
that the called function is the one that actually accesses the member through
that pointer, does not change the fact that the access came about as the result
of the use of the '.' operator on a union object. Therefore, if you want your
wording to convey your belief that such indirect use of the member selection
operator is excluded, your wording needs modification to make that clear. Of
course, if so modified, it would be saying something with no actual support in
the C standard.

> > I don't see anything in the standard's wording of 6.5.2.3p6 to justify
> > restricting what it says to direct accesses - it says "it is permitted to
> > inspect", without specifying restrictions on how the inspection may be
> > performed.
> 
> As I have said, it is in a section regarding access and in a paragraph
> discussing "use of unions". While I understand what you are saying, I don't
> feel my own interpretation is really that difficult to fathom, and I'm not
> the only one to take it. See http://archive.is/PnW28 (DR 257).

True, but keep in mind that the committee did not agree with his objections.
The example he was complaining about is still present in the current version of
the standard, without any changes that address the issues he raised (I happen
to agree with him that it would have been better to use a common initial
sequence with a length greater than one member, and to use a member other than
the first one for the example).

> > The words "anywhere that a declaration of the completed type of the union
> > is visible." would become pointless with your interpretation.
> 
> Yes, as I already said.

The standard explicitly says, referring to the same example mentioned in DR
257, that the second code fragment is not valid, but only "because the union
type is not visible within function f", implying that it would be valid if the
declaration of the union type were moved so that it would be visible inside
f(). If it were so moved, it would be essentially equivalent to the code which
was the original defect report. While examples are non-normative, that example
implies that the visibility clause was intended to actually serve a purpose
(and it seems obvious to me that it actually does so).

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread davmac at davmac dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #54 from Davin McCall  ---
(In reply to James Kuyper Jr. from comment #53)
> [...] However, because those
> pointers are passed to f(), which does dereference them, f() does access
> those members, and it does so via the use of the '.' operator in the calling
> routine. Therefore, you need, at a minimum, to modify "accesses via" to
> "accesses directly via", in order to convey your intended meaning.

I thought it was clear that I was referring to access via the union. That is
certainly what I did mean.

> 
> I don't see anything in the standard's wording of 6.5.2.3p6 to justify
> restricting what it says to direct accesses - it says "it is permitted to
> inspect", without specifying restrictions on how the inspection may be
> performed.

As I have said, it is in a section regarding access and in a paragraph
discussing "use of unions". While I understand what you are saying, I don't
feel my own interpretation is really that difficult to fathom, and I'm not the
only one to take it. See http://archive.is/PnW28 (DR 257).

> The words "anywhere that a declaration of the completed type of the union
> is visible." would become pointless with your interpretation.

Yes, as I already said.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread jameskuyper at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #53 from James Kuyper Jr. ---
(In reply to Davin McCall from comment #52)
> (In reply to James Kuyper Jr. from comment #48)
> > > The "one special guarantee" clause appears in the section describing union
> > > member access via the "." or "->" operators, implying that it only applies
> > > to the access of union members via the union. ...
> > 
> > I find nothing objectionable about that statement - it is indeed impossible
> > to create code which relies upon the special guarantee in 6.5.2.3p6 without
> > accessing the union members via the '.' or '->' operators. However, I
> > believe that you mean something more restricted than what you're actually
> > saying, because the code given in the original bug report does in fact
> > access the union members via '.' operator, in the expressions &u.s1 and
> > &u.s2, to create a situation where, as I understand it, that special
> > guarantee is fully applicable.
> > Could you expand on your description of what you think is required, to make
> > it clear why it doesn't apply in this case?
> 
> It isn't clear that "&u.s1" for example actually accesses either "u" or its
> member "s1", and I would argue that it doesn't for either.

I agree: that expression does not access u or s1. However, because those
pointers are passed to f(), which does dereference them, f() does access
those members, and it does so via the use of the '.' operator in the calling
routine. Therefore, you need, at a minimum, to modify "accesses via" to
"accesses directly via", in order to convey your intended meaning.

I don't see anything in the standard's wording of 6.5.2.3p6 to justify
restricting what it says to direct accesses - it says "it is permitted to
inspect", without specifying restrictions on how the inspection may be
performed.

The words "anywhere that a declaration of the completed type of the union
is visible." would become pointless with your interpretation. You already need
a visible complete declaration of the union to directly access its members
without violating a constraint. Those words are only needed if the guarantee
was intended to apply even when the access is not direct.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread davmac at davmac dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #52 from Davin McCall  ---
(In reply to James Kuyper Jr. from comment #48)
> > The "one special guarantee" clause appears in the section describing union
> > member access via the "." or "->" operators, implying that it only applies
> > to the access of union members via the union. ...
> 
> I find nothing objectionable about that statement - it is indeed impossible
> to create code which relies upon the special guarantee in 6.5.2.3p6 without
> accessing the union members via the '.' or '->' operators. However, I
> believe that you mean something more restricted than what you're actually
> saying, because the code given in the original bug report does in fact
> access the union members via '.' operator, in the expressions  and
> , to create a situation where, as I understand it, that special
> guarantee is fully applicable.
> Could you expand on your description of what you think is required, to make
> it clear why it doesn't apply in this case?

It isn't clear that "" for example actually accesses either "u" or its
member "s1", and I would argue that it doesn't for either. I read it how (if I
understand correctly) GCC has up until now interpreted it: the "special
guarantee" is for expressions directly involving member access via the union.
Once you take the address of the member, and later dereference it via "*", you
are dealing with a different operator and the guarantee doesn't apply.
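
The two kinds of access being distinguished can be sketched in C. This is only
an illustration: the names s1, s2, u_t, read_direct and write_then_read are
hypothetical and are not taken from the bug report.

```c
/* Hypothetical types: two structs sharing the common initial sequence { int m; }. */
struct s1 { int m; };
struct s2 { int m; float f; };
union u_t { struct s1 s1; struct s2 s2; };

/* Direct access: both the store and the load name the member through the
   union object, which is the case every reading in this thread accepts. */
int read_direct(union u_t *u)
{
    u->s1.m = 1;
    return u->s2.m;   /* inspects the common initial sequence via the union */
}

/* Indirect access: the callee sees only two struct pointers.  Whether the
   special guarantee still applies when p1 and p2 point into the same union
   object is the disputed question, so no result is claimed for that case. */
int write_then_read(struct s1 *p1, struct s2 *p2)
{
    p1->m = 1;
    p2->m = 2;
    return p1->m;
}
```

Calling write_then_read() with two distinct objects is uncontroversial; the
disagreement is only about calling it with pointers to two members of one
union object.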

I'll admit that this is still making some assumptions, but it's an
interpretation that is far more at peace with the rest of the standard.

(In reply to Andrew Haley from comment #45)
> (In reply to Davin McCall from comment #44)
> > The "one special guarantee" clause appears in the section describing union
> > member access via the "." or "->" operators, implying that it only applies
> > to the access of union members via the union.
> 
> I don't believe that's what is intended, or that you can make such a
> conclusion based on the section in which the rule appears.  It applies
> to other accesses too, as is (somewhat) made clear by the rationale in
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n685.htm:

It certainly may not be what is intended by N685, but I think it's normally
reasonable to conclude that a statement in a particular section of a document
applies to that section and not more universally than that; in this case, the
"universal" interpretation flatly contradicts the strict aliasing rule and any
other rule which would otherwise disallow access, which seems extremely
problematic to me.

In general it appears the committee have asserted that the "universal"
interpretation (which since N685 requires visibility of the union declaration
to be effective) is the correct one, but my argument is that the actual text of
the standard strongly implies something different, and that the interpretation
being pushed instead turns another portion of the standard text into nonsense.
It's extremely problematic in my view that a more reasonable reading is
considered incorrect and that this can only be known with external knowledge
outside the text of the specification itself.

Nevertheless, I take your point.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread jameskuyper at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #51 from James Kuyper Jr.  ---
(In reply to Andrew Haley from comment #49)
> (In reply to James Kuyper Jr. from comment #46)
> > (In reply to Andrew Haley from comment #42)
> > ...
> > > In order to use type-based alias analysis in any LTO framework it's
> > > necessary to save type information, and this is just more type
> > > information. ...
> > 
> > > ... The question is, I suppose, how to handle the scopes of
> > > union declarations.  I'd just treat them as being global, which in
> > > practice isn't unrealistic because such declarations are in header
> > > files in global scope and shared anyway.
> > 
> > Why not use the actual scope of the completed union declaration,
> > which is what the relevant rule refers to?
> 
> The principle of type-based alias analysis is that all you know about
> two types is their types, not the location of any code that uses them.
> There are no scopes.  The oracle, given only the types, has to say
> whether they alias or not, regardless of where those types are used in
> a program.  The location isn't an input to the oracle.
> 
> Bear in mind that inlining and other kinds of code motion happen, and
> code is often evaluated "outside" the scopes in which it was written
> and in a completely different order.  That's all perfectly normal
> optimization.
> 
> Besides, when the alias oracle is consulted, all that scope stuff has
> gone.  It's only relevant to the front end.

I was only pointing out that implementing this special guarantee where it
applies, and only where it applies, requires keeping information that must
already have been collected. If the current design discards that information
before performing the relevant optimizations, I can understand that this would
require a significant re-design - but the re-design takes the form of saving
information already collected, not of collecting additional information.

> > > So, if any union types with a common initial sequence are declared
> > > anywhere in a program, then their member types alias.
> > 
> > As I understand it, the visibility rule was added specifically for
> > the purpose of NOT requiring that the entire program be covered by
> > this exception.
> 
> I don't think so.  As I read it, it was a way of declaring to the
> compiler that the types are intended to alias.

By "the visibility rule", I mean, very specifically, the phrase "anywhere that
a declaration of the completed type of the union is visible". If the intent had
been to disable aliasing throughout the entire program, that intent could have
been expressed by simply removing those words entirely; if there was any doubt
that people would understand the absence of those words correctly, then they
could have been replaced with the phrase "anywhere, regardless of whether or
not the completed type of the union was visible". I don't see any plausible
reason for the committee to write "anywhere that a declaration of the completed
type of the union is visible", unless that phrase was intended to restrict
applicability of the special guarantee.

> > Knowledgeable people writing code intended to take advantage of this
> > feature of C are likely to carefully place completed declarations of
> > the union's type so they disable those optimizations only where they
> > need to be disabled, and to minimize the amount of code where this
> > exception would unnecessarily disable useful optimizations.
> 
> Perhaps so, yes, but in practice it'd be pretty hard to do that.
> Functions can only be defined in the other scope, and there's no way
> to undefine a union type. 

True, but failing to define the union type is quite trivial. If I were writing
code that used both struct types, but not the union type, and did nothing that
relied upon the fact that they can alias each other, I would simply not
#include the header that defines the completed union type, #including only the
header that defines the struct types. If I needed to put such code in the same
translation unit as code which actually needs the union declaration, I would
put the code that doesn't need it before the #include, and put the code that
does need it after the #include - but that would probably be more trouble than
it's worth.
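
A minimal sketch of that single-translation-unit layout, assuming hypothetical
types circle, square and shape (in practice the struct and union declarations
would come from two separate headers):

```c
/* Hypothetical struct types that would normally come from one header. */
struct circle { int kind; double radius; };
struct square { int kind; double side; };

/* Written before the union is declared: under the reading described above,
   the special guarantee is not in effect for this function. */
int set_kinds(struct circle *c, struct square *s)
{
    c->kind = 1;
    s->kind = 2;
    return c->kind;
}

/* This declaration would normally come from a second header, #included
   only at this point in the translation unit. */
union shape { struct circle c; struct square s; };

/* Written after the completed union type is visible. */
int kind_of(const union shape *sh)
{
    return sh->c.kind;   /* kind is in the common initial sequence */
}
```

The sketch only shows where the declarations sit relative to the code; it
makes no claim about what any particular compiler does with set_kinds().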

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #50 from Andrew Haley  ---
(In reply to Andrew Haley from comment #49)
> 
> Perhaps so, yes, but in practice it'd be pretty hard to do that.
> Functions can only be defined in the other scope,

Should be "the outer scope"

> and there's no way
> to undefine a union type.  I guess you could be clever and put all of
> the functions which needed to know about the aliasing at the end of a
> translation unit.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #49 from Andrew Haley  ---
(In reply to James Kuyper Jr. from comment #46)
> (In reply to Andrew Haley from comment #42)
> ...
> > In order to use type-based alias analysis in any LTO framework it's
> > necessary to save type information, and this is just more type
> > information. ...
> 
> > ... The question is, I suppose, how to handle the scopes of
> > union declarations.  I'd just treat them as being global, which in
> > practice isn't unrealistic because such declarations are in header
> > files in global scope and shared anyway.
> 
> Why not use the actual scope of the completed union declaration,
> which is what the relevant rule refers to?

The principle of type-based alias analysis is that all you know about
two types is their types, not the location of any code that uses them.
There are no scopes.  The oracle, given only the types, has to say
whether they alias or not, regardless of where those types are used in
a program.  The location isn't an input to the oracle.

Bear in mind that inlining and other kinds of code motion happen, and
code is often evaluated "outside" the scopes in which it was written
and in a completely different order.  That's all perfectly normal
optimization.

Besides, when the alias oracle is consulted, all that scope stuff has
gone.  It's only relevant to the front end.  

> > So, if any union types with a common initial sequence are declared
> > anywhere in a program, then their member types alias.
> 
> As I understand it, the visibility rule was added specifically for
> the purpose of NOT requiring that the entire program be covered by
> this exception.

I don't think so.  As I read it, it was a way of declaring to the
compiler that the types are intended to alias.

> Knowledgeable people writing code intended to take advantage of this
> feature of C are likely to carefully place completed declarations of
> the union's type so they disable those optimizations only where they
> need to be disabled, and to minimize the amount of code where this
> exception would unnecessarily disable useful optimizations.

Perhaps so, yes, but in practice it'd be pretty hard to do that.
Functions can only be defined in the other scope, and there's no way
to undefine a union type.  I guess you could be clever and put all of
the functions which needed to know about the aliasing at the end of a
translation unit.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread jameskuyper at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #48 from James Kuyper Jr.  ---
(In reply to Davin McCall from comment #44)
> > Well, perhaps not, but this is the language specification.
> 
> The "one special guarantee" clause appears in the section describing union
> member access via the "." or "->" operators, implying that it only applies
> to the access of union members via the union. ...

I find nothing objectionable about that statement - it is indeed impossible to
create code which relies upon the special guarantee in 6.5.2.3p6 without
accessing the union members via the '.' or '->' operators. However, I believe
that you mean something more restricted than what you're actually saying,
because the code given in the original bug report does in fact access the union
members via '.' operator, in the expressions  and , to create a
situation where, as I understand it, that special guarantee is fully
applicable.
Could you expand on your description of what you think is required, to make it
clear why it doesn't apply in this case?

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #47 from Andrew Haley  ---
(In reply to Richard Biener from comment #43)
> (In reply to Andrew Haley from comment #42)
> > 
> > So, if any union types with a common initial sequence are declared
> > anywhere in a program, then their member types alias.  Alternatively,
> > a tighter implementation might restrict such declarations to a
> > compilation unit, in which case the alias oracle would scan only the
> > union types declared in that unit.
> 
> So for the middle-end the easiest thing would be if the FE would comply
> to its existing semantics and for the initial sequences generate a
> transparent struct.  Thus,
> 
> union {
>  struct A { int i; float f; double z; } a;
>  struct B { int i; float f; void *p; } b;
> };
> 
> would cause the FE to "implement" struct A and B like
> 
>  struct __init_seq1 { int i; float f; };
>  struct A { struct __init_seq1 _transp_memb1; double z; } a;
>  struct B { struct __init_seq1 _transp_memb2; void *p; } b;
> 
> then everything would work as expected from an aliasing point of view.
> The difficulty is probably that argument passing of A and B might
> change depending on how the ABIs are defined and how the backend handles
> those wrapping structs.

Nice.  I've got to admit that's a clever idea, but it's also a very
big gotcha.

> But as you can clearly see the above would be also a way for the user
> to get what the clause permits without the clause being present.  So
> I'm not sure why this clause was added.

That's somewhat explained by N685, which does contain the rationale.
In short: the proposal before N685 was to allow *every* pair of pointers
to structures with a common initial sequence to alias.  The revised
version (which was accepted) restricts this to structures with a
common initial sequence where a union of these structures is visible
to the compiler.

> language specifications have defects ...

Yabbut, N685 was accepted and the proposal does explain why.  Maybe it
shouldn't have been done that way, but it was done, and it was done
deliberately, as far as I can see.

> > > When I read the language text then a union declaration in between
> > > two accesses will change the semantic of the second?
> > 
> > Not necessarily.  It would be correct to collect all union
> > declarations at the end of parsing and then use those to feed the
> > alias oracle.  There's no actual need to restrict their scope.  Sure,
> > it would lead to GCC being somewhat over-cautious, but that's OK.
> 
> given the TBAA oracle is filled on-demand it is important that both
> outcomes are allowed. 

Okay, I don't get this.  Why not simply say that if a union type with
the common initial sequence exists anywhere, it is as though it were
declared at the start of every TU?

> I still don't see how we can make it work easily in the middle-end.

I don't think I ever said it would be easy!  I am saying, though, that
it's not the end of TBAA as we know it, but a refinement in which a
front end can feed into the alias oracle sets of types that are known
to alias.

You can think of it as a declaration:

__alias__ {
  type_a, type_b, type_c
};

which is an additional input to the oracle.
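
A toy model of such an oracle is sketched below. It is not GCC's
implementation: the integer type ids, MAX_TYPES and the oracle_* names are
all assumptions made for illustration, and the alias sets are kept as a
simple union-find, which makes the relation transitive.

```c
#include <stdbool.h>

enum { MAX_TYPES = 16 };   /* hypothetical limit; type ids must be below it */

/* parent[t] links type id t towards the representative of its alias set. */
static int parent[MAX_TYPES];

void oracle_init(void)
{
    for (int t = 0; t < MAX_TYPES; t++)
        parent[t] = t;              /* every type starts in its own set */
}

static int find(int t)
{
    while (parent[t] != t)
        t = parent[t];
    return t;
}

/* Front-end input: "these two types are known to alias". */
void oracle_declare_alias(int a, int b)
{
    parent[find(a)] = find(b);
}

/* The query takes only the two types; no scope or location is passed in. */
bool oracle_may_alias(int a, int b)
{
    return find(a) == find(b);
}
```

For the answer to stay fixed for a given pair of types, every declaration
would have to be registered before the first query is made.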

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread jameskuyper at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #46 from James Kuyper Jr.  ---
(In reply to Andrew Haley from comment #42)
...
> In order to use type-based alias analysis in any LTO framework it's
> necessary to save type information, and this is just more type
> information. ...

Speaking from a developer's perspective rather than an implementor's
perspective, the implementation already needs to keep track of where the
union's completed definition is in scope, because it's only in those locations
where it would be permitted to define an object having the union's type. This
is just a different use for the same information; it shouldn't require storing
any additional information; nor should it require holding on to that
information for any longer than is already required.
As a matter of efficient implementation, rather than correctness, I think it
might be useful to store, for each struct type, a list of the union definitions
for which this might be an issue, a list that would not be needed for any other
reason. However, most of the time, that list would be empty - only when it's
not empty would the compiler need to review that list to determine which other
struct types might be permitted to alias this type, to a limited extent.

> ... The question is, I suppose, how to handle the scopes of
> union declarations.  I'd just treat them as being global, which in
> practice isn't unrealistic because such declarations are in header
> files in global scope and shared anyway.

Why not use the actual scope of the completed union declaration, which is what
the relevant rule refers to?

> So, if any union types with a common initial sequence are declared
> anywhere in a program, then their member types alias.

As I understand it, the visibility rule was added specifically for the purpose
of NOT requiring that the entire program be covered by this exception.
Knowledgeable people writing code intended to take advantage of this feature of
C are likely to carefully place completed declarations of the union's type so
they disable those optimizations only where they need to be disabled, and to
minimize the amount of code where this exception would unnecessarily disable
useful optimizations.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #45 from Andrew Haley  ---
(In reply to Davin McCall from comment #44)
> > Well, perhaps not, but this is the language specification.
> 
> The "one special guarantee" clause appears in the section describing union
> member access via the "." or "->" operators, implying that it only applies
> to the access of union members via the union.

I don't believe that's what is intended, or that you can make such a
conclusion based on the section in which the rule appears.  It applies
to other accesses too, as is (somewhat) made clear by the rationale in
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n685.htm:

The proposed solution is to require that a union declaration be visible
if aliases through a common initial sequence (like the above) are possible.
Therefore the following TU provides this kind of aliasing if desired:

union utag {
  struct tag1 { int m1; double d2; } st1;
  struct tag2 { int m1; char c2; } st2;
};

int similar_func(struct tag1 *pst2, struct tag2 *pst3) {
  pst2->m1 = 2;
  pst3->m1 = 0;   /* might be an alias for pst2->m1 */
  return pst2->m1;
}

I know this is non-normative and not even in the standard, but it does
explain what was intended.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread davmac at davmac dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #44 from Davin McCall  ---
> Well, perhaps not, but this is the language specification.

The "one special guarantee" clause appears in the section describing union
member access via the "." or "->" operators, implying that it only applies to
the access of union members via the union. As has been pointed out by others,
the guarantee is surely not meant to trump all other rules regarding access, so
this is a reasonable interpretation (since otherwise, it is totally unclear
when it does apply and what exactly "it is permitted" even means).

Note that without that clause, type punning structs via a union would
essentially be impossible (since layout is implementation defined or
unspecified). The "common initial sequence" requirement is the only part of the
standard which requires that structs with similar members have them laid out
in the same order and alignment. Since this only matters for type punning, it
again makes sense that this would be specified in the one section which
actually allows for type punning (even if only in a non-normative footnote) -
that is, union member access via a union. It's clear why it is needed for this,
but to extend that to any access of union members (including not via the union)
seems like a stretch. If that was intended, why isn't it specified in 6.5?
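
The layout point can be checked with a short C11 sketch. The types A, B and
ab and the function read_f_through_b are hypothetical; the offset checks
state what the common initial sequence rule implies in practice rather than
quoting the standard.

```c
#include <assert.h>
#include <stddef.h>

struct A { int i; float f; double z; };
struct B { int i; float f; void *p; };
union ab { struct A a; struct B b; };

/* Members of the common initial sequence { int i; float f; } are expected
   to sit at the same offsets in both structs. */
static_assert(offsetof(struct A, i) == offsetof(struct B, i), "i offsets differ");
static_assert(offsetof(struct A, f) == offsetof(struct B, f), "f offsets differ");

/* Store through one member, read through the other, always via the union. */
float read_f_through_b(float value)
{
    union ab u;
    u.a.i = 0;
    u.a.f = value;
    return u.b.f;
}
```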

The only thing that suggests an alternative interpretation to what I've
described above is the requirement that the declaration of the completed type
of the union be visible, which is redundant if the access must be via the union
type. However, interpreting this to mean that the "special guarantee" applies
globally is far more problematic than assuming that the requirement is just
redundant.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #43 from Richard Biener  ---
(In reply to Andrew Haley from comment #42)
> On 04/29/2018 05:42 PM, rguenther at suse dot de wrote:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> > 
> > --- Comment #41 from rguenther at suse dot de  ---
> > On April 29, 2018 1:51:58 PM GMT+02:00, "aph at gcc dot gnu.org"
> >  wrote:
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> >>
> >> --- Comment #40 from Andrew Haley  ---
> >> (In reply to rguent...@suse.de from comment #29)
> >>
> >>> Note I repeatedly said this part of the standard is just stupid.  It
> >>> makes most if not all type-based alias analysis useless.
> >>
> >> I don't think so.  It does mean that we'd have to feed all declared
> >> union types (or, at least the ones containing structs with common
> >> initial sequences) into the alias oracle.  While unpleasant, in that
> >> simply declaring a type without even declaring an object of that type
> >> changes code generation, it doesn't render all type-based alias
> >> analysis useless.
> > 
> > How do you handle this within the LTO framework?
> 
> In order to use type-based alias analysis in any LTO framework it's
> necessary to save type information, and this is just more type
> information.  The question is, I suppose, how to handle the scopes of
> union declarations.  I'd just treat them as being global, which in
> practice isn't unrealistic because such declarations are in header
> files in global scope and shared anyway.
> 
> So, if any union types with a common initial sequence are declared
> anywhere in a program, then their member types alias.  Alternatively,
> a tighter implementation might restrict such declarations to a
> compilation unit, in which case the alias oracle would scan only the
> union types declared in that unit.

So for the middle-end the easiest thing would be if the FE would comply
to its existing semantics and for the initial sequences generate a
transparent struct.  Thus,

union {
 struct A { int i; float f; double z; } a;
 struct B { int i; float f; void *p; } b;
};

would cause the FE to "implement" struct A and B like

 struct __init_seq1 { int i; float f; };
 struct A { struct __init_seq1 _transp_memb1; double z; } a;
 struct B { struct __init_seq1 _transp_memb2; void *p; } b;

then everything would work as expected from an aliasing point of view.
The difficulty is probably that argument passing of A and B might
change depending on how the ABIs are defined and how the backend handles
those wrapping structs.

But as you can clearly see the above would be also a way for the user
to get what the clause permits without the clause being present.  So
I'm not sure why this clause was added.
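
Written by hand, that user-level alternative might look like the sketch
below. The names head and set_both are hypothetical, and ordinary identifiers
are used in place of the reserved names in the example above.

```c
/* The shared initial sequence, written out as its own struct type. */
struct head { int i; float f; };

struct A { struct head h; double z; };
struct B { struct head h; void *p; };

union ab { struct A a; struct B b; };

/* Both parameters have the same type, so ordinary type-based rules already
   allow them to refer to the same object. */
int set_both(struct head *x, struct head *y)
{
    x->i = 1;
    y->i = 2;
    return x->i;
}
```

With a union ab object u, the call set_both(&u.a.h, &u.b.h) passes two
pointers of the same type to the same storage, so the second store is visible
through the first pointer.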

> >>> Which means I'll refuse any patches implementing it in a way that
> >>> affects default behavior.
> >>
> >> Maybe --pedantic or even --pedantic-aliasing?
> > 
> > Whatever you call it I doubt any working solution will fit nicely
> > into our existing TBAA framework.
> 
> Well, perhaps not, but this is the language specification.

Language specifications have defects ...

> > When I read the language text then a union declaration in between
> > two accesses will change the semantic of the second?
> 
> Not necessarily.  It would be correct to collect all union
> declarations at the end of parsing and then use those to feed the
> alias oracle.  There's no actual need to restrict their scope.  Sure,
> it would lead to GCC being somewhat over-cautious, but that's OK.

Given the TBAA oracle is filled on-demand it is important that both
outcomes are allowed.  I still don't see how we can make it work easily
in the middle-end.

For anyone wanting to make GCC comply I suggest the above sketched route
and start with looking how backends deal with this kind of wrapping in
their argument passing.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-30 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #42 from Andrew Haley  ---
On 04/29/2018 05:42 PM, rguenther at suse dot de wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> 
> --- Comment #41 from rguenther at suse dot de  ---
> On April 29, 2018 1:51:58 PM GMT+02:00, "aph at gcc dot gnu.org"
>  wrote:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
>>
>> --- Comment #40 from Andrew Haley  ---
>> (In reply to rguent...@suse.de from comment #29)
>>
>>> Note I repeatedly said this part of the standard is just stupid.  It
>>> makes most if not all type-based alias analysis useless.
>>
>> I don't think so.  It does mean that we'd have to feed all declared
>> union types (or, at least the ones containing structs with common
>> initial sequences) into the alias oracle.  While unpleasant, in that
>> simply declaring a type without even declaring an object of that type
>> changes code generation, it doesn't render all type-based alias
>> analysis useless.
> 
> How do you handle this within the LTO framework?

In order to use type-based alias analysis in any LTO framework it's
necessary to save type information, and this is just more type
information.  The question is, I suppose, how to handle the scopes of
union declarations.  I'd just treat them as being global, which in
practice isn't unrealistic because such declarations are in header
files in global scope and shared anyway.

So, if any union types with a common initial sequence are declared
anywhere in a program, then their member types alias.  Alternatively,
a tighter implementation might restrict such declarations to a
compilation unit, in which case the alias oracle would scan only the
union types declared in that unit.

>>> Which means I'll refuse any patches implementing it in a way that
>>> affects default behavior.
>>
>> Maybe --pedantic or even --pedantic-aliasing?
> 
> Whatever you call it I doubt any working solution will fit nicely
> into our existing TBAA framework.

Well, perhaps not, but this is the language specification.

> When I read the language text then a union declaration in between
> two accesses will change the semantic of the second?

Not necessarily.  It would be correct to collect all union
declarations at the end of parsing and then use those to feed the
alias oracle.  There's no actual need to restrict their scope.  Sure,
it would lead to GCC being somewhat over-cautious, but that's OK.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-29 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #41 from rguenther at suse dot de  ---
On April 29, 2018 1:51:58 PM GMT+02:00, "aph at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
>
>--- Comment #40 from Andrew Haley  ---
>(In reply to rguent...@suse.de from comment #29)
>
>> Note I repeatedly said this part of the standard is just stupid.  It
>> makes most if not all type-based alias analysis useless.
>
>I don't think so.  It does mean that we'd have to feed all declared
>union types (or, at least the ones containing structs with common
>initial sequences) into the alias oracle.  While unpleasant, in that
>simply declaring a type without even declaring an object of that type
>changes code generation, it doesn't render all type-based alias
>analysis useless.

How do you handle this within the LTO framework?

>> Which means I'll refuse any patches implementing it in a way that
>> affects default behavior.
>
>Maybe --pedantic or even --pedantic-aliasing?

Whatever you call it I doubt any working solution will fit nicely into our
existing TBAA framework. 

When I read the language text then a union declaration in between two accesses
will change the semantics of the second?

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-29 Thread aph at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #40 from Andrew Haley  ---
(In reply to rguent...@suse.de from comment #29)

> Note I repeatedly said this part of the standard is just stupid.  It makes
> most if not all type-based alias analysis useless.

I don't think so.  It does mean that we'd have to feed all declared
union types (or, at least the ones containing structs with common
initial sequences) into the alias oracle.  While unpleasant, in that
simply declaring a type without even declaring an object of that type
changes code generation, it doesn't render all type-based alias
analysis useless.

> Which means I'll refuse any patches implementing it in a way that affects
> default behavior.

Maybe --pedantic or even --pedantic-aliasing?

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-23 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #39 from joseph at codesourcery dot com  ---
On Thu, 19 Apr 2018, jameskuyper at verizon dot net wrote:

> Code which relies upon this feature to implement a C-style approximation to
> inheritance has been fairly common, which is precisely why the C committee
> decided to create this rule, to make sure such code had well-defined behavior.

To make sure such code had well-defined behavior *notwithstanding the 
adjacent rule (in C90) that access to a non-current union member was 
otherwise implementation-defined*.  Not overriding any other rule 
elsewhere in the standard that might make such accesses undefined, such as 
type-based aliasing, even though it's subsequently sometimes been 
interpreted in connection with such rules (and access to a non-current 
union member is now non-normatively specified in a footnote as type 
punning).

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-23 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #38 from rguenther at suse dot de  ---
On Fri, 20 Apr 2018, msebor at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> 
> --- Comment #35 from Martin Sebor  ---
> Here are the proposed changes:
> 
> Pointer Provenance:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2219.htm#proposed-technical-corrigendum

Mostly sound.  It seems to mirror what GCC does in points-to analysis,
also tracking pointers through integer (parts).  I wonder about
the Unary arithmetic operator restriction - consider 'long p' with single
provenance, then +p has empty provenance.  That's just odd.  I'd
have expected for example -(-p) to have the same provenance as p.
GCC just uses the same provenance as the operand for unary operators.
GCC also tracks provenance through floating-point values (yes! matlab
generated code passes pointers through two FP parameters...).

The clarification to 6.5.9#6 is most welcome since it matches what
GCC expects (and GCC doesn't implement -fno-provenance - well, I guess
you could use -fno[-ipa]-pta).
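
To make the "pointers through integers" case concrete, here is a minimal
sketch.  The function name read_via_integer is made up for illustration, and
the code assumes the implementation provides uintptr_t.  It only shows the
kind of round trip for which provenance has to be carried through an integer
value; it is not taken from N2219 or from GCC's sources.

```c
#include <stdint.h>

/* Round-trip a pointer through an integer and dereference the result.
   For the load to be recognised as reading *p, the provenance of p has
   to survive the conversion to uintptr_t and back. */
int read_via_integer(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;  /* pointer -> integer */
    int *q = (int *)(void *)bits;           /* integer -> pointer */
    return *q;
}
```

Calling read_via_integer(&x) is expected to yield the value of x; the open
question in the proposal is which arithmetic on 'bits' keeps that guarantee.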


> Trap Representations:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2220.htm#proposed-technical-corrigendum

Not sure what the proposed change to 6.3.2.1p2 means (if there's similar
wording when actually "reading" an uninitialized value elsewhere).  Iff
that makes reads from uninitialized memory well-defined I miss the
definition of said well-defined behavior.

In any case removing the dependence on address-taken or not is good.
Instead of making it explicitly defined maybe remove this sentence.

For 6.3.1.2 I'd say 'the behavior is undefined' instead of
'an unspecified value'.  What's the reason to be not specific here
when one doesn't want 'undefined' behavior?  In particular in the
light of 3.19.3 if the _Bool is a trap representation then this
looks suspicious.

So the idea is to make uninitialized reads return unspecified values
but not invoke undefined behavior.  Offhand I don't know a place
where GCC would take advantage of the difference.  I think for both
we can derive 1 == uninitialized to be true or false statically as we like
but we cannot infer a path to be not taken when it would access an
uninitialized location (I think we don't do that at the moment).

The change doesn't require us to compute x == x to true if x is
uninitialized?

> Unspecified Values:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2221.htm#proposed-technical-corrigendum

So for 3.4 GCC implements a more strict handling for example treating
a *  as not undefined given there's a value of a (zero)
where the expression is fully defined.  So with 3.4 '0 + '
invokes undefined behavior?  That seems to contradict the proposed
wording - there isn't any value of  where the add invokes
undefined behavior.  More to the point why do they introduce undefined
behavior here rather than an unspecified value?  This means that
 on its own is fine to use but computing
 +  is not.

And in 3.5 they go on and specify '0 * ' as unspecified?!

So I think they go a bit too far here on several accounts.
I'd like to keep reading from uninitialized memory invoking undefined
behavior - if it's just unspecified how are sanitizers supposed to
handle this given the compiler is now free to schedule such reads to
places where they might not have been executed before.



The changes do not seem to cover effective types and aliasing
as far as I can see (or I missed a non-linked proposed corrigendum).
N doesn't seem to have any.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-21 Thread david at westcontrol dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #37 from David Brown  ---
(In reply to Martin Sebor from comment #35)
> Here are the proposed changes:
> 
> Pointer Provenance:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2219.htm#proposed-technical-corrigendum
> 
> Trap Representations:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2220.htm#proposed-technical-corrigendum
> 
> Unspecified Values:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2221.htm#proposed-technical-corrigendum

I am a little unsure of the suggestions for unspecified values here.  Can I
give some examples, to see if my interpretation is correct?

Let's use a new gcc builtin "__builtin_unspecified()" that returns an
unspecified value of int type, with no possible traps (no gcc target has trap
representations for int types, AFAIK).

int x = __builtin_unspecified();
int y = __builtin_unspecified();

if (x == y) doThis();  // The compiler can skip doThis()
if (x != y) doThat();  // The compiler can skip doThat() too

if (x == y) doThis(); else doThat();  
// The compiler can choose to doThis() or doThat(),
// but must do one or the other

if (x == x) doThis();  // The compiler must doThis()
if (x != x) doThat();  // The compiler cannot doThat()

if (x == 3) doThis();  // The compiler can choose to doThis()
// if the compiler does choose to doThis() then it fixes the value of x as 3

if (x & 0x01) doThis(); else doThat();
// The compiler can choose to doThis() or doThat(),
// but that choice fixes the LSB of x

This could allow for a range of possible optimisations, especially if there is
a nice way to make unspecified values like __builtin_unspecified(). 
(Unspecified values of other types could be made by casts.)  For example:

struct opt_int { bool valid; int value; };
struct opt_int safe_sqrt(struct opt_int x) {
    struct opt_int y;
    if (!x.valid || x.value < 0) {
        y.valid = false;
        y.value = __builtin_unspecified();
    } else {
        y.valid = true;
        y.value = unsafe_sqrt(x.value);
    }
    return y;
}

This kind of structure would mean minimal effort when you only need part of a
struct to contain specified values.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-21 Thread david at westcontrol dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #36 from David Brown  ---
(In reply to Martin Sebor from comment #34)

> I think in the use case below:
> 
>    struct { int i; char buf[4]; } s, r;
>    *(float *)s.buf = 1.;
>    r = s;
> 
> the aggregate copy has to be viewed as a recursive copy of each of its
> members and copying buf[4] must be viewed as a memcpy.  Char is definitely
> special (it can access anything with impunity, even indeterminate values).
> That said, I don't think the rules allow char arrays to be treated as
> allocated storage so while the store to s.buf via float* may be valid it
> doesn't change the effective type of s.buf and so the only way to read the
> float value stored in it is to copy it byte-by-byte (i.e., copy the float
> representation) to an object whose effective type is float.  Some of the
> papers that deal with the effective type rules might touch on this (e.g., DR
> 236, Clark's N1520

In bare metal embedded development, it is common to need a way to treat
static declared storage (like a char[] array) as a pool for dynamic storage. 
Often you don't want to use standard library malloc() because of requirements
on deterministic timing, etc.  What you are saying here is that this is not
possible - meaning there is no way to write such malloc replacement in normal C
code.  (It is possible, I think, to use gcc extensions such as the "may_alias"
type attribute and the "malloc" function attribute.  And -fno-strict-aliasing is
always a safe resort.)  It would be /very/ nice if there were a way to declare
statically allocated pools of memory that could be doled out by user-made
functions and - like malloc'ed memory - take their effective type when used.

It would be even better if there were a standard way to say that the initial
value of such memory is "unspecified".  The compiler and linker could give such
memory a static allocation (essential for small embedded systems with limited
memory, so that you can be sure of your memory usage) but there would be no
need for zeroing the memory at startup.
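
As a rough sketch of the kind of malloc replacement described above: the names
pool_alloc and POOL_SIZE are hypothetical, the allocator is a bare bump
allocator with no free(), and the "malloc" function attribute is the GCC
extension mentioned earlier, applied only when __GNUC__ is defined.  Whether
objects later stored into this statically declared array acquire an effective
type the way malloc'ed storage does is exactly the point in dispute, so this
illustrates the use case and does not claim the result is strictly conforming.

```c
#include <stdalign.h>
#include <stddef.h>

#define POOL_SIZE 1024u   /* hypothetical pool size, in bytes */

/* Statically allocated backing store, aligned for any object type. */
static alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used;

#if defined(__GNUC__)
__attribute__((malloc))   /* GCC extension: result aliases no other object */
#endif
void *pool_alloc(size_t n)
{
    const size_t a = alignof(max_align_t);
    size_t rounded;

    if (n == 0 || n > POOL_SIZE)
        return NULL;
    rounded = (n + a - 1) / a * a;        /* round up to the alignment */
    if (rounded > POOL_SIZE - pool_used)
        return NULL;                      /* pool exhausted */
    void *p = &pool[pool_used];
    pool_used += rounded;
    return p;
}
```

Each successful call hands out a distinct, suitably aligned chunk in constant
time, which is the deterministic behaviour the embedded use case asks for.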

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-20 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #35 from Martin Sebor  ---
Here are the proposed changes:

Pointer Provenance:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2219.htm#proposed-technical-corrigendum

Trap Representations:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2220.htm#proposed-technical-corrigendum

Unspecified Values:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2221.htm#proposed-technical-corrigendum

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-20 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #34 from Martin Sebor  ---
The questions in N2223 and the other documents are there to provide background
and justification for the proposed changes (the questions come from surveys they
sent to various forums).  The proposed words are at the end of each of the
papers referenced from N2223.  I don't have the sense that N2223 covers this
case but it's closely related.

Memcpy and memmove transfer the effective type only to objects with no declared
type (i.e., allocated objects):

  int i = 123;
  void *p = malloc (sizeof i);
  memcpy (p, &i, sizeof i);   // *p's effective type is now int

The standard mentions just memcpy, memmove, and copies via a character type,
so other mechanisms do not transfer the effective type.  (The effective type of
other (typed) storage is that of its declared type.)  Memory is only allowed to
be accessed via an lvalue compatible with its effective type (or char), so
above, what's at p can only be accessed as *(int*)p.

I think in the use case below:

   struct { int i; char buf[4]; } s, r;
   *(float *)s.buf = 1.;
   r = s;

the aggregate copy has to be viewed as a recursive copy of each of its members
and copying buf[4] must be viewed as a memcpy.  Char is definitely special (it
can access anything with impunity, even indeterminate values).  That said, I
don't think the rules allow char arrays to be treated as allocated storage so
while the store to s.buf via float* may be valid it doesn't change the
effective type of s.buf and so the only way to read the float value stored in
it is to copy it byte-by-byte (i.e., copy the float representation) to an
object whose effective type is float.  Some of the papers that deal with the
effective type rules might touch on this (e.g., DR 236, Clark's N1520
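
The byte-by-byte route described above can be sketched as follows.  The struct
tag box and the helper names box_store_float and box_load_float are invented
for illustration, and the buffer is sized with sizeof(float) instead of the
literal 4 so the sketch does not assume a particular float size.

```c
#include <string.h>

struct box { int i; char buf[sizeof(float)]; };

/* Store the representation of v into the char array.  No float lvalue
   ever designates b->buf, so its effective type is not an issue. */
void box_store_float(struct box *b, float v)
{
    memcpy(b->buf, &v, sizeof v);
}

/* Copy the bytes back into an object whose declared type is float. */
float box_load_float(const struct box *b)
{
    float v;
    memcpy(&v, b->buf, sizeof v);
    return v;
}
```

With these helpers an aggregate copy such as 'r = s;' only ever moves chars,
and the float is recovered from r with box_load_float(&r).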

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #33 from Richard Biener  ---
(In reply to Martin Sebor from comment #30)
> Richard, I offered to write a proposal (with Clark) to improve the rules. 
> With the object model proposals already in the pipeline (N2223) this is a
> good time to review them and see if it makes sense to extend or change them
> to also cover this case in an acceptable way.  It would be helpful if you
> could take some time to summarize your main concerns or suggestions for
> changes in this area.  I can start working on the proposal for the fall 2018
> WG14 meeting.

That's a load of information dangling from
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2223.htm (if I googled that
correctly).  It seems to be mostly
questions and observations plus opinions being raised, sometimes obvious cases
missing and "recent" (> GCC 4.5) GCC behavior not correctly reflected.

I'm not sure in which form I can provide useful feedback here, just to mention
a few misconceptions the authors have from browsing part of the material:

- they say memcpy transfers the effective type, in other parts of the standard
  it says that memcpy accesses use a character type.  That raises the question
  whether a memcpy implementation using a sequence of char reads/writes matches
  this semantics and thus what the effective type of the memory location
  written to via a character type is -- is it allowed to access that via any
  other effective type (or just the effective type of the source if that is
  somehow visible to the compiler)?

- when asking for a char[] array to be treated the same as allocated storage
  with respect to changing its effective type (declared objects have a fixed
  effective type) they fail to consider the issue that we recently ran into
  with C++:
   struct { int i; char buf[4]; } s, r;
   *(float *)s.buf = 1.;
   r = s;
  how does the aggregate copying work?  The effective type of the access
  is not compatible with the emplaced float which means it doesn't work
  in GCC.  Same if you replace all of the above with allocated storage.
  Does that mean a structure with char members is somehow special?

So overall there's too much material to go over - what's the exact standard
wording changes suggested?  Maybe I just didn't find them ...

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #32 from Richard Biener  ---
(In reply to James Kuyper Jr. from comment #31)
> (In reply to rguent...@suse.de from comment #29)
> > On Thu, 19 Apr 2018, jameskuyper at verizon dot net wrote:
> ...
> > > The relevant wording is "anywhere that a declaration of the completed
> > > type of the union is visible.", so it's unambiguously the type, not the
> > > object, which must be visible. A declaration of the completed type can be
> > > visible without any objects of that type being visible, and that's
> > > sufficient for this rule to apply.
> > > 
> > > > ...  Must the access be performed
> > > > using the union object, or just the union type, or neither?
> > > 
> > > It says only that "it is permitted to inspect the common initial part"; it
> > > imposes no restrictions on how the inspection may be performed. Clearly,
> > > inspection through an lvalue of the union's type must be permitted, but
> > > it is also permitted to use the more indirect methods which are the
> > > subject of this bug report, simply because the standard says nothing to
> > > restrict the permission it grants to the more direct cases.
> > 
> > Note I repeatedly said this part of the standard is just stupid.
> 
> As a judgement call, I reserve the right to disagree with you on that point,
> particularly if that judgement was based primarily on the following
> misconception:
> 
> > ...  It makes
> > most if not all type-based alias analysis useless.
> 
> How could that be true? It only applies to pairs of struct types that are
> the types of members of the same union, it only applies within the scope of
> a completed definition of that union's type, and it doesn't apply if the
> implementation can prove to itself that the two objects in question are not
> actually members of the same union object. It seems to me that the need to
> take this rule into consideration would come up pretty infrequently.

Elsewhere you said that "inspect the common initial part" needs to allow
accesses that do not make the union object visible.  So consider

struct A { int i; float f; };
struct B { int i; double e; };
union { struct A a; struct B b; };

struct B b;
struct A a;

int foo1 (struct A *p)
{
  return p->i; // GCC considers this to not alias 'b', but only 'a'
}

int foo2 (int *p)
{
  return *p; // GCC considers this to alias 'b' and 'a'
}

float foo3 (float *p)
{
  return *p; // GCC considers this to alias only 'a'
}

I guess foo3 is not allowed since 6.5.2.3p6 doesn't allow any type punning
since it only covers the common initial sequence.  We are too strict with
foo1 I guess; with our current implementation I can't see how to make that
work without making *p alias 'b' as well.

Now consider

struct C { int i; };
union { struct A a; struct C c; };

struct A a;
struct C c;

void foo (struct C *p)
{
  *p = c;
}

do we need to consider this to assign to 'a'?  As far as I read the
standard yes.  We don't handle that, and as I said above this would
mean C and A to have the same alias-set which means now float *
also aliases objects of type B and double * objects of type A.
It doesn't make float * alias double * of course, so it's not all
bets are off but it means that for a program using C++-like
inheritance via unions (like GCC for example does) you mostly
give up on TBAA.
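
For readers less familiar with what is being given up, here is a minimal
sketch of the kind of assumption TBAA licenses.  The function name
store_then_load is hypothetical and the example is not taken from GCC.

```c
/* Under -fstrict-aliasing the compiler may assume that the store through
   pf cannot modify *pi, because int and float are in different alias
   sets.  The final load can therefore be folded to the constant 1. */
int store_then_load(int *pi, float *pf)
{
    *pi = 1;
    *pf = 2.0f;
    return *pi;
}
```

If two struct types had to share an alias set because they appear in the same
union, the analogous folding for their members would no longer be allowed.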

> Code which relies upon this feature to implement a C-style approximation to
> inheritance has been fairly common, which is precisely why the C committee
> decided to create this rule, to make sure such code had well-defined
> behavior.
> 
> > Which means I'll refuse any patches implementing it in a way that affects
> > default behavior.  A clean patch (I really can't think of any clean 
> > approach besides forcing -fno-strict-aliasing!) with some extra flag
> > (well, just use -fno-strict-aliasing ...) would be fine with me.
> 
> I can understand not making this the default behavior if you feel that
> way; I only use gcc in fully standard-conforming mode, anyway, so that
> doesn't matter to me. However, personally, I would prefer it if gcc's
> fully-conforming mode took full advantage of all the optimization
> opportunities legitimately enabled by 6.5p7 (which does not include
> opportunities revoked by 6.5.2.3p6).

As the examples above show, and probably because of the limited TBAA
implementation in GCC, honoring 6.5.2.3p6 comes at a cost.  I'm not
sure why 6.5.2.3p6 is necessary given 6.5.2.3p3 points to footnote
95, which allows the inspection under the type-punning umbrella
if the access is made through the union type.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-19 Thread jameskuyper at verizon dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #31 from James Kuyper Jr.  ---
(In reply to rguent...@suse.de from comment #29)
> On Thu, 19 Apr 2018, jameskuyper at verizon dot net wrote:
...
> > The relevant wording is "anywhere that a declaration of the completed type
> > of the union is visible.", so it's unambiguously the type, not the object,
> > which must be visible. A declaration of the completed type can be visible
> > without any objects of that type being visible, and that's sufficient for
> > this rule to apply.
> > 
> > > ...  Must the access be performed
> > > using the union object, or just the union type, or neither?
> > 
> > It says only that "it is permitted to inspect the common initial part"; it
> > imposes no restrictions on how the inspection may be performed. Clearly,
> > inspection through an lvalue of the union's type must be permitted, but it
> > is also permitted to use the more indirect methods which are the subject of
> > this bug report, simply because the standard says nothing to restrict the
> > permission it grants to the more direct cases.
> 
> Note I repeatedly said this part of the standard is just stupid.

As a judgement call, I reserve the right to disagree with you on that point,
particularly if that judgement was based primarily on the following
misconception:

> ...  It makes
> most if not all type-based alias analysis useless.

How could that be true? It only applies to pairs of struct types that are the
types of members of the same union, it only applies within the scope of a
completed definition of that union's type, and it doesn't apply if the
implementation can prove to itself that the two objects in question are not
actually members of the same union object. It seems to me that the need to take
this rule into consideration would come up pretty infrequently.

Code which relies upon this feature to implement a C-style approximation to
inheritance has been fairly common, which is precisely why the C committee
decided to create this rule, to make sure such code had well-defined behavior.

> Which means I'll refuse any patches implementing it in a way that affects
> default behavior.  A clean patch (I really can't think of any clean 
> approach besides forcing -fno-strict-aliasing!) with some extra flag
> (well, just use -fno-strict-aliasing ...) would be fine with me.

I can understand not making this the default behavior if you feel that way;
I only use gcc in fully standard-conforming mode, anyway, so that doesn't
matter to me. However, personally, I would prefer it if gcc's fully-conforming
mode took full advantage of all the optimization opportunities legitimately
enabled by 6.5p7 (which does not include opportunities revoked by 6.5.2.3p6).

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-19 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #30 from Martin Sebor  ---
Richard, I offered to write a proposal (with Clark) to improve the rules.  With
the object model proposals already in the pipeline (N2223) this is a good time
to review them and see if it makes sense to extend or change them to also cover
this case in an acceptable way.  It would be helpful if you could take some
time to summarize your main concerns or suggestions for changes in this area. 
I can start working on the proposal for the fall 2018 WG14 meeting.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-19 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #29 from rguenther at suse dot de  ---
On Thu, 19 Apr 2018, jameskuyper at verizon dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> 
> James Kuyper Jr.  changed:
> 
>            What    |Removed                     |Added
>                  CC|                            |jameskuyper at verizon dot net
> 
> --- Comment #28 from James Kuyper Jr.  ---
> (In reply to Martin Sebor from comment #17)
> > The C Union Visibility rule was intended to cover that case.  The trouble is
> > that the rule tends to be interpreted differently by different people, users
> > and implementers alike: Is it the union object that must be visible at the
> > point of the access, or just the union type?
> 
> The relevant wording is "anywhere that a declaration of the completed type of
> the union is visible.", so it's unambiguously the type, not the object, which
> must be visible. A declaration of the completed type can be visible without
> any objects of that type being visible, and that's sufficient for this rule
> to apply.
> 
> > ...  Must the access be performed
> > using the union object, or just the union type, or neither?
> 
> It says only that "it is permitted to inspect the common initial part"; it
> imposes no restrictions on how the inspection may be performed. Clearly,
> inspection through an lvalue of the union's type must be permitted, but it is
> also permitted to use the more indirect methods which are the subject of this
> bug report, simply because the standard says nothing to restrict the
> permission it grants to the more direct cases.

Note I repeatedly said this part of the standard is just stupid.  It makes
most if not all type-based alias analysis useless.

Which means I'll refuse any patches implementing it in a way that affects
default behavior.  A clean patch (I really can't think of any clean 
approach besides forcing -fno-strict-aliasing!) with some extra flag
(well, just use -fno-strict-aliasing ...) would be fine with me.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2018-04-18 Thread jameskuyper at verizon dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

James Kuyper Jr.  changed:

           What    |Removed                     |Added
                 CC|                            |jameskuyper at verizon dot net

--- Comment #28 from James Kuyper Jr.  ---
(In reply to Martin Sebor from comment #17)
> The C Union Visibility rule was intended to cover that case.  The trouble is
> that the rule tends to be interpreted differently by different people, users
> and implementers alike: Is it the union object that must be visible at the
> point of the access, or just the union type?

The relevant wording is "anywhere that a declaration of the completed type of
the union is visible.", so it's unambiguously the type, not the object, which
must be visible. A declaration of the completed type can be visible without any
objects of that type being visible, and that's sufficient for this rule to
apply.

> ...  Must the access be performed
> using the union object, or just the union type, or neither?

It says only that "it is permitted to inspect the common initial part"; it
imposes no restrictions on how the inspection may be performed. Clearly,
inspection through an lvalue of the union's type must be permitted, but it is
also permitted to use the more indirect methods which are the subject of this
bug report, simply because the standard says nothing to restrict the permission
it grants to the more direct cases.
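
For reference, the uncontroversial form of the idiom, where the common initial
part is inspected through an lvalue of the union's type, might look like the
sketch below.  All type and function names (shape, circle, rect,
shape_kind_of, shape_width) are invented for illustration.

```c
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

/* Both structs start with the same member, so they share a common
   initial sequence. */
struct circle { enum shape_kind kind; double radius; };
struct rect   { enum shape_kind kind; double width, height; };

union shape { struct circle circle; struct rect rect; };

/* Inspect the common initial part via the union, whichever member
   was stored last. */
enum shape_kind shape_kind_of(const union shape *s)
{
    return s->circle.kind;
}

double shape_width(const union shape *s)
{
    switch (shape_kind_of(s)) {
    case SHAPE_CIRCLE: return 2.0 * s->circle.radius;
    case SHAPE_RECT:   return s->rect.width;
    }
    return 0.0;   /* unknown kind */
}
```

The bug report concerns the less direct variant, in which functions receive
'struct circle *' or 'struct rect *' and never mention the union at all.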

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2017-09-27 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #27 from rguenther at suse dot de  ---
On Wed, 27 Sep 2017, david at westcontrol dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> 
> David Brown  changed:
> 
>            What    |Removed                     |Added
>                  CC|                            |david at westcontrol dot com
> 
> --- Comment #26 from David Brown  ---
> (In reply to rguent...@suse.de from comment #24)
> > On Wed, 2 Nov 2016, txr at alumni dot caltech.edu wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> > > 
> > > --- Comment #22 from Tim Rentsch  ---
> > > [responding to comments from rguent...@suse.de in Comment 20]
> > > 
> > > > GCC already implements this if you specify -fno-strict-aliasing.
> > > 
> > > The main point of my comments is that the ISO C standard requires
> > > the behavior in this case (and similar cases) be defined and not
> > > subject to any reordering.  In other words the result must be the
> > > same as an unoptimized version.  If a -fstrict-aliasing gcc /does/
> > > transform the code so that the behavior is not the same as an
> > > unoptimized version, then gcc is not a conforming implementation.
> > 
> > GCC has various optimization options that make it a not strictly
> > conforming implementation (-ffast-math for example), various
> > GNU extensions to the language, etc.
> > 
> > > Or is it your position that gcc is conforming only when operated
> > > in the -fno-strict-aliasing mode?  That position seems contrary to
> > > the documented description of the -fstrict-aliasing option.
> > 
> > Well, N685 is still disputed in this bug.  I was just pointing out
> > that GCC has a switch to make it conforming to your interpretation
> > of the standard (and this switch is the default at -O0 and -O1).
> 
> A key difference with non-conformance options like -ffast-math is that these
> are not default options.  A user must actively choose to use them.  A user
> should not need particular options in order to get correct object code from
> their correct source code - or at least the user should get obvious error
> messages when using default options but where their source code hits an oddity
> in gcc (as they would get if they happened to use a gcc extension keyword like
> "asm" as an identifier in conforming C code).  What should not happen is for
> the compiler to silently break good code unless the user has given specific
> flags.
> 
> I am not sure whether this particular case really is a bug or not.  However, I
> wonder if there has been too much emphasis on trying to understand exactly
> what the standards say.  If the gcc developers here, who are amongst the most
> knowledgeable C and C++ experts around, have trouble with the details - then
> consider the position of the average C developer.  Maybe it is better to try
> to see it from their viewpoint - would a programmer expect these accesses to
> alias or not?  If it is likely that programmers would expect aliasing here,
> and see that behaviour in other compilers, then the /useful/ default behaviour
> for gcc would be to treat code in the way programmers expect - even with -O3.
> Then have a "-fI-know-what-I-am-doing" flag for those that want to squeeze out
> the last bit of performance.

Unfortunately it's not the "last bit of performance", otherwise it would
indeed be a no-brainer.

People expect fast code from a compiler and do not want to enable
dozens of -fIm-writing-reasonable-code.  Some benchmarks even have
rules as to how many options you are allowed to enable...

Richard.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2017-09-27 Thread david at westcontrol dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

David Brown  changed:

           What    |Removed                     |Added
                 CC|                            |david at westcontrol dot com

--- Comment #26 from David Brown  ---
(In reply to rguent...@suse.de from comment #24)
> On Wed, 2 Nov 2016, txr at alumni dot caltech.edu wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> > 
> > --- Comment #22 from Tim Rentsch  ---
> > [responding to comments from rguent...@suse.de in Comment 20]
> > 
> > > GCC already implements this if you specify -fno-strict-aliasing.
> > 
> > The main point of my comments is that the ISO C standard requires
> > the behavior in this case (and similar cases) be defined and not
> > subject to any reordering.  In other words the result must be the
> > same as an unoptimized version.  If a -fstrict-aliasing gcc /does/
> > transform the code so that the behavior is not the same as an
> > unoptimized version, then gcc is not a conforming implementation.
> 
> GCC has various optimization options that make it a not strictly
> conforming implementation (-ffast-math for example), various
> GNU extensions to the language, etc.
> 
> > Or is it your position that gcc is conforming only when operated
> > in the -fno-strict-aliasing mode?  That position seems contrary to
> > the documented description of the -fstrict-aliasing option.
> 
> Well, N685 is still disputed in this bug.  I was just pointing out
> that GCC has a switch to make it conforming to your interpretation
> of the standard (and this switch is the default at -O0 and -O1).

A key difference with non-conformance options like -ffast-math is that these
are not default options.  A user must actively choose to use them.  A user
should not need particular options in order to get correct object code from
their correct source code - or at least the user should get obvious error
messages when using default options but where their source code hits an oddity
in gcc (as they would get if they happened to use a gcc extension keyword like
"asm" as an identifier in conforming C code).  What should not happen is for
the compiler to silently break good code unless the user has given specific
flags.

I am not sure whether this particular case really is a bug or not.  However, I
wonder if there has been too much emphasis on trying to understand exactly what
the standards say.  If the gcc developers here, who are amongst the most
knowledgeable C and C++ experts around, have trouble with the details - then
consider the position of the average C developer.  Maybe it is better to try to
see it from their viewpoint - would a programmer expect these accesses to alias
or not?  If it is likely that programmers would expect aliasing here, and see
that behaviour in other compilers, then the /useful/ default behaviour for gcc
would be to treat code in the way programmers expect - even with -O3.  Then
have a "-fI-know-what-I-am-doing" flag for those that want to squeeze out the
last bit of performance.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2016-11-11 Thread davmac at davmac dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Davin McCall  changed:

           What    |Removed                     |Added
                 CC|                            |davmac at davmac dot org

--- Comment #25 from Davin McCall  ---
(In reply to Tim Rentsch from comment #21)
>
> Three:  The "one special guarantee" rule is independent of the
> rules for effective types.  This observation is obviously right
> because effective type rules matter only for access to objects.
> The only objects being accessed under the "one special guarantee"
> rule are guaranteed to have compatible types, which is always
> allowed by effective type rules.

However, the two structures are not compatible types; in a union as per the
example at the head of this PR -

union U {
struct t1 { int m; } s1;
struct t2 { int m; } s2;
};

- the types 'struct t1' and 'struct t2' are not compatible. This line:

p2->m = -p2->m;

- is accessing the active union member (s1) via an incompatible type. From
6.5.2.3:

"A postfix expression followed by the -> operator and an identifier designates
a member of a structure or union object. The value is that of the named member
of the object to which the first expression points"

This makes it IMO clear enough that the structure object is being accessed,
since you can't extract the member of an object without the object existing. If
that were not the case, then you could *always* use p2->m to access the 'm'
member of a 'struct t1' object regardless of the existence of a suitable union
declaration, unless you interpret the "special guarantee" just to mean that the
layout of a C.I.S. need only be identical between two structs if those structs
are part of the same union. In that case however there is no reason for
visibility of the union to matter at point of access, since the struct layout
surely cannot be different in different parts of the program (i.e. if two
structs are visible in a union anywhere and this forces common layout of the
C.I.S, then the C.I.S must have the same common layout throughout the program;
it doesn't make sense to require visibility of the union declaration to make
use of the "special guarantee").

> Four:  The "one special guarantee" rule is related to the area of
> "type punning" through unions, but seen by WG14 as a separate
> issue relative to the general topic.  This is evident from the
> committee response in DR 257.

It's problematic though that the committee response doesn't really follow from
the text. You cannot access the member of one structure via a pointer to
another structure with the same layout, as I have shown above, due to the
aliasing rules. If you (or WG14) are claiming that the "special guarantee" is
not directly concerned with aliasing, then the only way you could make use of
the rule is anyway via type punning, and the only way we can do that without
violating aliasing rules is to go via the union object, at which point the
question of union declaration visibility is moot.
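
To make "going via the union object" concrete, here is a minimal sketch.  It
is an illustration added to this thread, not code from the PR, and the names
negate_via_union and demo_via_union are made up.  It assumes the behaviour GCC
documents for -fstrict-aliasing, namely that type punning is allowed when the
memory is accessed through the union type.

```c
#include <assert.h>

struct t1 { int m; };
struct t2 { int m; };

union U {
    struct t1 s1;
    struct t2 s2;
};

/* Every access names the union object, so the compiler can see that
   u->s1.m and u->s2.m occupy the same storage. */
int negate_via_union(union U *u)
{
    if (u->s1.m < 0)
        u->s2.m = -u->s2.m;
    return u->s1.m;
}

/* Hypothetical driver: start with s1 active and holding -1. */
int demo_via_union(void)
{
    union U u = { .s1 = { -1 } };
    int n = negate_via_union(&u);
    assert(n == 1);
    return n;
}
```

Written this way, whether the declaration of union U is merely "visible" no
longer matters, because the union type takes part in each access expression.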

> Five:  The footnote added in C99 TC3 about type punning is seen
> by WG14 not as a change but just as a clarifying comment noting
> what behavior was intended all along.  This is evident from the
> text and response in DR 283.  Note that Clark Nelson, the author
> of this DR, is a long-standing member of WG14, and the suggested
> revision given in the text was adopted verbatim for the TC.

While I agree that this is what WG14 seem to believe, I see no normative part
of the text which supports the footnote, and I see some parts which contradict
it (such as 6.7.2.1 "The value of at most one of the members can be stored in a
union object at any time" / 6.5.2.3 "The value is that of the named member of
the object to which the first expression points" - if the value of only one
member can be stored, how can the value of any other member be defined?).

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2016-11-03 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #24 from rguenther at suse dot de  ---
On Wed, 2 Nov 2016, txr at alumni dot caltech.edu wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> 
> --- Comment #22 from Tim Rentsch  ---
> [responding to comments from rguent...@suse.de in Comment 20]
> 
> > GCC already implements this if you specify -fno-strict-aliasing.
> 
> The main point of my comments is that the ISO C standard requires
> the behavior in this case (and similar cases) be defined and not
> subject to any reordering.  In other words the result must be the
> same as an unoptimized version.  If a -fstrict-aliasing gcc /does/
> transform the code so that the behavior is not the same as an
> unoptimized version, then gcc is not a conforming implementation.

GCC has various optimization options that make it a not strictly
conforming implementation (-ffast-math for example), various
GNU extensions to the language, etc.

> Or is it your position that gcc is conforming only when operated
> in the -fno-strict-aliasing mode?  That position seems contrary to
> the documented description of the -fstrict-aliasing option.

Well, N685 is still disputed in this bug.  I was just pointing out
that GCC has a switch to make it conforming to your interpretation
of the standard (and this switch is the default at -O0 and -O1).

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2016-11-02 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #23 from joseph at codesourcery dot com  ---
On Wed, 2 Nov 2016, txr at alumni dot caltech.edu wrote:

> Seven:  Given that the question is now under serious debate, IMO
> someone involved with gcc development should take the initiative
> and responsibility to submit a defect report in order to clarify
> the issue.  Apparently other compilers don't have this problem -

I thought Martin was going to do that (comment#10).

The various DR responses in this area suffer from (a) only deciding 
particular limited cases at most rather than interpreting things more 
generally, and not being very clear about what they decide, and (b) by not 
looking at exactly what the special guarantee is meant to relate to, and 
the different ways that has been interpreted in the past, thereby 
compounding the confusion from that wording having been written and edited 
over time by people who interpreted it in different ways, probably each 
assuming all the other people had interpreted it the same way.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2016-11-02 Thread txr at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #22 from Tim Rentsch  ---
[responding to comments from rguent...@suse.de in Comment 20]

> GCC already implements this if you specify -fno-strict-aliasing.

The main point of my comments is that the ISO C standard requires
the behavior in this case (and similar cases) be defined and not
subject to any reordering.  In other words the result must be the
same as an unoptimized version.  If a -fstrict-aliasing gcc /does/
transform the code so that the behavior is not the same as an
unoptimized version, then gcc is not a conforming implementation.
Or is it your position that gcc is conforming only when operated
in the -fno-strict-aliasing mode?  That position seems contrary to
the documented description of the -fstrict-aliasing option.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2016-11-02 Thread txr at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #21 from Tim Rentsch  ---
[responding to comments from jos...@codesourcery.com in Comment 19]

>> Five:  The answer to the question is clearly No.  The example code
>> is very much on point to the "one special guarantee" clause, and
>> so the read access p1->m is permitted.  As the access is permitted,
>
> I maintain that, as I said in comment#9, the textual history
> indicates that the original intent of saying things are permitted
> here is *only* an exception to the general implementation-defined
> nature of type punning, not to any other reason why things might
> be undefined (such as aliasing rules, data races, etc.).

I went back and read through your earlier comments more carefully.
After that I also reviewed C90, N869, C99, N1124, N1256, DR 236,
DR 257, DR 283, and C11 (in the guise of the just-pre-C11 draft
N1570).

Let me say first that I agree with you that the Semantics section
of the member access operators (. and ->) needs at least some
revision and clarification.

Having said that, let me offer several more detailed responses
and/or comments.

One:  IME later versions of the C standard generally do a better
job of expressing what is intended than earlier versions do.

Two:  The "visible union" condition in C99 was viewed not as a
change to C90 but as correcting an oversight;  it was expected
all along that the union type would be in scope, even if the
expectation was not a conscious one originally.  I am sorry I
don't have a reference handy for this, but one can be found
digging around in the historical documents on the open-std.org
website.

Three:  The "one special guarantee" rule is independent of the
rules for effective types.  This observation is obviously right
because effective type rules matter only for access to objects.
The only objects being accessed under the "one special guarantee"
rule are guaranteed to have compatible types, which is always
allowed by effective type rules.

Four:  The "one special guarantee" rule is related to the area of
"type punning" through unions, but seen by WG14 as a separate
issue relative to the general topic.  This is evident from the
committee response in DR 257.

Five:  The footnote added in C99 TC3 about type punning is seen
by WG14 not as a change but just as a clarifying comment noting
what behavior was intended all along.  This is evident from the
text and response in DR 283.  Note that Clark Nelson, the author
of this DR, is a long-standing member of WG14, and the suggested
revision given in the text was adopted verbatim for the TC.

Six:  A key question here is What is the point or purpose of the
"one special guarantee" rule in the first place?  The Standard
doesn't say, but let me propose two likely motivations.

1. Normally objects may be assumed not to overlap unless
they are accessed through an explicit union membership
expression (or through a character type, etc).  The "one
special guarantee" rule identifies a case where an explicit
union membership expression is not needed.

2. The C standard distinctly allows any amount of padding
between consecutive members of a struct.  Without the "one
special guarantee" rule, there would be no way to be sure
that the offsets of the respective members would match in
all cases.  The "one special guarantee" rule has the effect
of forcing offsets of struct members in a common initial
sequence to be the same.  That is important for code
portability.

Seven:  Given that the question is now under serious debate, IMO
someone involved with gcc development should take the initiative
and responsibility to submit a defect report in order to clarify
the issue.  Apparently other compilers don't have this problem -
only gcc does.

Eight:  In the meantime, the most prudent course of action is to
fix gcc so that it does not reorder code in cases like the above.
Whenever there is any doubt, the only sensible choice is to err
on the side of caution, and not perform any code transformations
that might not be allowed in a conforming implementation.  (Of
course it would be okay to perform such transformations under
some non-default compiler option, as long as it is not in force
unless explicitly requested, and clearly flagged as possibly
non-conforming.)

Nine:  Doing a final review, I realized I have not yet responded
directly to your last comment.  I agree with your general
sentiment that the "one special guarantee" rule is not meant as
a "super rule" that trumps all other possible reasons for
undefined behavior.  However, I do not agree with your primary
point that it is meant to be limited to the "type punning" area.
The example I previously mentioned in the C standard, and the
committee discussion in DR 257, both show that there are other
factors involved here beyond just those related to type punning.
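
The layout half of point Six (matching offsets within a common initial
sequence) can be checked mechanically.  The following is only a sketch: the
struct names circle and label, the union shape, and the helper
common_offsets_match are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Two structs whose first two members form a common initial sequence. */
struct circle { int kind; double scale; double radius; };
struct label  { int kind; double scale; char text[16]; };

/* Both are members of one union, as the "one special guarantee" requires. */
union shape {
    struct circle c;
    struct label  l;
};

/* Compile-time check that the shared members sit at the same offsets. */
_Static_assert(offsetof(struct circle, kind) == offsetof(struct label, kind),
               "kind must have the same offset in both structs");
_Static_assert(offsetof(struct circle, scale) == offsetof(struct label, scale),
               "scale must have the same offset in both structs");

/* Run-time version of the same check; returns 1 when the offsets agree. */
int common_offsets_match(void)
{
    int same = offsetof(struct circle, kind) == offsetof(struct label, kind)
            && offsetof(struct circle, scale) == offsetof(struct label, scale);
    assert(same);
    return same;
}
```

This only demonstrates the layout property.  It says nothing about whether
an access through a pointer to one struct type may alias an object of the
other, which is the part under dispute in this bug.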

I hope the above has helped clarify the matter.  I look forward
to reading your responding 

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2016-11-01 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #20 from rguenther at suse dot de  ---
On November 1, 2016 7:16:06 PM GMT+01:00, "txr at alumni dot caltech.edu"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
>
>Tim Rentsch  changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>                 CC|                            |txr at alumni dot caltech.edu
>
>--- Comment #18 from Tim Rentsch  ---
>I would like to add a few comments to the discussion.
>
>One:  C and C++ are different in how they treat unions.  My comments
>here are about C.  I believe they apply to C++ as well, but I am not
>as familiar with the C++ standard as the C standard, so please take
>that into consideration.
>
>Two:  I have recently posted a comment for Bug 14319.  That comment
>explains my reasoning why these two bugs should be separated and
>not be considered duplicates.
>
>Three:  I note the comments made by joseph with regard to the .s1/.s2
>matter.  There may be a larger open question there, but to avoid
>muddying the waters please assume that his change to use .s2 in the
>initializer has been made.
>
>Four:  I understand that there are also larger issues related to how
>union membership may have a bearing on alias analysis.  My comments
>here are confined to the particular case at hand, namely, given a
>definition for union U followed by a definition for function f(),
>could f() be optimized so the p1->m value is cached in a register
>(or something similar) before the body of the if() is executed,
>and the cached value used as the return value.
>
>Five:  The answer to the question is clearly No.  The example code
>is very much on point to the "one special guarantee" clause, and
>so the read access p1->m is permitted.  As the access is permitted,
>and as there are no other conditions present that cause undefined
>or unspecified behavior, the behavior is well-defined, which means
>any optimization that changes the unoptimized behavior is wrong.
>
>Six:  To see the example code is covered under the "one special
>guarantee" clause, note the second part of EXAMPLE 3 in 6.5.2.3.
>In particular, the commentary in parentheses, "(because the union
>type is not visible within function f)", shows that whether the
>union type is defined before or after f() is the determining
>factor here.  Whether a . or -> union membership operation is
>present or not present has no bearing on the definedness of
>the struct member access p1->m.
>
>Seven:  I understand the objections about impacting alias analysis
>and so forth.  I agree that it makes the analysis more difficult
>(although not as sweeping in its implications as some comments
>imply).  Despite the problems, the examples in the Standard, and
>also the response to DR 257, both show that the committee members
>fully intend that this case be covered under the "one special
>guarantee" clause.
>
>Eight:  In the meantime, I strongly recommend gcc be patched to
>support the expected decision (which is the more conservative
>choice) rather than suspending activity until some indefinite
>time in the future.

GCC already implements this if you specify -fno-strict-aliasing.

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2016-11-01 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #19 from joseph at codesourcery dot com  ---
On Tue, 1 Nov 2016, txr at alumni dot caltech.edu wrote:

> Five:  The answer to the question is clearly No.  The example code
> is very much on point to the "one special guarantee" clause, and
> so the read access p1->m is permitted.  As the access is permitted,

I maintain that, as I said in comment#9, the textual history indicates 
that the original intent of saying things are permitted here is *only* an 
exception to the general implementation-defined nature of type punning, 
not to any other reason why things might be undefined (such as aliasing 
rules, data races, etc.).

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2016-11-01 Thread txr at alumni dot caltech.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Tim Rentsch  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |txr at alumni dot caltech.edu

--- Comment #18 from Tim Rentsch  ---
I would like to add a few comments to the discussion.

One:  C and C++ are different in how they treat unions.  My comments
here are about C.  I believe they apply to C++ as well, but I am not
as familiar with the C++ standard as the C standard, so please take
that into consideration.

Two:  I have recently posted a comment for Bug 14319.  That comment
explains my reasoning why these two bugs should be separated and
not be considered duplicates.

Three:  I note the comments made by joseph with regard to the .s1/.s2
matter.  There may be a larger open question there, but to avoid
muddying the waters please assume that his change to use .s2 in the
initializer has been made.

Four:  I understand that there are also larger issues related to how
union membership may have a bearing on alias analysis.  My comments
here are confined to the particular case at hand, namely, given a
definition for union U followed by a definition for function f(),
could f() be optimized so the p1->m value is cached in a register
(or something similar) before the body of the if() is executed,
and the cached value used as the return value.

Five:  The answer to the question is clearly No.  The example code
is very much on point to the "one special guarantee" clause, and
so the read access p1->m is permitted.  As the access is permitted,
and as there are no other conditions present that cause undefined
or unspecified behavior, the behavior is well-defined, which means
any optimization that changes the unoptimized behavior is wrong.

Six:  To see the example code is covered under the "one special
guarantee" clause, note the second part of EXAMPLE 3 in 6.5.2.3.
In particular, the commentary in parentheses, "(because the union
type is not visible within function f)", shows that whether the
union type is defined before or after f() is the determining
factor here.  Whether a . or -> union membership operation is
present or not present has no bearing on the definedness of
the struct member access p1->m.

Seven:  I understand the objections about impacting alias analysis
and so forth.  I agree that it makes the analysis more difficult
(although not as sweeping in its implications as some comments
imply).  Despite the problems, the examples in the Standard, and
also the response to DR 257, both show that the committee members
fully intend that this case be covered under the "one special
guarantee" clause.

Eight:  In the meantime, I strongly recommend gcc be patched to
support the expected decision (which is the more conservative
choice) rather than suspending activity until some indefinite
time in the future.
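
For readers without the start of the PR to hand, the shape of the case in
point Four is roughly as follows.  This is a sketch modelled on EXAMPLE 3 of
6.5.2.3, not a copy of the original test case, and the helper
call_with_distinct_objects is hypothetical.  The helper deliberately exercises
only the uncontroversial situation where the two pointers refer to distinct
objects; the disputed call is described in a comment and not executed.

```c
#include <assert.h>

struct t1 { int m; };
struct t2 { int m; };

/* The union type is declared before f(), so it is visible inside f(). */
union U {
    struct t1 s1;
    struct t2 s2;
};

int f(struct t1 *p1, struct t2 *p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    /* The question in this bug: may the compiler reuse the value of p1->m
       loaded for the test above, or must it reload it here? */
    return p1->m;
}

/* Undisputed case: *p1 and *p2 are separate objects, so the store through
   p2 cannot change p1->m whichever reading of the standard is correct. */
int call_with_distinct_objects(void)
{
    struct t1 a = { -1 };
    struct t2 b = { -5 };
    int n = f(&a, &b);
    assert(n == -1 && b.m == 5);
    return n;
}
```

The disputed case is calling f() with the addresses of the s1 and s2 members
of a single union U object.  Under the reading argued in this comment the
value must be reloaded; under GCC's type-based alias analysis the two accesses
are assumed not to alias.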

[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-09-09 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #14 from joseph at codesourcery dot com  ---
That C++ wording doesn't have any obvious bearing on what "it is 
permitted" is intended to be an exception to - the general 
implementation-defined nature of type punning (which I think was the 
original intent in C90), or the aliasing rules.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-09-09 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #15 from Jonathan Wakely  ---
(In reply to Melissa from comment #12)
> A C++ conversion of the original example is below.  I asked about the word
> "read" on the C++ Standard Discussion (std-discussion) mailing list, because
> it probably should also allow writing if it allows reads.

Up to C++14 the wording said "inspect" which was changed to use "read" by
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1719 so I think
limiting to reads and not writes is intended.

(In reply to jos...@codesourcery.com from comment #14)
> That C++ wording doesn't have any obvious bearing on what "it is 
> permitted" is intended to be an exception to - the general 
> implementation-defined nature of type punning (which I think was the 
> original intent in C90), or the aliasing rules.

C++ doesn't support any type-punning, only reading from the common initial
sequence (where the types must be compatible), so I think it can only be an
exception to the aliasing rules.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-09-09 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #16 from Jonathan Wakely  ---
Message [c++std-core-20893] on the C++ core reflector on 2011-12-14 supports
the GCC view that a C++ compiler can apply strict aliasing rules to p1->m and
p2->m unless the fact they come from the same object is visible to the
compiler.


| Either I'm mistaken, or it is a 'magical property of unions'. If the
| compiler can figure out two objects didn't come from the same union,
| it can assume they do not alias,

Yes.  I believe I filed a core issue regarding this -- coming from
controversy originating from optimizations performed by compilers
today (especially without full program analysis) -- and the resolution
is that if you intend to play type punning games with X and Y you better let
the type checker and optimizer see your cards upfront.  Otherwise, you
are on your own.  I can't remember the issue number right now.  In the
end, I think that was the right decision.


But I can't find the core issue referred to.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-09-09 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #17 from Martin Sebor  ---
The C Union Visibility rule was intended to cover that case.  The trouble is
that the rule tends to be interpreted differently by different people, users
and implementers alike: Is it the union object that must be visible at the
point of the access, or just the union type?  Must the access be performed
using the union object, or just the union type, or neither?  There are
implementations that apparently disable aliasing at the first sight of a union
in a translation unit.  There are others that only disable it for structs used
in a union.  And others still that don't do anything special unless the aliased
object is accessed through the union itself.

There are also aliasing problems beyond unions that affect both languages. 
WG14 N1520 referenced in comment #10 gives a few scary examples.  This still
needs to be resolved.  (I haven't talked to Clark since Lenexa so unless he's
made progress on the issue the new paper I mentioned in comment #10 will have
to wait until 2016.)


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-09-08 Thread myriachan at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Melissa  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |myriachan at gmail dot com

--- Comment #12 from Melissa  ---
This is broken in C++ as well, and in C++, the rules are much more clear that
GCC isn't following them.

Quoting the C++ Standard, revision 4296 (post-C++14?):

16. The "common initial sequence" of two standard-layout struct (Clause 9)
types is the longest sequence of non-static data members and bit-fields in
declaration order, starting with the first such entity in each of the structs,
such that the corresponding entries have layout-compatible types and either
neither entity is a bit-field or both are bit-fields with the same width.

19. In a standard-layout union with an active member (9.5) of struct type T1,
it is permitted to read a non-static data member m of another union member of
struct type T2 provided m is part of the common initial sequence of T1 and T2.


A C++ conversion of the original example is below.  I asked about the word
"read" on the C++ Standard Discussion (std-discussion) mailing list, because it
probably should also allow writing if it allows reads.  As a result, I modified
the below to only *read* in an aliasing way, to fully comply with the written
word of the Standard.


#include <cassert>

struct t1 { int m; };
struct t2 { int m; };

union U {
    t1 s1;
    t2 s2;
};

int f (t1 *p1, t2 *p2)
{
    // union U visible here, p1->m and p2->m may alias
    // p1 is the active member; read from p2 per [class.mem]/19.

    if (p2->m < 0)
        p1->m = -p1->m;

    return p2->m;
}

int main (void)
{
    union U u = { { -1 } };

    int n = f (&u.s1, &u.s2);

    assert (1 == n);

    return 0;
}


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-09-08 Thread myriachan at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #13 from Melissa  ---
As for a reason why this should be allowed, all I need to do is mention
struct sockaddr.
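
As a rough illustration of the pattern being alluded to, here is a
sockaddr-like tagged union in C.  All type and function names below (addr_any,
addr_v4, addr_v6, union addr, addr_port, demo_addr_port, FAMILY_V4, FAMILY_V6)
are hypothetical stand-ins, not the real sockets API.  Real sockaddr code
typically casts between struct pointers rather than going through a union, so
this sketch shows only the variant that reads the shared tag through the union
object.

```c
#include <assert.h>

enum { FAMILY_V4 = 1, FAMILY_V6 = 2 };   /* hypothetical family tags */

/* All three structs start with the same member, so "family" forms a
   common initial sequence. */
struct addr_any { unsigned short family; };
struct addr_v4  { unsigned short family; unsigned short port; unsigned char host[4]; };
struct addr_v6  { unsigned short family; unsigned short port; unsigned char host[16]; };

union addr {
    struct addr_any any;
    struct addr_v4  v4;
    struct addr_v6  v6;
};

/* Inspect the shared tag through the generic member, then use the member
   that the tag says is active. */
unsigned addr_port(const union addr *a)
{
    switch (a->any.family) {
    case FAMILY_V4: return a->v4.port;
    case FAMILY_V6: return a->v6.port;
    default:        return 0;
    }
}

/* Hypothetical driver: build a v4 address and read its port back. */
unsigned demo_addr_port(void)
{
    union addr a = { .v4 = { FAMILY_V4, 8080, { 127, 0, 0, 1 } } };
    unsigned port = addr_port(&a);
    assert(port == 8080);
    return port;
}
```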


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-05-21 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Marek Polacek <mpolacek at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |SUSPENDED
   Last reconfirmed|                            |2015-05-21
                 CC|                            |mpolacek at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #11 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
Suspending until then.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-05-13 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #10 from Martin Sebor <msebor at gcc dot gnu.org> ---
Thanks (again) for your comments, Joseph.  I had a chance to discuss this issue
with Clark Nelson last week.  Clark has worked on improving the aliasing parts
of the C specification in the past, for example in N1520
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1520.htm).  He agreed that
like the issues pointed out in N1520, this is also an outstanding problem that
would be worth WG14's time to revisit and fix.  We also agreed to work together on
a revised paper for the next WG14 meeting in October 2015.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-28 Thread joseph at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #9 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
The rule certainly has nothing to do with whether the struct types are 
defined inside the union definition, or defined outside and then used 
inside via a tag or typedef.

The "it is permitted" wording is poorly defined, and the C99 changes
confused this through failing to realise how it's poorly defined.  In C90,
the paragraph starts "With one exception, if a member of a union object is
accessed after a value has been stored in a different member of the
object, the behavior is implementation-defined.[41]  One special guarantee
is made ... it is permitted".  This reads to me like "it is permitted"
was intended as an exception to the general rule of behavior being
implementation-defined (that is, it was defining one case of type punning,
which was more generally defined non-normatively in the footnote added in
C99 TC3).

The difficulty with "it is permitted" is there are any number of cases
where other wording indicates something is not permitted that it could be
interpreted as overriding - just saying "it is permitted" fails to say
which are or are not overridden.  (As a C11 example, something might not
be permitted because it's a data race, but accessing a common
initial sequence can hardly be intended to override the normal rules about
data races.)  So the authors of N685 read "it is permitted" as relating
not to the previous sentence in the same paragraph about accessing
different union members, but as relating to completely separate rules
about aliasing.  The visible union rule was then inserted for C99, thereby
serving to confuse things further by supporting the suggestion that "it is
permitted" relates to aliasing.

DR#236 then considered a different case of aliasing through pointers to 
union members.  However, the response never decided the question of 
whether the accesses must visibly be through the union, or whether it's 
sufficient for the declaration of the union to be visible.

Basing things on whether a union is visible in the translation unit is 
clearly a bad rule because of action-at-a-distance effects (the visible 
union might be in a header included for some completely unrelated purpose, 
but would still inhibit optimization).

(Note that the exact example given in this bug is invalid as the union has 
active member s1, but is modified via member s2; you can only inspect 
common members of non-active union members, not modify them.  But 
presumably using .s2 in the initializer would still show the issue.)

Thus, "it is permitted" needs reworking to describe what it's an exception
to.  To the extent that it's an exception to aliasing rules, I think that 
should only be where the union is actually used for the accesses in 
question (in which case no exception is actually needed beyond defining 
the layout requirements and making the type punning rules normative), and 
DR#236 should be clarified accordingly.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-27 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |alias
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
As Andrew says, GCC's behavior is intentional and N685 is just completely broken.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-27 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #8 from Martin Sebor <msebor at gcc dot gnu.org> ---
If one of you can explain the problem with it I'm willing to write up a paper
and submit it to WG14 and request to have the standard changed.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-26 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Martin Sebor <msebor at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |---

--- Comment #4 from Martin Sebor <msebor at gcc dot gnu.org> ---
Thanks for the pointer!  I had looked for a related bug report but couldn't
find it.

There's an important difference between the test cases in pr14319 and the one
here that's easy to overlook.  The rule only applies to structs defined in
unions, not those defined at file scope and only used to declare union members,
and to translation units in which the union definition is visible.  I would
recommend closing pr14319 as NOTABUG.  I have reopened this bug.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-26 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #6 from Martin Sebor <msebor at gcc dot gnu.org> ---
I agree it's subtle and could be clearer but I believe the key phrase is "a
union contains several structures."  Here, the term "union" refers to the type,
not the object.  This is supported by the use of the term "union object" in the
second part of the sentence.

This interpretation is in line with Derek Jones' excellent "The New C Standard
-- An Economic and Cultural Commentary": http://www.coding-guidelines.com/cbook

But if there's doubt that this interpretation is intended I'd be happy to raise
an interpretation request on the WG14 mailing list.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-26 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Martin Sebor from comment #4)
> Thanks for the pointer!  I had looked for a related bug report but couldn't
> find it.
>
> There's an important difference between the test cases in pr14319 and the
> one here that's easy to overlook.  The rule only applies to structs defined
> in unions, not those defined at file scope and only used to declare union
> members, and to translation units in which the union definition is visible.


No that is not the rule.  If I read the section:
[#5] One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common initial sequence
(see below), and if the union object currently contains one of these
structures, it is permitted to inspect the common initial part of any of them
anywhere that a declaration of the complete type of the union is visible. Two
structures share a common initial sequence if corresponding members have
compatible types (and, for bit-fields, the same widths) for a sequence of one
or more initial members.
--- CUT 
I don't see anything in the text above that says where the struct must be
defined.
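
For what it is worth, the two declaration styles being contrasted look like
this in C.  This is an illustrative sketch only; the tags a1, a2, b1, b2 and
the helper nested_tags_are_ordinary are made up.  It relies on the fact that
C has no nested struct scope, so a tag declared inside a union definition at
file scope is an ordinary file-scope tag.

```c
#include <assert.h>

/* Style 1: the struct types are defined inside the union definition. */
union U1 {
    struct a1 { int m; } s1;
    struct a2 { int m; } s2;
};

/* Style 2: the struct types are defined at file scope and used by tag. */
struct b1 { int m; };
struct b2 { int m; };

union U2 {
    struct b1 s1;
    struct b2 s2;
};

/* Returns 1: struct a1 can be used outside union U1 just like struct b1. */
int nested_tags_are_ordinary(void)
{
    struct a1 x = { 3 };   /* tag a1 is in scope here in C */
    struct b1 y = { 3 };
    assert(sizeof(union U1) == sizeof(union U2));
    return x.m == y.m;
}
```

Note that this holds for C only; in C++ a struct defined inside a union is
a nested type and would have to be named U1::a1.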


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-25 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is an exact dup of bug 14319.

*** This bug has been marked as a duplicate of bug 14319 ***


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-25 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Also if we follow that defect resolution, basically strict aliasing does not
mean anything any more and we would have to turn off strict aliasing for all
structs.


[Bug c/65892] gcc fails to implement N685 aliasing of union members

2015-04-25 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Some folks think that resolution is not fully correct.