----- Original Message -----

> From: "Daniel Berlin" <[email protected]>
> To: "Jeroen Dobbelaere" <[email protected]>
> Cc: "Hal Finkel" <[email protected]>,
> [email protected],
> [email protected], "[email protected] Developers"
> <[email protected]>
> Sent: Wednesday, March 18, 2015 10:27:28 AM
> Subject: Re: [PATCH] Fix for bug 21725: wrong results with union and
> strict-aliasing

> On Wed, Mar 18, 2015 at 2:22 AM, Jeroen Dobbelaere <
> [email protected] > wrote:

> > On Tue, Mar 17, 2015 at 11:55 PM, Daniel Berlin <
> > [email protected]
> > > wrote:
> 

> > > [..]
> > 
> 

> > > In theory, the last time i remember, you weren't allow to set one
> > > member of a union and read another.
> > 
> 
> > > But uh, that's not real user code :)
> > 
> 

> > > (and IIRC, it does not say anything real)
> > 
> 

> > This is indeed correct. The main issue is that llvm thinks it is
> > allowed to reorder 'stores' (as it thinks they are not aliasing),
> 
> > so that a legal read afterwards will result in a wrong answer (if
> > an
> > optimization or the backend reorders the stores based on the wrong
> > aliasing information).
> 

> I thought both LLVM and GCC guaranteed type punning through unions
> would work?
Yes, LLVM does this (at least for "local" type punning, for some heuristic 
definition of local), because we always allow BasicAA to override TBAA when 
BasicAA is *sure* there is aliasing. 

> > > > If you access a member (or nested member) of a union, starting
> > > > from
> > > > the union itself, then it depends if the other type is also
> > > > accessible through the union.
> > > 
> > 
> 

> > > > So:
> > > 
> > 
> 

> > > > int foo(union foo* a, float* b, int* c) {
> > > 
> > 
> 

> > > > a->a=1;
> > > 
> > 
> 

> > > > *b=2;
> > > 
> > 
> 

> > > > // compiler must assume a->a and *b can alias
> > > 
> > 
> 

> > > > // compiler must not assume *b and *c alias (access not through
> > > > union)
> > > 
> > 
> 

> > > > }
> > > 
> > 
> 

> > > > (Also see section 3.10 of the c++03 standard;
> > > 
> > 
> 
> > > This, IMHO, does not say what you seem to think it does :)
> > 
> 

> > > For C++03, 3.10 only includes the word "union" here: "If a
> > > program
> > > attempts to access the stored value of an object through an
> > > lvalue
> > > of other than one of the following types the behavior is
> > > undefined:
> > 
> 

> > > — the dynamic type of the object,
> > 
> 
> > > — a cv-qualified version of the dynamic type of the object,
> > 
> 
> > > — a type that is the signed or unsigned type corresponding to the
> > > dynamic type of the object,
> > 
> 
> > > — a type that is the signed or unsigned type corresponding to a
> > > cv-qualified version of the dynamic type of the object,
> > 
> 
> > > — an aggregate or union type that includes one of the
> > > aforementioned
> > > types among its members (including, recursively, a member of a
> > > subaggregate or contained union),
> > 
> 
> > > — a type that is a (possibly cv-qualified) base class type of the
> > > dynamic type of the object,
> > 
> 
> > > — a char or unsigned char type."
> > 
> 

> > > C++ standard experts, at least on the GCC side, did not view this
> > > as
> > > saying "all accesses must have an explicit union access", but
> > > that
> > > "It must be part of a union type", but about whether you try to
> > > access it through a union that doesn't have the right actual
> > > types
> > > in it.
> > 
> 

> > > The type of those objects is right the type of the object. There
> > > is,
> > > IMHO, nothing illegal about those accesses.
> > 
> 

> > So, with that interpretation, the mere presence of a 'union { short
> > s; int i; }' (in the same or a different compilation unit) should
> > influence the (type based) aliasing of all short and int pointers ?
> 

> It's actually worse than this interpretation, since someone pointed
> out on the gcc side that "ihateunions" could be in a different TU,
> and you'd never see the union at all.

> While I think this is the right interpretation (as do others), that's
> not the point.
> I'm actually on your side here :)

> But remember that Hal said that if hte language has semantics, we
> should follow them.
I did. It is also true that standards have defects ;) If a strict reading of 
the language semantics implies a 'spooky action at a distance' effect, then 
we'll need to draw a line in the sand somewhere to avoid that. I also think 
your interpretation that implies that is incorrect... 

> The whole point of this is essentially to point out that the language
> semantics weren't completely thought out here :)
> Because of that, our options are either "make up a rule people can
> follow" or "disable TBAA".

> GCC, and every other compiler i'm aware of, has chosen the former, at
> least for C++03 (I don't know what C++11 or 14 say here).
Yes; fair enough. What I want is that if we make up a rule, then we make that 
rule well defined and documented (not just whatever the implementation happens 
to do today); and we should also write a paper for the C++ Standards Committee 
proposing that as an official solution (I'll be happy to present at the next 
meeting, and we have approximately 1 month before the next paper deadline, 
IIRC). 

In the current draft, I relevant wording looks equivalent: 

— an aggregate or union type that includes one of the aforementioned types 
among its elements 
or non-static data members (including, recursively, an element or non-static 
data member of 
a subaggregate or contained union), 

However, another point to consider is that, " C++ standard experts, at least on 
the GCC side, did not view this as saying "all accesses must have an explicit 
union access", but that..." may not be a reasonable reading of the standard. 
The text says, "If a program attempts to access the stored value of an object 
through a glvalue of other than one 
of the following types the behavior is undefined: ...", and so I read that as 
saying that the access must be through a glvalue of an aggregate or union type 
(with a relevant type as one of its members). So the union or aggregate type 
*must* be explicitly present in the access. I don't believe any other 
interpretations, especially those which imply non-local effects, are 
reasonable. 

Exactly what rule did GCC implement? 

Thanks again, 
Hal 

> > This doesn't look a practical solution to me :(
> 

> > That's why I settled on the need of having the full access path
> > available as a practical solution (and one that can easily be
> > explained)..
> 

> > (btw. Do you have a pointer to the discussion on the gcc side ?)
> 

> It's about 7-8 years old at this point. Let me see if i can find it
> > What we currently have is definitely wrong. Question is how we want
> > to fix it...
> 

> So, we have multiple wrongs here.
> The first is that the short doesn't appear in the union info. Given
> that it's a member of the union, it definitely should appear in the
> info for that union.

> I'd start there, and see how far that gets us :)

> > Greetings,
> 

> > Jeroen Dobbelaere
> 

-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to