https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101061

--- Comment #11 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 16 Jun 2021, alexander.gr...@tu-dresden.de wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101061
> 
> --- Comment #9 from Alexander Grund <alexander.gr...@tu-dresden.de> ---
> > Note that when the union type is visible in the access path then GCC allows 
> > punning without further restrictions. Thus the accesses as written above 
> > are OK.
> 
> Now I have to ask again for clarification because this seemingly contradicts
> your previous statement. So let me try to state the previous question again:
> 
> Given a pointer `slot_type* slot` (where slot_type is the union as defined
> before) is it legal to access `slot->value.first`, 
> `slot->mutable_value.first`,
> `slot->key` interchangeably?
> 
> In all 3 accesses the union is visible in the access path (if I understood 
> that
> correctly) so the type punning should be ok, shouldn't it?

Yes - but it routinely happens that in C++ you end up returning a
reference to the union member as abstraction and an access to that 
reference does _not_ have the union visible but only the member.

> > the circumstances have been so there's incentive to do an invalid 
> > transform... 
> 
> So is this indeed a GCC bug which may be fixed or gone latent?

I don't think so, but we cannot rule this out at this point (but see 
above).

> Otherwise, maybe the construction is the problem? What Abseil basically 
> does is allocating a (suitably aligned) buffer of chars and in-place 
> constructing those slot_type unions/pairs there as needed (i.e. similar 
> to std::vector) After abstractions what basically happens is:
> 
> // alloc buffer done, then:
> slot_type* slot_buffer = reinterpret_cast<slot_type*>(buffer);
> 
> // on emplace:
> size_t x = ...;
> slot_type* slot = slot_buffer + x;
> new(slot) slot_type;
> new(&slot->mutable_value) pair<const K, V>(k, v);
> slot_type* slot2 = slot_buffer + x; // same as before, actually done through
> the iterator
> idx_vec[i] = slot2->value.second;

so if slot_type is the union type then

  new(&slot->mutable_value) pair<const K, V>(k, v)

looks problematic since that calls operator new with a pointer to the
union member and the actual construction receives &slot->mutable_value
as address of type decltype (slot->mutable_value) * which falls foul
of the union punning rule.

I'm not sure how one can solve this issue with using placement new
but are unions not sufficiently restricted so that copy assignment
should work (and activate the appropriate union member)?  Thus

  slot->mutable_value = pair<const K, V>(k, v);

?  The compiler should hopefully be able to elide the copy.

> My suspicion is, that GCC loads the value of slot2 before constructing the
> object, at least sometimes. Printing the values in the vector often shows a 
> run
> of zeroes before some other (potentially correct) values. I'd guess the 
> correct
> values happen, when no insertion took place, i.e. the construction was done
> already in a prior loop iteration.

Yes, GCC would simply "skip" the offending placement new and if it finds
a suitable definition before it would replace the load with said 
definition.

To expand on the placement new issue if you for example have

struct Y { Y(int); };
union X { int i; float f; };

void foo (void *p)
{
  X *q = reinterpret_cast <X *> (p);
  new (&q->i) Y (1);
}

we end up with (early unoptimized IL):

void __GIMPLE (void * p)
{
  void * D_6558;
  void * D_6557;
  union X * q;

  q = p;
  D_6558 = &q->i;
  D_6557 = operator new (1ul, D_6558);
  try
    {
      Y::Y (D_6571, 1);
    }
  catch
    {
      operator delete (D_6557, D_6558);
    }
}

where 'operator new' is the usual noop that just returns the passed
pointer here.  But what you can see is that the resulting pointer
is used for the initialization and not the placement address as
literally written in source.  And even then, the CTOR that is
involved of course sees only 'this' which is a plain pointer to
its class type and all accesses in the CTOR will not have the
union visible.

That might be unexpected but I think it's a quite natural
consequence of C++ abstraction :/

Reply via email to