https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95384
--- Comment #4 from Barry Revzin <barry.revzin at gmail dot com> ---
Here's another example of the same kind of issue
(https://godbolt.org/z/KWr9rMssj):

template <class T, class U>
struct tagged_union {
    tagged_union(T t) : index(0), a(t) { }
    tagged_union(U u) : index(1), b(u) { }

    union {
        T a;
        U b;
    };
    char index;
};

struct X { int i; };
struct Y { int j; };

tagged_union<X, Y> as_tagged_union(X x) { return x; }

template <typename T, typename U>
struct tagged_union_wrapped : tagged_union<T, U> {
    using tagged_union<T, U>::tagged_union;
};

auto as_tagged_union2(X x) { return tagged_union_wrapped<X, Y>(x); }

this on -O3 emits:

as_tagged_union(X):
        mov     eax, edi
        ret
as_tagged_union2(X):
        mov     DWORD PTR [rsp-8], edi
        mov     BYTE PTR [rsp-4], 0
        mov     rax, QWORD PTR [rsp-8]
        ret

If you change the index member from 'char' to 'int', causing the tail padding
to disappear, as_tagged_union2 improves to the same code gen as
as_tagged_union.

This is relevant for std::variant performance. std::variant<X, Y> behaves like
tagged_union_wrapped<X, Y>, whereas if you drop down to the implementation
details and directly use _Variant_storage_alias<X, Y>, that behaves like
tagged_union<X, Y> for these purposes.
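
To make the tail padding mentioned above concrete, here is a minimal sketch that measures it for both index types. It only looks at object layout and does not reproduce the code gen difference itself. The names tagged_union_with, with_char, with_int, char_tail_padding and int_tail_padding are hypothetical and not part of the original example, and the index type has been made a template parameter purely for comparison.

```cpp
#include <cstddef>

// Hypothetical variation of tagged_union from the comment: the index type
// is a template parameter so the 'char' and 'int' layouts can be compared.
template <class T, class U, class Index>
struct tagged_union_with {
    tagged_union_with(T t) : a(t), index(0) { }
    tagged_union_with(U u) : b(u), index(1) { }

    union {
        T a;
        U b;
    };
    Index index;
};

struct X { int i; };
struct Y { int j; };

// Aliases keep the comma in the template argument list out of the
// offsetof macro invocation.
using with_char = tagged_union_with<X, Y, char>;
using with_int  = tagged_union_with<X, Y, int>;

// Tail padding: bytes between the end of 'index' and the end of the object.
constexpr std::size_t char_tail_padding =
    sizeof(with_char) - (offsetof(with_char, index) + sizeof(char));
constexpr std::size_t int_tail_padding =
    sizeof(with_int) - (offsetof(with_int, index) + sizeof(int));

// With a 'char' index the object is padded up to the alignment of int;
// with an 'int' index there is nothing left to pad.
static_assert(char_tail_padding == alignof(int) - 1,
              "char index leaves tail padding");
static_assert(int_tail_padding == 0,
              "int index leaves no tail padding");
```

On x86-64, where int is 4 bytes with 4-byte alignment, this works out to 3 bytes of tail padding for the 'char' index and 0 for the 'int' index. Both objects are 8 bytes in size.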