https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95384

--- Comment #4 from Barry Revzin <barry.revzin at gmail dot com> ---
Here's another example of the same kind of issue
(https://godbolt.org/z/KWr9rMssj):

template <class T, class U>
struct tagged_union {
    tagged_union(T t) : index(0), a(t) { }
    tagged_union(U u) : index(1), b(u) { }

    union {
        T a;
        U b;
    };
    char index;
};

struct X { int i; };
struct Y { int j; };

tagged_union<X, Y> as_tagged_union(X x) {
    return x;
}

template <typename T, typename U>
struct tagged_union_wrapped : tagged_union<T, U> {
    using tagged_union<T, U>::tagged_union;
};

auto as_tagged_union2(X x) {
    return tagged_union_wrapped<X, Y>(x);
}

Compiled with -O3, this emits:

as_tagged_union(X):
        mov     eax, edi
        ret
as_tagged_union2(X):
        mov     DWORD PTR [rsp-8], edi
        mov     BYTE PTR [rsp-4], 0
        mov     rax, QWORD PTR [rsp-8]
        ret

If you change the index member from 'char' to 'int', which removes the tail
padding, as_tagged_union2 generates the same code as as_tagged_union.
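
To make the tail-padding point concrete, here is a rough sketch that measures
the padding for both index types. The names tagged_union_with, with_char,
with_int and tail_padding are made up for illustration and are not part of the
test case above; the numbers in the comments assume a typical x86-64 target
where int is 4 bytes with 4-byte alignment.

```cpp
#include <cstddef>

struct X { int i; };
struct Y { int j; };

// Hypothetical variation of tagged_union: same layout, but the type of the
// index member is a template parameter so both cases can be compared.
template <class T, class U, class Index>
struct tagged_union_with {
    tagged_union_with(T t) : a(t), index(0) { }
    tagged_union_with(U u) : b(u), index(1) { }

    union {
        T a;
        U b;
    };
    Index index;
};

using with_char = tagged_union_with<X, Y, char>;
using with_int  = tagged_union_with<X, Y, int>;

// Bytes between the end of the index member and the end of the object.
template <class TU>
constexpr std::size_t tail_padding(std::size_t index_offset) {
    return sizeof(TU) - (index_offset + sizeof(decltype(TU::index)));
}

// Assuming 4-byte int: the union occupies bytes 0..3 and index starts at 4.
// With a char index the object is rounded up to 8 bytes, leaving 3 padding
// bytes; with an int index all 8 bytes are occupied.
constexpr std::size_t pad_char =
    tail_padding<with_char>(offsetof(with_char, index));
constexpr std::size_t pad_int =
    tail_padding<with_int>(offsetof(with_int, index));
```

Under those assumptions both types have the same size, and only the 'char'
version has bytes past the index member that no member owns.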

This is relevant for std::variant performance. std::variant<X, Y> behaves like
tagged_union_wrapped<X, Y>, whereas if you drop down to the implementation
details and directly use _Variant_storage_alias<X, Y>, that behaves like
tagged_union<X, Y> for these purposes.
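
For anyone who wants to check the std::variant case themselves, a minimal
sketch of the analogous function follows. The name as_variant is hypothetical
and this is only meant as something to paste into a compiler next to the
example above, not as a statement of what any particular GCC version emits.

```cpp
#include <variant>

struct X { int i; };
struct Y { int j; };

// Analogue of as_tagged_union2 using the public std::variant interface;
// X is the first alternative, so the stored index is 0.
std::variant<X, Y> as_variant(X x) {
    return x;
}
```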
