[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #44 from Andrew Downing ---
(In reply to Richard Biener from comment #43)
> (In reply to Andrew Downing from comment #41)
> > > Thus for types without a non-trivial ctor/dtor you do not need to use
> > > placement new. So take your example and remove the placement new.
> > > Does that change its semantics?
> >
> > These are C++17 rules.
> >
> > 4.5/1) An object is created by a definition, by a new-expression, when
> > implicitly changing the active member of a union, or when a temporary object
> > is created.
> >
> > 6.8/1) The lifetime of an object of type T begins when: storage with the
> > proper alignment and size for type T is obtained, and if the object has
> > non-vacuous initialization, its initialization is complete.
> >
> > double d;
> >
> > My interpretation of the above rules would be that only a double object is
> > created in the storage for d because T in 6.8/1 is set to double by the
> > definition of d. According to these rules the only way to change the dynamic
> > type of the object in d's storage would be with placement new (pre C++20).
> > memcpy only overwrites the object representation. It doesn't affect its
> > type or lifetime.
>
> What would
>
> *(long *)&d = 1;
>
> do? My reading of earlier standards says it starts lifetime of a new object
> of type long (the storage of 'd' gets reused). Following that stmt a read
> like
>
> foo (d);
>
> invokes undefined behavior (it accesses the storage of effective type long
> via an effective type of double). The same example with placement new
> would be
>
> *(new (&d) long) = 1;
>
> and I'm arguing the placement new is not required to start the lifetime
> of an object of type long in the storage of 'd'.

It's been a while since I've thought about this stuff.

double d;
*(long *)&d = 1;

That would lead to undefined behavior because it breaks the strict aliasing rules.

*(new (&d) long) = 1;

That would be OK because new creates a long object in the storage of d before dereferencing.
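For reference, the placement-new variant from this comment can be written out as a complete function. This is only a minimal sketch: the function name `store_via_placement_new` is hypothetical (it does not appear in the thread), and the two `static_assert`s encode the assumption that a `long` fits within the size and alignment of a `double` on the target.

```cpp
#include <new>

// Assumption: the storage of a double is large enough and suitably
// aligned for a long on this target.
static_assert(sizeof(long) <= sizeof(double), "long must fit in double's storage");
static_assert(alignof(long) <= alignof(double), "double's storage must be aligned for long");

// Hypothetical helper, named for illustration only.
long store_via_placement_new() {
    double d = 3.14159;
    // The new-expression creates a long object in d's storage,
    // which ends the lifetime of the double object.
    long* p = new (&d) long;
    *p = 1;       // store through a pointer to the newly created long
    return *p;    // read it back through the same pointer
}
```

The function only ever accesses the storage through the pointer returned by the new-expression, so it does not depend on how a compiler treats the cast-and-store form that the two comments disagree about.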
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #43 from Richard Biener ---
(In reply to Andrew Downing from comment #41)
> > Thus for types without a non-trivial ctor/dtor you do not need to use
> > placement new. So take your example and remove the placement new.
> > Does that change its semantics?
>
> These are C++17 rules.
>
> 4.5/1) An object is created by a definition, by a new-expression, when
> implicitly changing the active member of a union, or when a temporary object
> is created.
>
> 6.8/1) The lifetime of an object of type T begins when: storage with the
> proper alignment and size for type T is obtained, and if the object has
> non-vacuous initialization, its initialization is complete.
>
> double d;
>
> My interpretation of the above rules would be that only a double object is
> created in the storage for d because T in 6.8/1 is set to double by the
> definition of d. According to these rules the only way to change the dynamic
> type of the object in d's storage would be with placement new (pre C++20).
> memcpy only overwrites the object representation. It doesn't affect its
> type or lifetime.

What would

*(long *)&d = 1;

do? My reading of earlier standards says it starts lifetime of a new object of type long (the storage of 'd' gets reused). Following that stmt a read like

foo (d);

invokes undefined behavior (it accesses the storage of effective type long via an effective type of double). The same example with placement new would be

*(new (&d) long) = 1;

and I'm arguing the placement new is not required to start the lifetime of an object of type long in the storage of 'd'.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #42 from Richard Biener --- See PR101641 for an interesting case where eliding a round-trip causes wrong-code generation. It's union related so might not apply 1:1 to C++.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #41 from Andrew Downing ---
> Thus for types without a non-trivial ctor/dtor you do not need to use
> placement new. So take your example and remove the placement new.
> Does that change its semantics?

These are C++17 rules.

4.5/1) An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created.

6.8/1) The lifetime of an object of type T begins when: storage with the proper alignment and size for type T is obtained, and if the object has non-vacuous initialization, its initialization is complete.

double d;

My interpretation of the above rules would be that only a double object is created in the storage for d because T in 6.8/1 is set to double by the definition of d. According to these rules the only way to change the dynamic type of the object in d's storage would be with placement new (pre C++20). memcpy only overwrites the object representation. It doesn't affect its type or lifetime.

If you remove the placement new from my example, the program has undefined behavior because it later accesses the double object with a uint64_t pointer. With placement new in place, it accesses a uint64_t object in d's storage with a uint64_t pointer.

In C++20 the placement new wouldn't be required because in addition to the things above that create objects, you also have operations that implicitly create objects, of which memcpy is one. The rules are different from C's though. The only objects that are or are not created are ones that would give the program defined behavior. So in f1 where the uint64_t* pointing to d is accessed, the compiler would have to make the second memcpy create a uint64_t object in d's storage before copying the bytes to give the subsequent access through a uint64_t* defined behavior. If you used placement new anyway, then the second memcpy wouldn't have to create an object because the program would already have defined behavior.
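The statement that memcpy only overwrites the object representation suggests a form of the bit inspection that does not change the type of any existing object: copy the bytes into a separate object of the target type. The following is a minimal sketch of that alternative, not the example under discussion in this bug. The name `bits_of` is hypothetical, and the code assumes `double` is a 64-bit IEEE 754 type, which the `static_assert`s check.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumptions checked at compile time: double is 64 bits wide and IEEE 754.
static_assert(sizeof(double) == sizeof(std::uint64_t), "size mismatch");
static_assert(std::numeric_limits<double>::is_iec559, "double is not IEEE 754");

// Hypothetical helper: copies the object representation of d into a
// distinct uint64_t object, so d itself stays a double throughout.
std::uint64_t bits_of(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}
```

Because the destination is an object whose declared type is already `std::uint64_t`, none of the questions about changing the dynamic type of `d` come up. The trade-off is that the result is a copy, not a pointer into the original storage.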
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #40 from rguenther at suse dot de ---
On Mon, 15 Jun 2020, richard-gccbugzilla at metafoo dot co.uk wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349
>
> --- Comment #37 from Richard Smith ---
> (In reply to Richard Biener from comment #36)
> > The main issue I see is that this differing expectations of C and C++ are
> > impossible to get correct at the same time.
>
> That is a rather bold claim. I think you can satisfy both rule sets by using
> the C++ rule even in C. It is conservatively correct to discard the effective /
> dynamic type when you see a memcpy, and the C++ semantics require you to do
> so.
>
> The C semantics also appear to require the same thing, if you cannot track the
> destination back to either an object with a declared type or to a heap
> allocation; as described in comment#35, GCC gets this wrong and presumably
> miscompiles C code in some cases as a result.

I very much would like to see such an example! The current GCC rule is quite simple - every store to memory alters the dynamic type of the stored-to object to that of the store. Up to now we've had more success with that model than any other we tried before.

> It seems to me that all you're allowed to do within the bounds of conformance
> is:
>
> #1 if you can track the destination back to an object with declared type in C
> code, then use its type as the effective type of the result
>
> #2 if you can track the destination back to a heap allocation in C code, then
> copy the effective type from source to destination
>
> #3 otherwise (in either C or C++) erase the effective type of the destination
>
> (#1 and #3 will presumably result in memcpy being replaced by some operation
> that updates the effective type, rather than being eliminated entirely.)

Indeed. Such an "operation that updates the effective type" would have come in handy in a few cases already, but we do not have it right now, and given past experience with variants of it (bad one, obviously) I'm not too keen on re-introducing it.

That GCC elides the memcpy roundtrip is (in the above model where every store alters the dynamic type) a bug - the memcpy internally actually discards the dynamic type info. The option of having to preserve that roundtrip isn't very appealing though. We went that way for cases where we now cannot remove a "redundant" store (a store with the same bit pattern but different effective type).

Bottom line is I wouldn't hold my breath getting this fixed on the GCC side. Like with the partial object re-use and aggregate assignment case, the C++ FE will have the option of disabling type-based alias-analysis completely for some objects (but I can't see how that helps with the case referenced here).
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #39 from rguenther at suse dot de ---
On Tue, 16 Jun 2020, andrew2085 at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349
>
> --- Comment #38 from Andrew Downing ---
> > int *p;
> > int x;
> > if ()
> >   p = &x;
> > else
> >   p = malloc (4);
> > memcpy (p, q, 4);
> >
> > there is a single memcpy call and the standard says that both the dynamic
> > type transfers (from q) and that it does not (to x).
>
> I would say just that, that it both does and doesn't transfer the effective
> type. Meaning that you need to be conservative during optimization and
> consider p to alias both int and whatever type q is.
>
> > Note the C++ standard makes the placement new optional. Do you say that
> > your example is incorrect with the placement new elided?
>
> I'm not sure what you mean about the first part about it being optional. It

Somewhere the C++ standard says (or said in some "old" version) that the lifetime of an object ends when "... or the storage is re-used". Likewise the lifetime of an object starts "when storage with the proper alignment and size ... is obtained".

Back when I designed the way GCC currently works to satisfy placement new and friends, I concluded the safe thing to do is to treat every memory write as changing the dynamic type of the memory location (because we have to assume it is re-use of storage).

Thus for types without a non-trivial ctor/dtor you do not need to use placement new. So take your example and remove the placement new. Does that change its semantics?
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #38 from Andrew Downing ---
> int *p;
> int x;
> if ()
>   p = &x;
> else
>   p = malloc (4);
> memcpy (p, q, 4);
>
> there is a single memcpy call and the standard says that both the dynamic
> type transfers (from q) and that it does not (to x).

I would say just that, that it both does and doesn't transfer the effective type. Meaning that you need to be conservative during optimization and consider p to alias both int and whatever type q is.

> Note the C++ standard makes the placement new optional. Do you say that
> your example is incorrect with the placement new elided?

I'm not sure what you mean about the first part about it being optional. It depends what you mean by elided. I wouldn't expect any code to be generated for it either way, but I would expect the compiler to now consider the object at that address as having a different type regardless.

If we pretend for a second that GCC is using pre C++20 rules for memcpy and not messing with the effective/dynamic type for the destination, then isn't this example still not going to work if GCC is treating placement new in this case as doing nothing? In f1 after s1 is called, d is still a double, and u is a pointer to uint64_t, so pointing u at d and accessing *u is still going to be UB, right? I would expect d = 3.14159 to still be optimized out, because why would a store to a double affect a load from a uint64_t?

I can't see how this could work in every situation unless the compiler keeps track of the type of d changing from double -> uint64_t, so it knows that stores to it when it was a double could affect loads from a uint64_t after its type changed to uint64_t. In a more complex scenario where placement new was used conditionally with many different types, the compiler would have to consider pointers to any of those types as aliasing d afterwards. I can think of some situations where the compiler would have a very hard time proving that the address of some object didn't make its way to a placement new somewhere else in the program.

This does seem very difficult to do correctly without being very conservative and disabling a lot of optimizations, or having pretty advanced static analysis.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #37 from Richard Smith ---
(In reply to Richard Biener from comment #36)
> The main issue I see is that this differing expectations of C and C++ are
> impossible to get correct at the same time.

That is a rather bold claim. I think you can satisfy both rule sets by using the C++ rule even in C. It is conservatively correct to discard the effective / dynamic type when you see a memcpy, and the C++ semantics require you to do so.

The C semantics also appear to require the same thing, if you cannot track the destination back to either an object with a declared type or to a heap allocation; as described in comment#35, GCC gets this wrong and presumably miscompiles C code in some cases as a result.

It seems to me that all you're allowed to do within the bounds of conformance is:

#1 if you can track the destination back to an object with declared type in C code, then use its type as the effective type of the result

#2 if you can track the destination back to a heap allocation in C code, then copy the effective type from source to destination

#3 otherwise (in either C or C++) erase the effective type of the destination

(#1 and #3 will presumably result in memcpy being replaced by some operation that updates the effective type, rather than being eliminated entirely.)
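The three cases in this comment amount to a small decision table, which can be sketched as a function for readers who find that easier to follow. This is purely an illustration of the rules as stated in the comment: the enumerations and the function `effective_type_after_memcpy` are hypothetical names, and nothing here reflects how GCC or any other compiler represents this information.

```cpp
// Hypothetical model of the three cases listed in the comment above.
enum class Language { C, Cxx };

// What the compiler managed to prove about the memcpy destination.
enum class Destination { DeclaredObject, HeapAllocation, Unknown };

// What happens to the effective type of the destination.
enum class Effect { UseDeclaredType, CopyFromSource, Erase };

constexpr Effect effective_type_after_memcpy(Language lang, Destination dest) {
    // #1: destination tracked back to an object with declared type, in C code
    if (lang == Language::C && dest == Destination::DeclaredObject)
        return Effect::UseDeclaredType;
    // #2: destination tracked back to a heap allocation, in C code
    if (lang == Language::C && dest == Destination::HeapAllocation)
        return Effect::CopyFromSource;
    // #3: otherwise, in either language, erase the effective type
    return Effect::Erase;
}
```

Written this way, the point of the comment is visible in the fall-through: every C++ input and every untracked destination lands in the "erase" case, which is why using that case everywhere is described as conservatively correct.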
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #36 from Richard Biener ---
(In reply to Andrew Downing from comment #35)
> I agree that the new implicit object creation rules sound very difficult to
> implement correctly especially because the behavior in C is different. I'm
> curious to see how that will all play out.
>
> In this situation though, if we use the C rules for what memcpy does C17
> 6.5/6
> https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf#section.6.5,
> the effective type shouldn't be changed. The declared type of both objects
> is known to the compiler. In the first memcpy the declared type of the
> object is unsigned char[8], in the second memcpy the declared type of the
> object is double.

The issue I have with this special handling of objects with a declared type is that the standard assumes universal knowledge and disregards that compilers are imperfect. It's also not clear what it means for

int *p;
int x;
if ()
  p = &x;
else
  p = malloc (4);
memcpy (p, q, 4);

there is a single memcpy call and the standard says that both the dynamic type transfers (from q) and that it does not (to x).

> Placement new changes the effective type to std::uint64_t, but that doesn't
> change the behavior of memcpy. Footnote 88 says "Allocated objects have no
> declared type.". I believe calling a function defined in another TU that
> returns a pointer also has to be considered to return a pointer to an object
> with no declared type, because the object's declaration isn't visible. In
> this situation though, the declared types are visible, and so a modifying
> access, or memcpy, or memmove shouldn't change the effective type.

Note the C++ standard makes the placement new optional. Do you say that your example is incorrect with the placement new elided?

Note you say the declared types are visible - but at least 'd' is in another function (compilers are imperfect) and is accessed via a pointer here. IIRC C++ does not have this "special-casing" of objects with declared types (it doesn't have this memcpy wording at all I think).

[there are more "interesting" bits of all this dynamic type frobbing in PR79671 when partial objects are re-used]

> If gcc is changing the effective type with every memcpy no matter what, that
> would be the wrong thing to do right? Especially since you're saying that
> it's the reason that this example isn't being compiled correctly.

As you say above, the C rule that memcpy does not change the dynamic type of an object with a declared type can only be fulfilled by being conservative. This is why GCC interprets memcpy as a TBAA barrier, but it uses the transfer of dynamic type as a means to be able to elide a memcpy roundtrip as seen in your testcase. If the memcpy roundtrip has to be considered a side-effect then I see no easy way to preserve that without preserving the actual memcpy operation [without changing the intermediate representation of GCC, that is].

I'd be interested in a C testcase that we get wrong, even when relying on that special casing of objects with a declared type. It should then be possible to construct a C++ variant that expects exactly the opposite behavior.

The main issue I see is that these differing expectations of C and C++ are impossible to get correct at the same time.
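The snippet with the conditionally assigned pointer can be expanded into a complete function to make the situation concrete. This is a minimal sketch under stated assumptions: the function name `copy_int_into_either` and the `use_declared` flag are hypothetical, the flag stands in for the unspecified condition in the snippet, and the source is restricted to an `int` so that the final read is unproblematic whichever rule applies to the destination.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical expansion of the snippet above: a single memcpy whose
// destination is either an object with declared type (x) or allocated
// storage with no declared type, depending on a runtime condition.
int copy_int_into_either(bool use_declared, const int* q) {
    int x = 0;
    void* heap = std::malloc(sizeof(int));
    // Fall back to the declared object if the allocation failed.
    int* p = (use_declared || heap == nullptr) ? &x
                                               : static_cast<int*>(heap);
    std::memcpy(p, q, sizeof(int));  // one call site, two kinds of destination
    int result = *p;
    std::free(heap);                 // free(nullptr) is a no-op
    return result;
}
```

With an `int` source both paths agree, so the function is unremarkable. The difficulty described in the comment arises when the source has some other type: the same call site would then have to be treated differently depending on which branch was taken, something a compiler generally cannot know statically.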
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #35 from Andrew Downing --- I agree that the new implicit object creation rules sound very difficult to implement correctly especially because the behavior in C is different. I'm curious to see how that will all play out. In this situation though, if we use the C rules for what memcpy does C17 6.5/6 https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf#section.6.5, the effective type shouldn't be changed. The declared type of both objects is known to the compiler. In the first memcpy the declared type of the object is unsigned char[8], in the second memcpy the declared type of the object is double. Placement new changes the effective type to std::uint64_t, but that doesn't change the behavior of memcpy. Footnote 88 says "Allocated objects have no declared type.". I believe calling a function defined in another TU that returns a pointer also has to be considered to return a pointer to an object with no declared type, because the object's declaration isn't visible. In this situation though, the declared types are visible, and so a modifying access, or memcpy, or memmove shouldn't change the effective type. If gcc is changing the effective type with every memcpy no matter what, that would be the wrong thing to do right? Especially since you're saying that it's the reason that this example isn't being compiled correctly.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #34 from Richard Biener ---
(In reply to Andrew Downing from comment #33)
> Those are all perfectly good arguments, but the problem ended up not having
> anything to do with std::launder or new implicit object creation rules or
> anything else introduced in the most recent standards right? This should be
> well defined in c++11 and on https://godbolt.org/z/w5FoZN. It compiles
> correctly until gcc 5.1, and in all versions of the other major compilers
> I've tried that will actually compile on godbolt.org. As far as I can tell,
> it should also be well defined in c++98 and on if you use different types
> and check the size and alignment some other way.

GCC assumes that memcpy transfers the dynamic type as it does in C, which makes the testcase invalid without the new implicit object creation rules (which then must rely on that memcpy behavior). Citing your new example here:

    #include <cstdint>
    #include <cstdlib>
    #include <cstring>
    #include <new>

    static_assert(sizeof(double) == sizeof(std::uint64_t), "");
    static_assert(alignof(double) == alignof(std::uint64_t), "");

    std::uint64_t *pun(void *p) {
        char storage[sizeof(double)];
        std::memcpy(storage, p, sizeof(storage));
        std::uint64_t *u = new(p) std::uint64_t;
        std::memcpy(u, storage, sizeof(storage));
        return u;
    }

    std::uint64_t f1(std::uint64_t *maybe) {
        double d = 3.14159;
        std::uint64_t *u = pun(&d);
        if(rand() == 0) {
            u = maybe;
        }
        return *u;
    }

In particular, C++14 3.9/4 refers to footnote 44 which says "The intent is that the memory model of C++ is compatible with that of ISO/IEC 9899 Programming Language C", which is a laudable goal. But I see that information about the memory model and the impact of language features on it is scattered across many places in the C++ standard, and some wordings sound contradictory. And I always fail to remember where all the relevant points were. But I did thorough research of the C and C++ standards when implementing what GCC does in the GCC 5 timeframe (where it was C++11 and C++14 draft state IIRC).

IMHO you cannot elide memcpy if it implicitly creates an object of a type that is only determined later by (the first?) access. It would also be an extremely bad choice of semantics. Making it so that memcpy does not alter the type of the destination object is bad as well if you consider re-use of allocated storage where the last access to the old object is visible, which determines the dynamic type. So what C specifies is the only viable semantics for memcpy. C has the additional restriction of declared objects, which does not apply to C++, and thus GCC does not use that for C either.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #33 from Andrew Downing --- Those are all perfectly good arguments, but the problem ended up not having anything to do with std::launder or new implicit object creation rules or anything else introduced in the most recent standards right? This should be well defined in c++11 and on https://godbolt.org/z/w5FoZN. It compiles correctly until gcc 5.1, and in all versions of the other major compilers I've tried that will actually compile on godbolt.org. As far as I can tell, it should also be well defined in c++98 and on if you use different types and check the size and alignment some other way.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #32 from rguenther at suse dot de ---
On Thu, 4 Jun 2020, andrew2085 at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349
>
> --- Comment #31 from Andrew Downing ---
> What would you say is the solution here? There's a disconnect between what the
> c++ standard says should work, and what actually works.

I think C++ standards people must come to realize that designing how TBAA works in a compiler isn't something that can be turned around every now and then, so changing requirements in the standard every now and then does not work.

I think people will have to live with the reality of existing implementations. Massive changes like this cannot be brought to older releases, nor can I give any estimate on what future release of GCC might "support" this "feature" of the standard. There's the workaround of disabling type-based alias analysis via -fno-strict-aliasing of course.

Mind it took GCC about 4 to 5 major releases to get placement new to work correctly. Well, to the reading of the C++11 standard. So expect us to be ready with C++20 in about 5 years.

The message to the standards people should also be that C++ does not live in isolation, and modern technology like link-time optimization has to cope with input from multiple source languages, which means that the compiler's intermediate language has to cope with all of them and do optimizations expected by people using different languages.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #31 from Andrew Downing --- What would you say is the solution here? There's a disconnect between what the c++ standard says should work, and what actually works.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #30 from Richard Biener --- (In reply to Andrew Downing from comment #29) > So I think this sort of equivalent example in C shows what's going wrong in > the C++ example. https://godbolt.org/z/ZMz4Cp > > gcc knows that if the object mem points to is modified inside pun() its > effective type will change to the type of the value that is assigned because > the object mem points to has no declared type. If the argument to pun has a > declared type, the code doesn't work, like in the c++ example. As said earlier the issue is that pun() is completely elided and GCC doesn't see anything else than a simple pointer cast of its argument at the caller side (after inlining). > So for this c++ example https://godbolt.org/z/NeAJ5d could a solution be for > gcc to treat placement new as if it were a modifying access and as if it's > parameter had no declared type. So it would change the effective type of d > in f1 to uint64_t, or at least insert IL instructions to simulate that? The main issue with placement new is that it is not necessary to use placement new! In C++, for POD (or some bigger set of) types you can simply start using storage in a new type, no need for a placement new. This is why GCC treats _every_ _store_ as possibly altering the dynamic type of the stored to object. So everything is fine - until all stores [possibly altering the dynamic type] are optimized away. So with your argument we'd have to insert extra magic instructions at _every_ store and we'd have to keep those (while we could elide the actual stores). While in the high-level IL this might be feasible things get tricky in RTL land where we'd have the choice to either not do TBAA anymore or also represent these "fake" memory state affecting instructions. In the end I'd rather not venture there but indeed that removing of stores has proven an issue in the past (PR93946 for example).
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #29 from Andrew Downing ---
So I think this sort of equivalent example in C shows what's going wrong in the C++ example. https://godbolt.org/z/ZMz4Cp

gcc knows that if the object mem points to is modified inside pun(), its effective type will change to the type of the value that is assigned, because the object mem points to has no declared type. If the argument to pun has a declared type, the code doesn't work, like in the c++ example.

So for this c++ example https://godbolt.org/z/NeAJ5d could a solution be for gcc to treat placement new as if it were a modifying access and as if its parameter had no declared type? So it would change the effective type of d in f1 to uint64_t, or at least insert IL instructions to simulate that?
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #28 from Andrew Downing --- Hey that's cheating, but yea the second part did it.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #27 from rguenther at suse dot de ---
On June 2, 2020 6:34:12 PM GMT+02:00, andrew2085 at gmail dot com wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349
>
>--- Comment #25 from Andrew Downing ---
>Do you know how to change that example so that gcc's knowledge is incomplete
>and it no longer does the correct thing?

Add std::launder ;) or, for example, conditionally assign another pointer incoming to the function before dereferencing it (untested).
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #26 from Andrew Downing --- I mean without modifying the definition of start_lifetime_as
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #25 from Andrew Downing ---
Do you know how to change that example so that gcc's knowledge is incomplete and it no longer does the correct thing?
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #24 from rguenther at suse dot de ---
On Tue, 2 Jun 2020, andrew2085 at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349
>
> --- Comment #23 from Andrew Downing ---
> But gcc already can implement std::start_lifetime_as with no overhead.
> https://godbolt.org/z/YdoEcH

But it's "correct" only because your testcase is very simple and thus GCC's knowledge is complete, so it DTRT here.

> My intent wasn't to draw attention to std::start_lifetime_as in this bug
> report, I only mentioned it as the reason I came up with the original code. My
> main focus was intended to be std::launder, which when used in this situation,
> breaks something when it should do nothing at all. My thought was that if it
> breaks something in this situation, it may break something in other situations
> too. Possibly where it is actually required.

It really depends on what the requirements on std::launder are. AFAIK it's an optimization barrier for the pointer.
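As background for the question of what std::launder is required to do, the case usually cited for it in C++17 is replacing an object that has a const member: the old pointer may not be used to reach the new object, and std::launder yields a pointer that may. The following is a minimal sketch of that textbook case, not code from this bug. The struct `X` and the function `read_after_replacement` are hypothetical names, and C++17 or later is assumed for `std::launder`. Later revisions of the standard relaxed the replacement rules so that the call is no longer needed here, but it remains valid.

```cpp
#include <new>

// Hypothetical type with a const member, the classic C++17 launder case.
struct X {
    const int n;
};

int read_after_replacement() {
    X* p = new X{3};
    // Replace the object in the same storage with a new X.
    X* q = new (p) X{5};
    // Under the C++17 rules p does not automatically refer to the new
    // object because of the const member; std::launder(p) does.
    int n = std::launder(p)->n;
    delete q;
    return n;
}
```

In this use std::launder does not create an object or change any type. It only tells the compiler not to assume anything about the pointee based on where the pointer came from, which matches the "optimization barrier for the pointer" description above.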
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #23 from Andrew Downing --- But gcc already can implement std::start_lifetime_as with no overhead. https://godbolt.org/z/YdoEcH My intent wasn't to draw attention to std::start_lifetime_as in this bug report, I only mentioned it as the reason I came up with the original code. My main focus was intended to be std::launder, which when used in this situation, breaks something when it should do nothing at all. My thought was that if it breaks something in this situation, it may break something in other situations too. Possibly where it is actually required.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #22 from Richard Biener ---
(In reply to Richard Smith from comment #20)
> (In reply to Andrew Downing from comment #19)
> > Not that it would make a difference in this particular situation, but is the
> > intent of P0593R6 to only allow implicitly creating an object in the
> > relevant storage location where one hasn't already been implicitly or
> > explicitly created?
>
> No, the new objects are allowed to replace existing objects. For example,
> this implementation would also be correct:
>
> std::uint64_t* s3(double* p) {
>   std::memmove(p, p, sizeof(double));
>   return std::launder(reinterpret_cast<std::uint64_t*>(p));
> }
>
> ... on the basis that it has defined behavior if the memmove implicitly
> creates an 'uint64_t' object in the underlying storage after it (notionally)
> copies the contents elsewhere and before it (notionally) copies the contents
> back again. (The 'launder' is necessary in order to form a pointer to the
> implicitly-created uint64_t object, because p doesn't point to that object.)

Note that in GCC's view, if there's a memcpy in the IL, the memcpy destination has indeterminate type and accesses using any effective type lvalue are well-defined. The issue with the testcase at hand is that GCC elides the memcpy (and the temporary object) completely - which was desired by the testcase author. But that loses this "barrier" from the IL.

Note that GCC implements both the C and the C++ language and performs inter-CU optimization across language barriers, and thus we need to find common grounds of semantics - such as memcpy. So for GCC the argument "this is C++, we don't care for C semantics" isn't productive.

Note GCC also elides memcpy and memmove with identical source/destination even though this technically has barrier semantics.

So IMHO std::start_lifetime_as would need first-class compiler support and be appropriately represented in the IL. Which also means it will have a non-zero overhead, but less overhead than, for example, keeping a memmove (p, p, N) in the IL.

Which leads me towards the suggestion to work on memcpy semantics, since IMHO the C semantics provide the reason for what GCC does to the testcase, and I find it extremely odd (and bad for general optimization that doesn't try to emulate std::start_lifetime_as) to require different semantics.

Note we're already allowing dynamic type changes of objects with a declared type in C since that's also used in practice. Note we're already pessimizing C++ for re-using parts of objects.

So IMHO no GCC bug to see here. And yes, GCC doesn't seem to be able to implement std::start_lifetime_as without larger overhead at the moment.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #21 from Richard Biener ---
(In reply to Andrew Downing from comment #15)
> This is all kind of beside the point anyway though, because gcc is handling
> everything ok except for std::launder. std::launder is only supposed to be
> an optimization barrier, but it's causing the opposite problem. Here,
> std::launder is preventing an optimization that shouldn't be taking place
> from NOT taking place. I've been looking at gcc's code for a while and have
> gotten as far as seeing that the use of std::launder is preventing
> dse_classify_store() in tree-ssa-dse.c from seeing the relationship between
> double d = 3.14159; and _6 here.
>
> // this is right before the tree-dse3 pass (I disabled some passes to
> // prevent constant propagation)
> f1 ()
> {
>   long unsigned int u;
>   double d;
>   long unsigned int * _6;
>
>   [local count: 1073741824]:
>
>   // this line is removed by tree-dse3
>   d = 3.141589988261834005243144929409027099609375e+0;
>
>   _6 = .LAUNDER (&d);
>   u_3 = MEM[(uint64_t *)_6];
>   d ={v} {CLOBBER};
>   return u_3;
>
> }
>
> If the implementation of std::launder in gcc simply disallows optimization
> passes from seeing through it, I think that is a mistake. std::launder being
> an optimization barrier means disallowing checks that enable an optimization
> from seeing through it as well as allowing checks that disable an
> optimization from seeing through it.

Note that when you elide .LAUNDER, nothing really changes in GCC, except that
when it sees this kind of must-alias between a definition and a use, GCC
chooses to do-what-I-mean and allow type punning.

So I don't think the .LAUNDER implementation has an issue. Of course, I never
understood the point of std::launder in the first place, but ...
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #20 from Richard Smith ---
(In reply to Andrew Downing from comment #19)
> Not that it would make a difference in this particular situation, but is the
> intent of P0593R6 to only allow implicitly creating an object in the
> relevant storage location where one hasn't already been implicitly or
> explicitly created?

No, the new objects are allowed to replace existing objects. For example, this
implementation would also be correct:

std::uint64_t* s3(double* p) {
  std::memmove(p, p, sizeof(double));
  return std::launder(reinterpret_cast<std::uint64_t*>(p));
}

... on the basis that it has defined behavior if the memmove implicitly creates
an 'uint64_t' object in the underlying storage after it (notionally) copies the
contents elsewhere and before it (notionally) copies the contents back again.
(The 'launder' is necessary in order to form a pointer to the
implicitly-created uint64_t object, because p doesn't point to that object.)
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #19 from Andrew Downing --- Not that it would make a difference in this particular situation, but is the intent of P0593R6 to only allow implicitly creating an object in the relevant storage location where one hasn't already been implicitly or explicitly created? e.g. could the first memcpy implicitly create a double object in storage? Doing so would result in the same behavior in this situation, I'm not sure if that would be considered more defined. I'm also not sure if there could be other situations where implicitly creating a new object where another object exists would result in more defined, but unintended behavior.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #18 from Richard Smith ---
(In reply to Andrew Downing from comment #17)
> Also none of the behavior described in p0593 is required for this C++
> program to be well defined. All objects that are required to exist here are
> created explicitly. It's not relying on the implicit creation of any
> objects. This is valid C++17 code.

I agree, for what it's worth. I think the only thing that might suggest
otherwise is the wording in the C standard that says that memcpy copies the
effective type, but that doesn't mean anything in C++ (and it's also specified
in the language section of C, not the library section, so isn't part of the
wording that C++ incorporates by reference).

C++ doesn't have any wording that says what value an object has after you
memcpy the representation of a value of a different type over it, but there
isn't any provision for memcpy to change the dynamic type of the object prior
to P0593 (and after P0593, memcpy is only allowed to change the dynamic type if
doing so makes the program's behavior more defined).
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #17 from Andrew Downing ---
Also none of the behavior described in p0593 is required for this C++ program
to be well defined. All objects that are required to exist here are created
explicitly. It's not relying on the implicit creation of any objects. This is
valid C++17 code.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #16 from Richard Smith ---
Per p0593, memcpy implicitly creates objects (of any implicit-lifetime type) in
the destination. It does not propagate the objects in the source memory to the
destination memory, and can therefore be used to perform a bit cast. (This is
different from C, where memcpy either preserves or copies the effective type
depending on whether the destination has a declared type.)

The s3 function in comment #1 looks correct to me (with or without the
launder). Optimizing it to { return (uint64_t *)p; } is incorrect, because it
loses the erasure of dynamic type information that p0593 requires from memcpy
in C++.
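
[Editorial illustration, not part of the original comment.] The "bit cast" use
of memcpy mentioned above can be sketched in its least contentious form:
copying the representation into a separate, already-declared std::uint64_t
object rather than reusing the storage of the double. The helper name bits_of
is hypothetical, and the example values assume that double is IEEE-754
binary64.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the thread): copy the object representation
// of a double into a distinct std::uint64_t object and return its value.
inline std::uint64_t bits_of(double d) {
    static_assert(sizeof(std::uint64_t) == sizeof(double),
                  "a bit cast needs source and destination of equal size");
    std::uint64_t u;                 // separate object with its own declared type
    std::memcpy(&u, &d, sizeof u);   // both types are trivially copyable
    return u;                        // read through the declared type: no aliasing issue
}
```

Because the destination is its own object, no lifetime has to be started or
ended in the storage of the double, so this form does not depend on the
points under dispute in this bug. In C++20, std::bit_cast expresses the same
operation directly.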
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #15 from Andrew Downing ---
(In reply to Richard Biener from comment #10)
> (In reply to Andrew Downing from comment #8)
> > From the C standard:
> > If a value is copied into an object having no declared type using memcpy or
> > memmove, or is copied as an array of character type, then the effective type
> > of the modified object for that access and for subsequent accesses that do
> > not modify the value is the effective type of the object from which the
> > value is copied, if it has one. For all other accesses to an object having
> > no declared type, the effective type of the object is simply the type of the
> > lvalue used for the access.
> >
> > So even using C semantics the effective type of storage and *t should not be
> > changed, because they already have a declared type.
>
> But in your testcases 't' is a pointer and the declared object is not
> visible.
> So the only thing an implementation can do is take advantage of the declared
> type for optimization when it is visible.

That's not exactly relevant in this case though, since this is C++. In C++ new
has special semantics. Regardless of the type of the variable being assigned
to, or the declared type of the object that the pointer argument points to, it
starts the lifetime of an object of the specified type in the storage, and ends
the lifetime of whatever is there. There is no equivalent of this in C. In C
you can change the effective type of an object with allocated storage duration,
since it has no declared type, but you cannot change the effective type of an
object with automatic storage duration, since it has a declared type.

double d; has a declared type.

> Note that for C++ types you can apply memcpy to the placement new is not
> needed since object re-use terminates lifetime of the previous object and
> starts lifetime of a new one. This means that your example can be
> simplified to

memcpy is needed because starting the lifetime of a new object in the storage
of an existing object does not re-use the old object's representation. If the
default trivial constructor is used, the standard explicitly states that the
value of the new object is indeterminate and accessing the value of the object
will result in undefined behavior. This is why p0593r6 section 3.8 mentions
copying the object representation to another location and then copying it back
after placement new. There is no way in C++ to pun as you can in C with a
union, without intermediate steps.

Yes, I expect all these operations to be elided away. They are simply there
because the standard says they have to be. In other compilers they may be
required to ensure well-defined behavior. This isn't code that I'm actually
using anywhere; I was just reading p0593r6, tested their described
implementation of std::start_lifetime_as, and found this strange behavior.
There is an underlying bug here, and if it's popping up in this example, it
will pop up somewhere else eventually.

This is all kind of beside the point anyway though, because gcc is handling
everything ok except for std::launder. std::launder is only supposed to be an
optimization barrier, but it's causing the opposite problem. Here, std::launder
is preventing an optimization that shouldn't be taking place from NOT taking
place. I've been looking at gcc's code for a while and have gotten as far as
seeing that the use of std::launder is preventing dse_classify_store() in
tree-ssa-dse.c from seeing the relationship between double d = 3.14159; and _6
here.

// this is right before the tree-dse3 pass (I disabled some passes to
// prevent constant propagation)
f1 ()
{
  long unsigned int u;
  double d;
  long unsigned int * _6;

  [local count: 1073741824]:

  // this line is removed by tree-dse3
  d = 3.141589988261834005243144929409027099609375e+0;

  _6 = .LAUNDER (&d);
  u_3 = MEM[(uint64_t *)_6];
  d ={v} {CLOBBER};
  return u_3;

}

If the implementation of std::launder in gcc simply disallows optimization
passes from seeing through it, I think that is a mistake. std::launder being an
optimization barrier means disallowing checks that enable an optimization from
seeing through it as well as allowing checks that disable an optimization from
seeing through it.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #14 from rguenther at suse dot de ---
On Fri, 29 May 2020, ed at catmur dot uk wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349
>
> --- Comment #12 from Ed Catmur ---
> (In reply to Richard Biener from comment #11)
> > Note that for C++ types you can apply memcpy to the placement new is not
> > needed since object re-use terminates lifetime of the previous object and
> > starts lifetime of a new one.
>
> Under P0593R6 it has the effect of implicitly creating objects on demand.
> Effectively it is supposed to "curse" the double and "bless" the subsequent
> uint64_t. Invoking P0593 may be jumping the gun since it's still in LWG, but
> Richard (Smith) wants it retroactively applied to C++20 IS as a DR, and that
> could still happen.

I believe such a "curse"/"bless" operation cannot be implemented without
overhead(*), and thus I would not recommend making it apply to all placement
new operations.

> > Note that while your example performs memcpy dances you are probably
> > after a solution that elides all generated code?
>
> Sure, I assume that memcpy of anything smaller than a page will be elided :)

(*) then no longer.

> > Note that I do not believe making your examples work as you intend is
> > possible in an actual implementation without sacrificing all
> > type-based alias analysis.
>
> Ouch. You might be asked to if and when P0593 goes in (again, assuming I've
> understood it correctly). Would it be appropriate to find out what Ville
> thinks?

Definitely.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |richard-gccbugzilla@metafoo
                   |                            |.co.uk

--- Comment #13 from Jonathan Wakely ---
(In reply to Ed Catmur from comment #12)

It's Richard's paper now, not Ville's, so I've CC'd him.

It's unclear whether std::start_lifetime_as is expected to be implementable in
C++ as a normal library function. If there's compiler support then it can do
things the sample implementations shown here can't do.

And start_lifetime_as isn't planned to be treated as a DR for C++20.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #12 from Ed Catmur ---
(In reply to Richard Biener from comment #11)
> Note that for C++ types you can apply memcpy to the placement new is not
> needed since object re-use terminates lifetime of the previous object and
> starts lifetime of a new one.

Under P0593R6 it has the effect of implicitly creating objects on demand.
Effectively it is supposed to "curse" the double and "bless" the subsequent
uint64_t. Invoking P0593 may be jumping the gun since it's still in LWG, but
Richard (Smith) wants it retroactively applied to C++20 IS as a DR, and that
could still happen.

> Note that while your example performs memcpy dances you are probably
> after a solution that elides all generated code?

Sure, I assume that memcpy of anything smaller than a page will be elided :)

> Note that I do not believe making your examples work as you intend is
> possible in an actual implementation without sacrificing all
> type-based alias analysis.

Ouch. You might be asked to if and when P0593 goes in (again, assuming I've
understood it correctly). Would it be appropriate to find out what Ville
thinks?
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #11 from Richard Biener ---
(In reply to Ed Catmur from comment #9)
> (In reply to Jonathan Wakely from comment #4)
> > I don't know the answer, and I don't know why it's useful to try this
> > anyway.
>
> If I'm reading P0593 correctly (I may not be), this would be a valid
> implementation of start_lifetime_as:
>
> template<class T>
> inline T* start_lifetime_as(void* p) {
>     std::byte storage[sizeof(T)];
>     std::memcpy(storage, p, sizeof(T));
>     auto q = new (p) std::byte[sizeof(T)];
>     std::memcpy(q, storage, sizeof(T));
>     auto t = reinterpret_cast<T*>(q);
>     return std::launder(t);
> }
>
> But this has the same issue: https://godbolt.org/z/YYtciP

I think there is no way to pun the dynamic type of an object without altering
its current storage representation. You can do punning via a union, but that
wouldn't change its effective type.

Note that for C++ types to which you can apply memcpy, the placement new is
not needed, since object re-use terminates the lifetime of the previous object
and starts the lifetime of a new one. This means that your example can be
simplified to

template<class T>
inline T* start_lifetime_as(void* p) {
    return reinterpret_cast<T*>(p);
}

easily showing why that cannot be the intention.

Note that while your example performs memcpy dances, you are probably after a
solution that elides all generated code?

Note that I do not believe making your examples work as you intend is possible
in an actual implementation without sacrificing all type-based alias analysis.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #10 from Richard Biener ---
(In reply to Andrew Downing from comment #8)
> From the C standard:
> If a value is copied into an object having no declared type using memcpy or
> memmove, or is copied as an array of character type, then the effective type
> of the modified object for that access and for subsequent accesses that do
> not modify the value is the effective type of the object from which the
> value is copied, if it has one. For all other accesses to an object having
> no declared type, the effective type of the object is simply the type of the
> lvalue used for the access.
>
> So even using C semantics the effective type of storage and *t should not be
> changed, because they already have a declared type.

But in your testcases 't' is a pointer and the declared object is not visible.
So the only thing an implementation can do is take advantage of the declared
type for optimization when it is visible.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #9 from Ed Catmur ---
(In reply to Jonathan Wakely from comment #4)
> I don't know the answer, and I don't know why it's useful to try this anyway.

If I'm reading P0593 correctly (I may not be), this would be a valid
implementation of start_lifetime_as:

template<class T>
inline T* start_lifetime_as(void* p) {
    std::byte storage[sizeof(T)];
    std::memcpy(storage, p, sizeof(T));
    auto q = new (p) std::byte[sizeof(T)];
    std::memcpy(q, storage, sizeof(T));
    auto t = reinterpret_cast<T*>(q);
    return std::launder(t);
}

But this has the same issue: https://godbolt.org/z/YYtciP
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #8 from Andrew Downing ---
From the C standard:
If a value is copied into an object having no declared type using memcpy or
memmove, or is copied as an array of character type, then the effective type of
the modified object for that access and for subsequent accesses that do not
modify the value is the effective type of the object from which the value is
copied, if it has one. For all other accesses to an object having no declared
type, the effective type of the object is simply the type of the lvalue used
for the access.

So even using C semantics the effective type of storage and *t should not be
changed, because they already have a declared type.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #7 from Andrew Downing ---
(In reply to Jonathan Wakely from comment #6)
> (In reply to Andrew Downing from comment #5)
> > Also, I'm not sure if operations that implicitly create
> > objects in storage are allowed to do so if an object has already been
> > explicitly created in that storage (from new).
>
> The lifetime of the object created with new ends as soon as the storage is
> reused for another object. But I'm not sure if copying new bytes to it does
> reuse the storage or not.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html

"We could specify that implicit object creation happens automatically at any
program point that relies on an object existing."

I don't believe operations that implicitly create objects are supposed to do so
if an object has already been implicitly or explicitly created in that storage,
unless that object is a char/unsigned char/std::byte array, since reading those
back out is always valid anyway.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #6 from Jonathan Wakely ---
(In reply to Andrew Downing from comment #5)
> Also, I'm not sure if operations that implicitly create
> objects in storage are allowed to do so if an object has already been
> explicitly created in that storage (from new).

The lifetime of the object created with new ends as soon as the storage is
reused for another object. But I'm not sure if copying new bytes to it does
reuse the storage or not.
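
[Editorial illustration, not part of the original comment.] The uncontested
half of the statement above, that reusing storage with a new-expression ends
the lifetime of the previous object, can be sketched as follows. The function
name reuse_storage and the buffer layout are hypothetical. The open question
raised in the comment (whether merely copying bytes reuses the storage) is
deliberately not exercised here.

```cpp
#include <cstdint>
#include <new>

// Hypothetical demo: one buffer holds first a std::uint32_t, then a float.
inline float reuse_storage() {
    static_assert(sizeof(float) <= 8 && alignof(float) <= 8,
                  "buffer below must be able to hold a float");
    alignas(8) unsigned char buf[8];            // raw storage, large enough for both

    std::uint32_t* u = new (buf) std::uint32_t(42);  // lifetime of a uint32_t begins
    *u += 1;                                         // fine: *u is within its lifetime

    float* f = new (buf) float(1.5f);  // storage reused: the uint32_t's lifetime
                                       // ends and a float's lifetime begins
    // From here on, only f may be used to access the value; reading *u would
    // refer to an object whose lifetime has ended.
    return *f;
}
```

The float is accessed only through the pointer returned by the new-expression
and is given an initial value, so neither std::launder nor the
indeterminate-value problem discussed in this thread comes into play.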
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #5 from Andrew Downing ---
(In reply to Richard Biener from comment #1)
> I think std::launder merely acts as optimization barrier here and without we
> manage to propagate the constant. We still "miscompile" things dependent on
> what exactly the C++ standard says. When seeing
>
> std::uint64_t* s3(double* p) {
>     unsigned char storage[sizeof(std::uint64_t)];
>     std::memcpy(storage, p, sizeof(storage));
>     auto t = new(p) std::uint64_t;
>     std::memcpy(t, storage, sizeof(storage));
>     return t;
> }
>
> the placement new has no effect (it's just passing through a pointer) and
> memcpy has C semantics, transferring the active dynamic type of 'p'
> through 'storage' back to 'p'.
>
> std::uint64_t f3() {
>     double d = 3.14159;
>     return *s3(&d);
> }
>
> ... which is still 'double'. Which you then access via an lvalue of type
> uint64_t which invokes undefined behavior. So in GCCs implementation
> reading of relevant standards you need -fno-strict-aliasing and your program
> is not conforming.
>
> So what goes on is that GCC optimizes s3 to just { return (uint64_t *)p; }
> which makes f3 effectively do
>
>   double d = 3.14159;
>   return *(uint64_t *)&d;
>
> which arguably is bogus. Without the std::launder we are nice to the
> user and "optimize" the above to return the correct value. With
> std::launder we cannot do this since it breaks the pointer flow and
> we'll DSE the initialization of 'd' because it is not used (due to the
> undefinedness in case the load would alias it).

The placement new (in each of s1/s2/s3) shouldn't do nothing though. That line
should create a std::uint64_t object in the storage of the double in each of
f1/f2/f3, and end the lifetime of the double. Physically it does nothing, but
in the C++ object model, it ends the lifetime of the double, and begins the
lifetime of a std::uint64_t. Since the default trivial constructor is used, no
initialization is done, and the std::uint64_t has indeterminate value
(according to the standard). The next line copies the object representation of
the double, which was saved to separate storage, to the std::uint64_t. That is
allowed with std::memcpy because they are both trivially copyable types.

Operations that implicitly create objects only do so if it would give the
program defined behavior. If the first std::memcpy implicitly creates a double
object in storage, no harm done, but if the second std::memcpy implicitly ends
the lifetime of the std::uint64_t pointed to by 't' and begins the lifetime of
a double in its place, that wouldn't give the program defined behavior, because
't' is being returned as a std::uint64_t* and is later dereferenced. Also, I'm
not sure if operations that implicitly create objects in storage are allowed to
do so if an object has already been explicitly created in that storage (from
new).
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349 --- Comment #4 from Jonathan Wakely --- I don't know the answer, and I don't know why it's useful to try this anyway.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #3 from rguenther at suse dot de ---
On Wed, 27 May 2020, redi at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349
>
> --- Comment #2 from Jonathan Wakely ---
> Using
>
>     auto t = new(p) std::uint64_t;
>     std::memcpy(t, std::launder(storage), sizeof(storage));
>     return t;
>
> also prevents GCC from propagating the dynamic type of p to t.

So the language-lawyer question is whether the testcase is valid or not, and
what difference std::launder makes semantically (the dynamic type is still
transferred, just the compiler no longer knows which one it is).
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

--- Comment #2 from Jonathan Wakely ---
Using

    auto t = new(p) std::uint64_t;
    std::memcpy(t, std::launder(storage), sizeof(storage));
    return t;

also prevents GCC from propagating the dynamic type of p to t.
[Bug c++/95349] Using std::launder(p) produces unexpected behavior where (p) produces expected behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95349

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
           Keywords|                            |alias, wrong-code

--- Comment #1 from Richard Biener ---
I think std::launder merely acts as an optimization barrier here, and without
it we manage to propagate the constant. We still "miscompile" things, depending
on what exactly the C++ standard says. When seeing

std::uint64_t* s3(double* p) {
    unsigned char storage[sizeof(std::uint64_t)];
    std::memcpy(storage, p, sizeof(storage));
    auto t = new(p) std::uint64_t;
    std::memcpy(t, storage, sizeof(storage));
    return t;
}

the placement new has no effect (it's just passing through a pointer) and
memcpy has C semantics, transferring the active dynamic type of 'p'
through 'storage' back to 'p'.

std::uint64_t f3() {
    double d = 3.14159;
    return *s3(&d);
}

... which is still 'double'. You then access it via an lvalue of type
uint64_t, which invokes undefined behavior. So in GCC's reading of the relevant
standards you need -fno-strict-aliasing and your program is not conforming.

So what goes on is that GCC optimizes s3 to just { return (uint64_t *)p; }
which makes f3 effectively do

  double d = 3.14159;
  return *(uint64_t *)&d;

which arguably is bogus. Without the std::launder we are nice to the
user and "optimize" the above to return the correct value. With
std::launder we cannot do this since it breaks the pointer flow and
we'll DSE the initialization of 'd' because it is not used (due to the
undefinedness in case the load would alias it).