On 06/07/2019 12:43, Mutz, Marc via Development wrote:
C++20 is coming along, and it brings a disruptive change, one that far surpasses the C++17 noexcept break: u8"Hello" is now const char8_t[], no longer const char[]. To estimate the amount of breakage this will cause, assuming that using u8"" is good practice today to indicate that a string is in UTF-8, I've tried to have at least QByteArray not break... and failed.
Whether that is actually good practice is questionable: SG16 reports that u8 literals have seen very limited adoption (and I, for one, have not been suggesting their use until the C++2a situation is clarified):
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r2.html
Code surveys have so far revealed little use of u8 literals
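For reference, the type change under discussion can be observed directly. The following is only a minimal sketch: `U8Elem`, `byteCount` and `demoLiteral` are hypothetical names introduced for illustration, and the `__cpp_char8_t` feature-test macro is used so that the same file compiles both before and after the switch to C++20.

```cpp
#include <cassert>
#include <type_traits>

// Element type of a u8 literal under the language mode this file is compiled in.
// decltype(u8"Hello"[0]) is a reference to a const element, so strip both layers.
using U8Elem = std::remove_cv<
    std::remove_reference<decltype(u8"Hello"[0])>::type>::type;

#if defined(__cpp_char8_t)
// char8_t enabled (C++20): u8"Hello" is const char8_t[6].
static_assert(std::is_same<U8Elem, char8_t>::value, "expected char8_t");
#else
// C++11 to C++17, or char8_t disabled: u8"Hello" is const char[6].
static_assert(std::is_same<U8Elem, char>::value, "expected char");
#endif

// Hypothetical legacy function, standing in for any API that takes const char*.
inline int byteCount(const char* s) {
    int n = 0;
    while (s && s[n] != '\0') ++n;
    return n;
}

inline void demoLiteral() {
    assert(byteCount("Hello") == 5);  // plain literals are unaffected
    // byteCount(u8"Hello");          // compiles before C++20; a type error
    //                                // once char8_t is enabled
}
```

Any function with a const char* parameter is in the position of `byteCount` here: existing callers that pass u8 literals stop compiling as soon as char8_t is enabled.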
The initial idea is simple enough: add const char8_t* overloads for const char* functions. This breaks passing nullptr, so you also add std::nullptr_t overloads. This, however, still doesn't fix the case where a 0 is passed. I had expected the std::nullptr_t overload to be a preferred match over the const char[8_t]* ones, but GCC 9.1 disagrees, and tells me it's still ambiguous. So, if GCC is right, we have no way of adapting our API to not break in C++20. So we need to decide what to break: a) using 0 for nullptr, or b) using u8"Hello" at all. The forward-looking choice would be to break (a) and support (b).
In the general case: break passing 0 in place of nullptr. Such code would fail anyhow as soon as one starts adding e.g. overloads taking other pointer types, not specifically char8_t*; and adding overloads has to be acceptable in the general case. Plus: we already have warnings for using 0 as a null pointer constant, and clang-tidy can automate the migration. On the other hand, I'm not sure about MSVC.
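The overload situation discussed above can be sketched as follows. This is an illustration under stated assumptions, not Qt's actual API: `Bytes` is a hypothetical stand-in for a QByteArray-like class, and the char8_t overload is guarded by `__cpp_char8_t` so the file also builds in pre-C++20 modes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a QByteArray-like class; not Qt's actual API.
struct Bytes {
    enum class From { Char, Char8, Null };
    From from;

    Bytes(const char*) : from(From::Char) {}      // the existing overload
#if defined(__cpp_char8_t)
    Bytes(const char8_t*) : from(From::Char8) {}  // the proposed C++20 overload
#endif
    Bytes(std::nullptr_t) : from(From::Null) {}   // added to disambiguate nullptr
};

inline void demoOverloads() {
    Bytes a("Hello");    // array-to-pointer conversion: picks const char*
    Bytes b(nullptr);    // exact match: picks std::nullptr_t
    Bytes c(u8"Hello");  // const char8_t* with char8_t enabled, const char* otherwise
    assert(a.from == Bytes::From::Char);
    assert(b.from == Bytes::From::Null);
    assert(c.from != Bytes::From::Null);
    // Bytes d(0);       // expected to be rejected as ambiguous, matching the
    //                   // GCC 9.1 behaviour reported above
}
```

As far as I can tell, the literal 0 needs a null pointer conversion to reach any of these parameter types, std::nullptr_t included, so none of the overloads ranks better than the others. The same ambiguity would appear with any second pointer overload, which is why breaking 0 looks like the lesser evil.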
In the specific case: are we sure it makes sense to add a char8_t constructor to QByteArray? It currently sits somewhere between a pure "std::byte vector" (e.g. it's used to transmit raw bytes from I/O devices, etc.) and a US-ASCII (?) string (e.g. given some of its APIs, like toUpper()). By no means is it a container of UTF-8 encoded strings, and we shouldn't give the illusion that it is.
Of course there are plenty of other APIs that will need a resolution instead... just to name one: QString::fromUtf8.
My 2 c,
-- 
Giuseppe D'Angelo | [email protected] | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts
_______________________________________________ Development mailing list [email protected] https://lists.qt-project.org/listinfo/development
