https://issues.dlang.org/show_bug.cgi?id=23186
--- Comment #4 from Richard Cattermole <[email protected]> ---
(In reply to Dennis from comment #3)
> (In reply to Richard Cattermole from comment #2)
> > No, this isn't an ABI thing, it's about encodings.
>
> I don't follow, do you have a reference for me? I'm looking at:
>
> https://en.wikipedia.org/wiki/UTF-16
>
> "Each Unicode code point is encoded either as one or two 16-bit code units.
> How these 16-bit codes are stored as bytes then depends on the 'endianness'
> of the text file or communication protocol."
>
> The `wchar` type is an integer, the 16-bit code. No integral operations on a
> `wchar` reveal the endianness, only once you reinterpret cast 'the text
> file' (a `ubyte[]`) will endianness come up, but at that point I think it's
> no different than casting a `ubyte[]` to a `ushort[]`. We don't have BE and
> LE `short` types either.

Indeed. With integers you kinda expect the byte order to match the CPU's endianness, but you cannot assume the same for UTF-encoded text (hence we should document it).

> > However, it can be kept pretty simple something like `Unicode 8-bit code
> > point with matching target endian`.
>
> There's no endian difference for 8-bit code points, or are we talking about
> bit order instead of byte order?

That should have been UTF-16 or UTF-32, but it's the same point.

--
