Nordlöw:
I believe defining a complete random sampling of all code units in dchar is a good start, right? This can then be reused to lazily convert while filling in a string or wstring.
Several combinations of Unicode characters are not meaningful or valid (like pairs of ligatures). Anything that has to work correctly with Unicode is complex.
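
For illustration only, here is a minimal sketch of the sampling idea quoted above. It assumes that "complete" means every Unicode scalar value (all code points except the surrogates U+D800..U+DFFF) is equally likely. The helpers randomScalar and randomDstring are hypothetical names, not part of Phobos. The sketch only guarantees that each dchar is individually valid; it does nothing about combinations that are not meaningful.

```d
import std.algorithm.comparison : equal;
import std.conv : to;
import std.random : Random, uniform;
import std.utf : byChar, byWchar, isValidDchar;

/// Hypothetical helper (not in Phobos): one uniformly random Unicode scalar value.
dchar randomScalar(ref Random rng)
{
    // 0x110000 code points minus the 0x800 surrogates U+D800..U+DFFF.
    uint v = uniform(0u, 0x110000u - 0x800u, rng);
    if (v >= 0xD800)
        v += 0x800; // jump over the surrogate block
    return cast(dchar) v;
}

/// Hypothetical helper: n random scalar values collected into a dstring.
dstring randomDstring(size_t n, ref Random rng)
{
    auto buf = new dchar[n];
    foreach (ref c; buf)
        c = randomScalar(rng);
    return buf.idup;
}

void main()
{
    auto rng = Random(42); // fixed seed so runs are repeatable
    dstring d = randomDstring(16, rng);

    // Every sampled code point is valid on its own.
    foreach (c; d)
        assert(isValidDchar(c));

    // Eager conversions to UTF-8 and UTF-16.
    string s = d.to!string;
    wstring w = d.to!wstring;

    // Lazy conversions: byChar/byWchar produce code units on demand.
    assert(equal(d.byChar, s.byChar));
    assert(equal(d.byWchar, w.byWchar));

    // Both encodings round-trip back to the original code points.
    assert(s.to!dstring == d && w.to!dstring == d);
}
```

Sampling in dchar first keeps the generator simple, since the UTF-8 and UTF-16 encoders then take care of multi-byte sequences and surrogate pairs. Filtering out meaningless sequences would need a separate layer on top of this.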
Bye, bearophile