Hi everyone, I've been watching this discussion.

On Fri, 2020 May 1 18:52-04:00, Bruno Haible wrote:
>
> Yes, this is unlikely. In a world where people routinely do a "git pull" from
> upstream repositories and send patches or pull requests upstream, every
> automated downstream manipulation of the source code - even as small as
> transforming CR/LF to LF - becomes a PITA.

Much agreed. For what it's worth, I'll mention the following points:

* XLC on z/OS does not appear to support u8"..." strings, either in my
  tests or in the documentation I've searched. The most I can confirm
  is support for u"..." (UTF-16) and U"..." (UTF-32) literals.

* When source code is brought into a z/OS system for compilation, it
  is typically blanket-converted to e.g. IBM-1047 (which maps
  one-to-one to Latin-1) as the first step. Same for scripts and other
  files (binary blobs become a headache, yes). It is possible to coerce
  XLC to compile C source in ASCII encoding, but this never happens in
  practice, because the shell/make interpreters will choke on ASCII
  input well before that point.

* UTF-8 characters in a source file are awkward anyway, because the
  z/OS user environment itself does not support multibyte encodings.
  The typical (EBCDIC) encodings used are all single-byte. UTF-EBCDIC
  exists but it is not a thing on z/OS.

* The general assumption is that programs running on z/OS may process
  UTF-8 data (multibyte functions are provided, iconv knows about
  UTF-8, etc.), but their interaction with the user environment is
  entirely through a single-byte encoding.

* Obviously, the set of users who interact with a mainframe directly
  through a Unix shell is very small, which is why encoding support in
  the z/OS user environment feels like a throwback to 1999.

* I'm not aware of many cases where string-literal encodings have been
  an issue on z/OS; the immediate example that comes to mind is
  gnulib/tests/test-iconv-utf.c, which requires test strings to be
  ASCII-encoded. You can see the use of XLC's "#pragma convert()"
  there. But routine scenarios, like getopt() option letters, don't
  need to do anything special to work as intended.

* If there are any tricky encoding-related issues you are trying to
  solve, I'm of course happy to try out proposed solutions :-)


--Daniel


-- 
Daniel Richard G. || sk...@iskunk.org
My ASCII-art .sig got a bad case of Times New Roman.