On Sun, 9 Jul 2023 21:15:18 +0000, Seymour J Metz wrote:
>UTF-8 is an encoding of Unicode and the size of a character depends on its
>code point. I'm not sure what you mean by a UTF-8 character.
>
I stand corrected. "A sequence of octets valid in UTF-8."
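To make the point about size concrete, here is a minimal sketch that counts the octets in the UTF-8 encoding of a few code points. It assumes a POSIX shell with printf, wc and tr; the function name utf8_len is made up for illustration, and the octets are written as octal escapes so the script itself stays plain ASCII.

```shell
# utf8_len (hypothetical helper): print the number of octets in its argument.
# printf '%b' expands the \0ddd octal escapes; wc -c counts bytes, not characters.
utf8_len() {
    printf '%b' "$1" | wc -c | tr -d ' '
}

utf8_len 'A'                        # U+0041  -> 1
utf8_len '\0303\0200'               # U+00C0  -> 2
utf8_len '\0342\0202\0254'          # U+20AC  -> 3
utf8_len '\0360\0237\0230\0200'     # U+1F600 -> 4
```

The tr call only strips the padding that some wc implementations put in front of the count.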
>UNIX is ASCII-based, and I'm not aware of any new certification requirement to
>support Unicode. However, as a practical matter the marketplace demands it.
>
UNIX branding makes no requirement of ASCII; otherwise z/OS wouldn't have
qualified.
>My understanding is that windows file names are UCS-2 only, and only
>characters in the BMP are allowed.
>
Oh. Linux seems relatively tolerant.
In reply to your earlier question, yes:
$ touch $( printf '\300\n' | iconv -f iso8859-1 -t UTF-8 )
$ ls -l $( printf '\300\n' | iconv -f iso8859-1 -t UTF-8 )
-rw-r--r-- 1 paulgilm wheel 0 Jul 9 15:26 À
$
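For anyone who wants to see which octets end up in that file name, a rough sketch follows. It assumes iconv and od are installed and that iconv accepts the charset name ISO-8859-1 (spellings vary between systems); name_hex is a made-up helper name.

```shell
# name_hex (hypothetical helper): convert the single Latin-1 octet 0xC0 to
# UTF-8 and print the resulting octets as hex, with od's spacing removed.
name_hex() {
    printf '\300' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

name_hex    # prints c380, the two-octet UTF-8 form of U+00C0
```

The trailing newline from the original command is left out here because the command substitution in the touch example strips it anyway.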
Otherwise, I understand that IBM mainframes store Hebrew text backward in files,
so my example containing conventional Hebrew text would appear garbled in
ISPF Edit in a file tagged 1208 and a terminal with 424.
--
gil