Legacy bidi is a can of worms.

________________________________________
From: IBM Mainframe Discussion List <[email protected]> on behalf of 
Paul Gilmartin <[email protected]>
Sent: Sunday, July 9, 2023 5:45 PM
To: [email protected]
Subject: Re: Editor (was: Code Page for dataset names)

On Sun, 9 Jul 2023 21:15:18 +0000, Seymour J Metz wrote:

>UTF-8 is an encoding of Unicode and the size of a character depends on its 
>code point. I'm not sure what you mean by a UTF-8 character.
>
I stand corrected.  "A sequence of octets valid in UTF-8."
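
To make the point about size depending on code point concrete, here is a
minimal sketch that counts the octets in a few UTF-8 sequences. The helper
name utf8_octets is hypothetical, not something from this thread, and the
sequences are written as printf octal escapes so the script does not depend
on the terminal's own encoding.

```shell
# Count the octets in a UTF-8 sequence supplied as printf octal escapes.
# utf8_octets is a hypothetical helper introduced for illustration only.
utf8_octets() {
    # printf expands the octal escapes; wc -c counts bytes, not characters.
    # tr strips the leading padding that some wc implementations emit.
    printf "$1" | wc -c | tr -d ' '
}

utf8_octets '\101'              # U+0041 LATIN CAPITAL LETTER A
utf8_octets '\303\200'          # U+00C0 LATIN CAPITAL LETTER A WITH GRAVE
utf8_octets '\342\202\254'      # U+20AC EURO SIGN
utf8_octets '\360\237\230\200'  # U+1F600, a code point outside the BMP
```

The four calls print 1, 2, 3 and 4 respectively: one character each, but a
different number of octets depending on the code point.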

>UNIX is ASCII-based, and I'm not aware of any new certification requirement to 
>support Unicode. However, as a practical matter the marketplace demands it.
>
UNIX branding does not require ASCII; otherwise z/OS wouldn't have
qualified.

>My understanding is that Windows file names are UCS-2 only, and only 
>characters in the BMP are allowed.
>
Oh.  Linux seems relatively tolerant.

In reply to your earlier question, yes:
    1736 $ touch $( printf '\300\n' | iconv -f iso8859-1 -t UTF-8 )
    1737 $ ls -l $( printf '\300\n' | iconv -f iso8859-1 -t UTF-8 )
    -rw-r--r--  1 paulgilm  wheel  0 Jul  9 15:26 À
    1738 $
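
For anyone who wants to see what the iconv step in the transcript above
produces, the following sketch dumps the resulting octets in hex. It assumes
iconv and od are installed; the variable name name_octets is hypothetical.

```shell
# Convert the single ISO 8859-1 octet 0xC0 (octal 300) to UTF-8 and
# capture the result as a hex string. name_octets is a hypothetical name.
name_octets=$(printf '\300' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

# One octet in ISO 8859-1 becomes two octets in UTF-8.
echo "$name_octets"
```

This prints c380, the two-octet UTF-8 form of U+00C0, which is the name
handed to touch and ls in the session.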

Otherwise, I understand that IBM mainframes store Hebrew text backward in files,
so my example containing conventional Hebrew text would appear garbled in
ISPF Edit in a file tagged 1208 and a terminal with 424.

--
gil

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
