Re: Is HLASM efficient WAS: Telum and SpyreWAS: Vector instruction performance

Jonathan Scott Thu, 28 Aug 2025 10:38:48 -0700

I agree that the design of the MACROCASE option was confusing, but it was too 
late to change it by the time I got involved!

ASCII and Unicode constants only support characters which can be entered in the 
current EBCDIC SBCS input code page.  If the input code page is CECP, that is 
essentially the same character set as the first 256 code points of Unicode (and 
ISO 8859-1), including the usual western European accented characters.  If it 
is a Euro code page, the Euro symbol is also supported.  If it is the Latin-9 
code page 924, some other European accented letters are supported.
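
As a rough illustration of the point about CECP and Euro code pages, here is a 
small Python sketch.  It uses Python's own built-in EBCDIC codecs (cp037 and 
cp500 as CECP examples, cp1140 as the Euro variant of 037), which I am assuming 
are a fair stand-in for the tables HLASM uses; it is not HLASM code, and Python 
has no codec for 924, so that case is not shown.

```python
# Sketch only: uses Python's built-in EBCDIC codecs as a stand-in for the
# assembler's conversion tables (an assumption, not HLASM behaviour).
ALL_BYTES = bytes(range(256))
LATIN1 = set(range(256))  # first 256 Unicode code points (ISO 8859-1 range)


def code_points(codec):
    """Return the set of Unicode code points reachable from one EBCDIC byte."""
    return {ord(ch) for ch in ALL_BYTES.decode(codec)}


cecp_037 = code_points("cp037")
cecp_500 = code_points("cp500")
euro_1140 = code_points("cp1140")

# A CECP code page is a rearrangement of the first 256 Unicode code points.
print("cp037 covers exactly U+0000..U+00FF:", cecp_037 == LATIN1)
print("cp500 covers exactly U+0000..U+00FF:", cecp_500 == LATIN1)

# The Euro code page adds U+20AC in place of one Latin-1 character.
print("added by cp1140:", sorted(hex(cp) for cp in euro_1140 - LATIN1))
```
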

The Unicode representation for generated constants is selected using the 
UNICODE option to specify the corresponding code page number and the 
CODEPAGE(LOCAL) option to specify that conversion from EBCDIC should use 
standard internal tables.  The following are supported:

UTF-16BE: 1200
UTF-16LE: 1202
UTF-8: 1208

Examples:

CU'é' (e with acute accent) with UNICODE code page 1200 gives x'00E9', with 
1202 gives x'E900' and with 1208 gives x'C3A9'.

CU'€' (Euro) with a Euro EBCDIC code page and UNICODE code page 1200 gives 
x'20AC', with 1202 gives x'AC20' and with 1208 gives x'E282AC'.
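
The byte values in these examples can be cross-checked outside the assembler.  
The following Python sketch simply encodes the two characters with Python's 
standard codecs; the mapping from the code page numbers 1200, 1202 and 1208 to 
Python codec names is my own assumption for illustration.

```python
# Sketch only: cross-checks the example byte values with Python's codecs.
# The dictionary below pairs each UNICODE code page number with the Python
# codec name assumed to correspond to it.
ENCODINGS = {1200: "utf-16-be", 1202: "utf-16-le", 1208: "utf-8"}


def encode_hex(ch, codepage):
    """Return the encoded bytes of one character as upper-case hex."""
    return ch.encode(ENCODINGS[codepage]).hex().upper()


E_ACUTE = "\u00e9"  # e with acute accent
EURO = "\u20ac"     # Euro sign

for name, ch in (("e-acute", E_ACUTE), ("Euro", EURO)):
    for codepage in ENCODINGS:
        print(f"{name} with {codepage}: x'{encode_hex(ch, codepage)}'")
```
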

There are currently no supported characters which need more than one UTF-16 
code unit, so there is no issue with surrogate pairs.

The implementation of UTF-8 was particularly tricky because the current EBCDIC 
and UNICODE options in effect may affect the number of output bytes for a given 
input byte.  This means that if a DC for a CU-type constant gets deferred, the 
assembler must keep track of the EBCDIC and UNICODE options which were in 
effect for that statement and use them for any subsequent retry.
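
To show why the output length depends on the options, here is a small Python 
sketch.  It again uses Python's cp037 and cp1140 codecs as assumed stand-ins 
for two EBCDIC input code pages, and utf8_length is a hypothetical helper, not 
anything in HLASM.  The same input byte x'9F' is the currency sign in 037 but 
the Euro sign in 1140, and those need different numbers of UTF-8 bytes.

```python
# Sketch only: the same EBCDIC input byte can need a different number of
# UTF-8 output bytes depending on the input code page in effect.
def utf8_length(ebcdic_byte, ebcdic_codec):
    """Number of UTF-8 bytes produced for one EBCDIC input byte."""
    ch = bytes([ebcdic_byte]).decode(ebcdic_codec)
    return len(ch.encode("utf-8"))


for byte, codec in ((0xC1, "cp037"), (0x51, "cp037"),
                    (0x9F, "cp037"), (0x9F, "cp1140")):
    print(f"x'{byte:02X}' in {codec}: {utf8_length(byte, codec)} UTF-8 byte(s)")
```

With UTF-16 every supported character takes exactly two bytes, so the length 
of the constant does not depend on which characters it contains.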

(As I'm now retired, I no longer have access to IBM internal information, so 
some of the above is from memory, but I hope I remembered it correctly.)

Jonathan Scott

-----Original Message-----
From: IBM Mainframe Assembler List <ASSEMBLER-LIST@LISTSERV.UGA.EDU> On Behalf 
Of Paul Gilmartin
Sent: 28 August 2025 17:58
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU
Subject: Re: Is HLASM efficient WAS: Telum and SpyreWAS: Vector instruction 
performance

On 8/28/25 02:20, Jonathan Scott wrote:
> HLASM itself contains a lot of mixed-case assembler source, and the HLASM 
> operating system interfaces for MVS, CMS and Linux are mostly written in 
> mixed-case PL/X.  There are indeed some limitations on macro keyword values, 
> but an increasing proportion of macros have been coded or modified to support 
> lower case values.
>     ...
I consider MACROCASE to be a design blunder.  For all other options, 
COMPAT(option) provides behavior compatible with Assembler H; COMPAT(NO option) 
provides incompatible behavior.  However, for the *SAME* source code,
COMPAT(MACROCASE) provides incompatible behavior; for compatible behavior, 
COMPAT(NOMACROCASE) is necessary.

> Data type CU is Unicode, which has nothing to do with upper case.  A macro 
> can convert a string to upper case using the UPPER built-in function.
>     ...
How does that work? can I code something as simple as:
     DC  CU'π' and get the value of x'cf80' for CCSID 1209?

--
Thanks,
gil
