Yesterday evening I tried prototyping support for ASCII and EBCDIC
(CA and CE) self-defining terms in Assembler expressions. The main
problem turned out to be that the internal representation of the
text of a character self-defining term does not include the
opening and closing quotes, and has only one byte in which to save
the type character. I therefore had to change the code to store
the "A" or "E" type extension and the leading quote as part of
the value, which in turn required special-case logic to suppress
the leading quote at the various places where the value is
substituted back into text. I have not yet attempted to implement
the Unicode type CU for self-defining terms.
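
To make the representation change concrete, here is a small Python
sketch. It is an illustration only: the record layout and the names
StoredTerm, parse_term, render_term and term_text are hypothetical
and are not taken from the HLASM source. It keeps a single type
character plus the text without quotes, folds the "A" or "E"
extension and the leading quote into the value, and suppresses that
stored quote when the term is turned back into text. For simplicity
it assumes the text contains no embedded quotes.

```python
from dataclasses import dataclass


@dataclass
class StoredTerm:
    type_char: str  # exactly one character, e.g. "C" (hypothetical layout)
    value: str      # text without quotes, or extension + "'" + text for CA/CE


def parse_term(source: str) -> StoredTerm:
    """Split a term such as C'01' or CA'01' into the stored form."""
    quote = source.index("'")
    if not source.endswith("'") or quote == len(source) - 1:
        raise ValueError("missing closing quote: " + source)
    prefix, text = source[:quote], source[quote + 1:-1]
    if "'" in text:
        raise ValueError("embedded quotes are outside this sketch")
    if len(prefix) == 1:
        return StoredTerm(prefix, text)
    if len(prefix) == 2 and prefix[0] == "C" and prefix[1] in "AE":
        # No room for a second type byte, so the extension and the
        # leading quote travel inside the value.
        return StoredTerm("C", prefix[1] + "'" + text)
    raise ValueError("unsupported type: " + prefix)


def has_extension(term: StoredTerm) -> bool:
    v = term.value
    return term.type_char == "C" and len(v) >= 2 and v[0] in "AE" and v[1] == "'"


def term_text(term: StoredTerm) -> str:
    """The characters between the quotes, without extension or quote."""
    return term.value[2:] if has_extension(term) else term.value


def render_term(term: StoredTerm) -> str:
    """Substitute the term back into source text."""
    if has_extension(term):
        # The value already supplies the opening quote; do not add another.
        return term.type_char + term.value + "'"
    return term.type_char + "'" + term.value + "'"


if __name__ == "__main__":
    for src in ("C'01'", "CA'01'", "CE'[]'"):
        stored = parse_term(src)
        print(src, repr(stored.value), render_term(stored))
```

The special case in render_term is the kind of logic the paragraph
above refers to: every place that rebuilds the text has to know that
the opening quote may already be part of the value.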

The CA type is translated to ASCII using the same table as is
used to translate ASCII DC values, which is currently a fixed
table that maps a subset of code page 37 to 7-bit printable
ASCII, leaving unchanged any codes that do not map to ASCII.
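
The CA translation described above can be sketched as follows. This
is a minimal Python illustration, not the assembler's actual table:
it derives a 256-byte table from Python's built-in cp037 codec, and
the names build_ascii_table and to_ascii are made up for the
example. Any code point whose code page 37 character is not 7-bit
printable ASCII maps to itself.

```python
def build_ascii_table() -> bytes:
    """Map code page 37 bytes to 7-bit printable ASCII where possible."""
    table = bytearray(range(256))  # identity: unmapped codes stay unchanged
    for code in range(256):
        ch = bytes([code]).decode("cp037")
        if 0x20 <= ord(ch) <= 0x7E:  # printable 7-bit ASCII only
            table[code] = ord(ch)
    return bytes(table)


ASCII_TABLE = build_ascii_table()


def to_ascii(ebcdic: bytes) -> bytes:
    """Translate EBCDIC (code page 37) bytes using the fixed table."""
    return ebcdic.translate(ASCII_TABLE)


if __name__ == "__main__":
    for text in ("0", "01", "[", "[]"):
        ebcdic = text.encode("cp037")
        print(text, ebcdic.hex().upper(), to_ascii(ebcdic).hex().upper())
```

A code such as hex 4A (the cent sign in code page 37) has no 7-bit
ASCII equivalent, so this table leaves it as hex 4A.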

The CE type defines a value which is not translated even when the
TRANSLATE option and COMPAT(TRANSDT) are in effect, as for the
corresponding type on DC.

Here's some sample output from a test:

000000           00000 00030   1 TESTASDT CSECT ,
000000 92F0 1000 00000         2          MVI   0(1),C'0'
000004 9230 1000 00000         3          MVI   0(1),CA'0'
000008 A512 F0F1 0F0F1         4          IILH  1,C'01'
00000C A512 3031 03031         5          IILH  1,CA'01'
000010 A512 F0F1 0F0F1         6          IILH  1,C'01'
000014 A512 F0F1 0F0F1         7          IILH  1,CE'01'
000018 92BA 1000 00000         8          MVI   0(1),C'['
00001C 925B 1000 00000         9          MVI   0(1),CA'['
000020 A512 BABB 0BABB        10          IILH  1,C'[]'
000024 A512 5B5D 05B5D        11          IILH  1,CA'[]'
000028 A512 BABB 0BABB        12          IILH  1,C'[]'
00002C A512 BABB 0BABB        13          IILH  1,CE'[]'
                 00030        14 ASCII_0  EQU   CA'0'
                 00031        15 ASCII_1  EQU   1+CA'0'
                              16          END   ,

The changes (about 60 lines of code plus various comments) are so
far only a prototype with minimal testing, but I think I've shown
that support is feasible, so there's a good chance we will be able
to make it available soon for Assembler expressions. For
consistency, this syntax should also be supported in SETA
expressions, but I haven't yet had time to look at that. And of
course it all needs documentation updates.

Square brackets are another problem, as usual. The square
brackets in the test were in code page 37 (hex BA and BB), which
does not display correctly here in the UK (where they are hex B1
and BB). We normally use code page 1047 for product code anyway,
where square brackets are hex AD and BD, as in TEXT code pages and
the C/370 compiler. I think that the ASCII translation should
probably use code page 819 (ISO 8859-1) rather than 7-bit ASCII as
the target, and should be made sensitive to the existing CODEPAGE
option, which currently only affects Unicode. Also, the CODEPAGE
option should be extended to cover a wider range, including 1047.
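
The idea of targeting code page 819 and taking the source code page
into account can be sketched in Python as below. The function name
make_table and the use of Python codec names are assumptions made
for illustration. latin_1 is Python's codec for ISO 8859-1, which is
what code page 819 denotes; cp037 and cp500 are used only because
Python's standard library is known to ship codecs for them, so a
table for 1047 would have to be supplied separately. The bracket
positions in BRACKETS are the ones quoted above, with 285 being the
UK EBCDIC code page.

```python
def make_table(source_codepage: str, target: str = "latin_1") -> bytes:
    """Build a 256-byte table from an EBCDIC code page to the target."""
    table = bytearray(range(256))  # codes that cannot be mapped stay as-is
    for code in range(256):
        try:
            out = bytes([code]).decode(source_codepage).encode(target)
        except UnicodeError:
            continue
        if len(out) == 1:
            table[code] = out[0]
    return bytes(table)


# Square bracket code points as quoted in the text above.
BRACKETS = {
    "37": bytes.fromhex("BABB"),
    "285": bytes.fromhex("B1BB"),
    "1047": bytes.fromhex("ADBD"),
}

if __name__ == "__main__":
    table37 = make_table("cp037")
    # Translating every variant with a table built for code page 37
    # shows why the table needs to follow the source code page.
    for name, pair in BRACKETS.items():
        print(name, pair.hex().upper(), pair.translate(table37).hex().upper())
```

With 819 as the target, every code point of code page 37 has a
distinct image, so characters such as the cent sign (hex 4A) are
translated rather than passed through unchanged as they are with a
7-bit table.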

Jonathan Scott
HLASM team, IBM Hursley, UK