On 2017-11-02, at 08:22:24, Jonathan Scott wrote:
>
> HLASM support for ASCII, EBCDIC and Unicode character self-defining
> constants (data types CA, CE and CU) has now been implemented in APAR
> PI89365 and the PTFs are now available:
>
> http://www.ibm.com/support/docview.wss?uid=isg1PI89365
>
> This is still based on a fixed code page 37 to 819 translate table.
> One day we hope to get round to making that more general, but that
> would be a much bigger development item.
>
Thanks.
I would suggest, as a manageable development item, that HLASM continue to operate in CP 037 internally, but support input translation from 500, 1047, 819, ... to 037, provided that a bijective translation is available, just as on Linux it already translates input from 819 to 037.

This should have file granularity: if a 1047 member COPYs a 037 member, the copied member should be treated as 037, and processing should revert to 1047 when control returns to the parent. "DC C'['" should produce the same binary output regardless of input code page.

The input code page should be taken from the tagging of UNIX files, a PARM option, or a PRAGMA at the beginning of the individual member.

UTF-8? Ugh. Pretend it's 819 and translate to 037. "DC CA'whatever'" would undo the 819->037 translation and produce proper UTF-8. Indexing into MBCS (particularly UTF-8) character strings is a development item hardly worth addressing. Some Linux utilities shirk the task.

(I suspect there will be counter-proposals.)

-- gil
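
To make the proposal concrete, here is a rough sketch in Python, not HLASM's actual implementation. It assumes Python's standard codecs (`cp037`, `cp500`, `latin_1` for 819) are acceptable stand-ins for the real translate tables; 1047 is left out only because the standard library does not ship it. The helper names `build_table` and `invert` are made up for illustration.

```python
# Hypothetical sketch: translate source bytes into one internal code page.
# Python's standard codecs stand in for HLASM's translate tables.

INTERNAL = "cp037"


def build_table(source: str, target: str = INTERNAL) -> bytes:
    """Return a 256-byte table mapping each source byte to a target byte.

    Raises if any character is unmappable or if the mapping is not
    one-to-one, which is the "bijective translation" precondition.
    """
    table = bytes(range(256)).decode(source).encode(target)
    if len(set(table)) != 256:
        raise ValueError(f"{source} -> {target} is not bijective")
    return table


def invert(table: bytes) -> bytes:
    """Return the inverse of a bijective 256-byte translate table."""
    inverse = bytearray(256)
    for src, dst in enumerate(table):
        inverse[dst] = src
    return bytes(inverse)


FROM_819 = build_table("latin_1")   # ISO 8859-1 (819) -> 037
FROM_500 = build_table("cp500")     # 500 -> 037
TO_819 = invert(FROM_819)           # 037 -> 819, as CA would apply

# "[" has a different code point in 500 and 819, but after input
# translation both arrive at the same internal 037 byte.
bracket_from_500 = "[".encode("cp500").translate(FROM_500)
bracket_from_819 = "[".encode("latin_1").translate(FROM_819)

# UTF-8 input treated as if it were 819: each byte is translated to 037,
# and undoing the translation restores the original UTF-8 byte sequence.
utf8_source = "naïve €".encode("utf-8")
internal = utf8_source.translate(FROM_819)
restored = internal.translate(TO_819)
```

Because the table is a bijection on all 256 byte values, the round trip is lossless even for bytes that are meaningless as 819 characters, which is why pretending UTF-8 is 819 works for CA constants. It does nothing for character counting or indexing, as noted above.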
