Hans-Peter Diettrich wrote:
Sven Barth wrote:
On 30.01.2012 20:31, steve smithers wrote:
Hans-Peter Diettrich wrote on Mon, 30 Jan 2012 17:40:27 +0100
Existing source code frequently assumes ASCII encoding. The obvious cases
are upper/lowercase conversions that AND/OR a mask with, or add/subtract a
constant from, the character code. It will be hell to find and fix all
such code in the compiler and RTL, even if only the constants have to be
modified for EBCDIC. Even code that assumes an order of common characters
(' ' < '0' < 'A' < 'a') has to be found and fixed manually - how would you
even *find* code with such implicit assumptions?
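
To make the two assumptions above concrete, here is a minimal sketch that compares ASCII with one EBCDIC variant. It is written in Python only because its standard library ships a "cp037" codec (EBCDIC US/Canada); the helper names code, case_offset and order_holds are made up for this illustration, and other EBCDIC code pages may differ in detail.

```python
# Sketch: check two common ASCII assumptions against EBCDIC code page 037.
# The function names are hypothetical helpers, not part of any RTL.

def code(ch, encoding):
    """Return the single-byte code of ch in the given encoding."""
    return ch.encode(encoding)[0]

def case_offset(encoding):
    """Signed distance from 'a' to 'A' (what a case conversion would add)."""
    return code('A', encoding) - code('a', encoding)

def order_holds(encoding):
    """Check the assumed order ' ' < '0' < 'A' < 'a'."""
    codes = [code(ch, encoding) for ch in " 0Aa"]
    return codes == sorted(codes)

for enc in ("ascii", "cp037"):
    print(enc, case_offset(enc), order_holds(enc))
```

In ASCII the uppercase letter sits 32 below the lowercase one and the assumed order holds; in cp037 it sits 64 above and the order does not hold, so both the constant and the direction of such conversions would have to change.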

It does indeed. I am aware of the problems inherent in this. But the RTL has to be more or less rewritten anyway to support OS. OS is a very different
animal to Windows or Linux.

The RTL consists of two parts (though the border is not easily visible): a platform independent one and a platform dependent one. A port to a different target normally only involves touching the platform dependent one, but a port to 370 also requires touching the platform independent one. This is what DoDi is talking about.

It's not anything the compiler could solve. Find out what will happen on e.g.
  for c := 'A' to 'Z' do ...
  for c := '0' to 'Z' do ...
(where the literals 'A' etc. could be named constants, or computed values)

With EBCDIC encoding the second loop will never be entered!
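
As a rough check of the two loops, the following sketch models a Pascal-style "for c := first to last" as an iteration over byte codes. It assumes EBCDIC code page 037 as provided by Python's "cp037" codec; char_range is a hypothetical helper written for this example, not compiler or RTL code.

```python
# Sketch: which characters would a Pascal-style character loop visit
# if Char were ordered by EBCDIC (cp037) codes instead of ASCII codes?

def char_range(first, last, encoding="cp037"):
    """Characters visited by `for c := first to last` under `encoding`."""
    lo = first.encode(encoding)[0]
    hi = last.encode(encoding)[0]
    # An empty range (lo > hi) means the loop body is never entered.
    return [bytes([b]).decode(encoding) for b in range(lo, hi + 1)]

digits_to_z = char_range('0', 'Z')   # '0' is 0xF0, 'Z' is 0xE9 in cp037
a_to_z = char_range('A', 'Z')        # 'A' is 0xC1, 'Z' is 0xE9 in cp037
non_letters = [c for c in a_to_z if not ('A' <= c <= 'Z')]

print(len(digits_to_z), len(a_to_z), len(non_letters))
```

Under these assumptions the '0' to 'Z' range is empty, while the 'A' to 'Z' range does run but visits 41 codes, 15 of which are not letters, because the EBCDIC alphabet is not contiguous. So the first loop is a problem too, only a quieter one.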

@other devs: Could the code page aware AnsiString type be of any help here?

Only on the I/O side, when files are read/written, or when strings (filenames!) are sent or received via the OS API. The latter reminds me of the Windows OEM charset, used in console I/O, which could be repurposed to mean EBCDIC on IBM consoles.

Unfortunately the Encoding is available only with *strings*, not with single characters. New types like EBCDICchar, distinct from AnsiChar, could be introduced, together with a directive telling the compiler "literals are EBCDIC" or "Char is EBCDICchar".

I'd suggest that the thing to do is to first target the compiler at Linux, i.e. ASCII, hosted on a PC. Once that is working adequately, branch the RTL for EBCDIC, with the intention that this is basically a set of conversion patches and that the master remains ASCII.

Or if this isn't acceptable because the IBM developers feel we're trying to force them into our image, let's meet halfway and use Solaris, which nobody really enjoys.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
