Re: [fpc-other] Stanford Pascal Compiler successfully ported to Windows, OS/2 and Linux

2016-12-24 Thread Bernd Oppolzer


Am 24.12.2016 um 12:50 schrieb Mark Morgan Lloyd:

On 24/12/16 11:30, Bernd Oppolzer wrote:


[...] all chars in the (character) P-Code file had to be converted to
character constants; all places where character A - for example - was
represented as numeric 193 (which is EBCDIC 'A') had to be found and
corrected. Even such places where the reference to 193 was not recognized
at first sight, that is: offsets in branch tables and bit strings
representing sets.


I think you've made creditable progress in a difficult area. What are 
you doing about PRED() and SUCC() as applied to CHARs?


Anybody with any sort of interest in mainframes is going to have to 
consider EBCDIC for quite some while, but unfortunately there are 
still people who insist that it's flawless. One of our wiki pages has 
somebody confirm that EBCDIC has ^, but he then goes on to admit that 
it's not in all codepages...



Thank you.

I think about "portability" in a certain way; to make it clear:

of course it is possible to write programs that are not portable
using my "new" compiler.

You mention SUCC and PRED with CHAR; that is a very good example.
These functions are implemented based on the underlying character set;
that means that SUCC('R') is not 'S' on EBCDIC, because there is a gap
between 'R' and 'S' in the EBCDIC code page (eight other code points
between 'R' and 'S', none of which belongs to the alphabet A-Z).
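
To make the gap concrete, here is a minimal sketch in Python (not Pascal, and not part of the compiler). It assumes that Python's built-in "cp037" codec is an acceptable stand-in for the mainframe's EBCDIC code page; the helper names code_point and succ_code are invented for this illustration.

```python
# Sketch: SUCC on CHAR follows the code page, not the alphabet.
# Assumption: Python's "cp037" codec stands in for the EBCDIC code page.

def code_point(ch: str, codec: str) -> int:
    """Return the single-byte code of ch in the given code page."""
    return ch.encode(codec)[0]

def succ_code(ch: str, codec: str) -> int:
    """Code of SUCC(ch): simply the next code point in that code page."""
    return code_point(ch, codec) + 1

for codec in ("ascii", "cp037"):
    r = code_point("R", codec)
    s = code_point("S", codec)
    same = succ_code("R", codec) == s
    print(f"{codec}: 'R'={r} 'S'={s} SUCC('R') is 'S': {same}")
```

On ASCII the codes of 'R' and 'S' are adjacent (82 and 83); on cp037 they are 217 and 226, so a loop that steps through the letters with SUCC visits non-letters on the way.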

This is a portability problem which appears at the source code level (!)
and cannot be healed by the compiler. It is the same with the C language,
and the sensible programmer has to deal with this, if he or she wants to
have his or her programs really portable.

My problems with the Stanford compiler were different; if the compiler
generates code which will not run on a platform using a different code page,
because it generates branch tables when implementing case statements that
imply a certain code page, this is a big problem and has to be fixed. The
compiler implementor has to find a representation (in the P-Code, in this
case) which will work on every platform, that is: which is independent of
the code page - and does not prevent the optimizations done by the later
stages of the compiler.
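
One way to picture such a representation, as a rough sketch only: the case labels stay as characters in the portable form, and the dense branch table is built once the target code page is known. This is Python, not P-Code; the function names, the label strings "L1"/"L2" and the "otherwise" fallback are invented for the illustration, and the real P-Code format is not shown here.

```python
# Sketch: build a case-statement branch table per code page at load time.
# Assumption: labels are kept as characters; targets are hypothetical names.

def build_branch_table(cases: dict, codec: str):
    """Return (lowest code point, dense table) for the given code page.

    Slots without a case label hold None and fall through to 'otherwise'.
    """
    codes = {ch.encode(codec)[0]: target for ch, target in cases.items()}
    low, high = min(codes), max(codes)
    table = [codes.get(c) for c in range(low, high + 1)]
    return low, table

def dispatch(ch: str, low: int, table: list, codec: str) -> str:
    """Look up the branch target for ch, as the generated code would."""
    idx = ch.encode(codec)[0] - low
    if 0 <= idx < len(table) and table[idx] is not None:
        return table[idx]
    return "otherwise"

cases = {"I": "L1", "J": "L2"}
for codec in ("ascii", "cp037"):
    low, table = build_branch_table(cases, codec)
    print(codec, low, len(table), dispatch("J", low, table, codec))
```

With the same two labels, the ASCII table has two slots while the cp037 table needs nine, because 'I' and 'J' are not adjacent in EBCDIC. A table computed on one platform and stored as numeric offsets would therefore branch wrongly on the other.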

The same goes for the bit string representation of sets of char; in this
case, the construction of the bit string has to be deferred until the code
page can be determined (P-Code translation or interpretation time). On
Windows etc., the P-Code interpreter "translates" the P-Code to an internal
representation on startup, and that is the time when the "portable"
representation of set constants (of char) is translated to the bit string
representation. See my web site for details.
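
The deferred construction could look roughly like the following Python sketch. It assumes that the portable form of a set constant is simply the list of member characters and that bit n of the mask stands for code point n; both are assumptions made for the illustration, not the layout the compiler actually uses, and the names set_to_bits and in_set are invented.

```python
# Sketch: turn a portable set-of-char constant into a bit string at load time.
# Assumption: bit n of the mask corresponds to code point n (0..255).

def set_to_bits(members: str, codec: str) -> int:
    """Build the bit mask for the member characters in the given code page."""
    bits = 0
    for ch in members:
        bits |= 1 << ch.encode(codec)[0]
    return bits

def in_set(ch: str, bits: int, codec: str) -> bool:
    """Membership test against a mask built for the same code page."""
    return bool((bits >> ch.encode(codec)[0]) & 1)

vowels = "AEIOU"                            # portable form: the characters themselves
ascii_bits = set_to_bits(vowels, "ascii")   # built when loading on an ASCII system
ebcdic_bits = set_to_bits(vowels, "cp037")  # built when loading on an EBCDIC system

print(ascii_bits != ebcdic_bits)            # the two bit strings differ
print(in_set("E", ebcdic_bits, "cp037"))    # membership still works per platform
```

The point of the sketch is only that the same source-level constant yields different bit strings per code page, so a bit string fixed at compile time cannot be shipped in portable P-Code.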

Regarding ^:

"my" compiler supports different representations for the pointer symbol,
and for other critical symbols, too:

^  @  ->   for the pointer symbol (I use -> most of the time)
[   (.   (/   for arrays
{   (*  /*   for comments  ("comment" is supported, too, for historic reasons)

No problem with EBCDIC. I do the editing on Windows most of the time and
move the files to Hercules using the socket reader.

BTW: you will find the compiler sources and a history of the extensions
that I applied in the last months (or years) on the web site, too.

Kind regards

Bernd

___
fpc-other maillist  -  fpc-other@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-other


Re: [fpc-other] Stanford Pascal Compiler successfully ported to Windows, OS/2 and Linux

2016-12-24 Thread Mark Morgan Lloyd

On 24/12/16 11:30, Bernd Oppolzer wrote:

Hello Mark,

on several occasions, I looked what FPC does, when I extended the
Stanford compiler, for example when I added support for direct write
of scalars.

At one time I recall that I decided explicitly to take another direction;
that was when I allowed shorter string constants to be assigned to
longer ones, for example:

var x: array [1 .. 200] of char;

x := 'some string';

IIRC, FPC fills with hex zeroes, but I prefer blanks - the blank
representation of the target system ... which is different on the target
systems; this should show to some of the readers here who are not familiar
with IBM mainframes some of the difficulties I had to get the P-Code really
portable ... all


I think the issue of padding partially-initialised data structures is 
something that merits wider discussion. Provided of course that we can 
avoid the sort of arcana that Paul/Kerravon is enmired in :-)



chars in
the (character) P-Code file had to be converted to character constants; all
places where character A - for example - was represented as numeric 193
(which is EBCDIC 'A') had to be found and corrected. Even such places where
the reference to 193 was not recognized at first sight, that is: offsets in
branch tables and bit strings representing sets.


I think you've made creditable progress in a difficult area. What are 
you doing about PRED() and SUCC() as applied to CHARs?


Anybody with any sort of interest in mainframes is going to have to 
consider EBCDIC for quite some while, but unfortunately there are still 
people who insist that it's flawless. One of our wiki pages has somebody 
confirm that EBCDIC has ^, but he then goes on to admit that it's not in 
all codepages...


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]