Re: [fpc-other] Interpreter speed

2023-04-29 Thread Bernd Oppolzer via fpc-other



On 29.04.2023 at 19:48, geneb via fpc-other wrote:

On Sat, 29 Apr 2023, Bernd Oppolzer via fpc-other wrote:

This may be slightly off-topic, but I can tell you some facts about 
my Stanford Pascal compiler,

which runs

- native on z Mainframe machines (which may count as a RISC machine, 
given the instruction set used)

- emulated by Hercules, which is an emulator of z Mainframes
- by emulating P-Code, which is (a sort of) byte code for Pascal ...
the P-Code (which is pure text, hence portable)
is translated to a byte-code representation before execution; static
linking is also done at this stage




Bernd, is the dialect used by that compiler "standard" Pascal, UCSD 
Pascal, Turbo-compatible, or?


tnx.

g.

The dialect implemented by the New Stanford Pascal compiler is 
"standard" Pascal with some extensions,

for example

- extended const syntax (for structures and arrays etc.)
- support for external procedures (contained in "modules")
- CHAR(n) data type
- string data type
- static definitions
- initializations of variables and statics
- OTHERWISE on CASE
- BREAK, CONTINUE, RETURN
- direct read and write of scalar types (enums)
- with clause on record types
- pointers on variables which are not on the heap
- pointer arithmetic
- many new builtin procedures and functions

A short "language reference" (23 pages), which covers the differences
from "standard" Pascal,

is available from this website: http://bernd-oppolzer.de/job9.htm
see the link at the top of the page, just below the picture.

Kind regards

Bernd

___
fpc-other maillist  -  fpc-other@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-other


Re: [fpc-other] Interpreter speed

2023-04-29 Thread Bernd Oppolzer via fpc-other

On the same machine:

PASCAL1 compiled with Stanford Pascal, running on the Hercules emulator:

pp pascal1
EXEC PASCAL PASCAL1
STATE PASCAL1 PASCAL A
EXEC PASCOMP PASCAL1
    STANFORD PASCAL COMPILER, OPPOLZER VERSION OF 2023.03 

    Compiler Summary 
     No Errors, no Warnings.
      26042 LINE(S) READ,  243 PROCEDURE(S) COMPILED,
      75126 P_INSTRUCTIONS GENERATED,   9.40 SECONDS IN COMPILATION.
EXEC PASC370 PASCAL1
    STANFORD PASCAL POST-PROCESSOR, OPPOLZER VERSION OF 2023.03

     NO ASSEMBLY ERROR(S) DETECTED.
     265999 BYTES OF CODE GENERATED,   8.24 SECONDS IN POST_PROCESSING.
EXEC PASLINK PASCAL1
Ready; T=17.71/27.02 18:17:06

Kind regards

Bernd



On 29.04.2023 at 18:15, Bernd Oppolzer wrote:
This may be slightly off-topic, but I can tell you some facts about my 
Stanford Pascal compiler,

which runs

- native on z Mainframe machines (which may count as a RISC machine, 
given the instruction set used)

- emulated by Hercules, which is an emulator of z Mainframes
- by emulating P-Code, which is (a sort of) byte code for Pascal ...
the P-Code (which is pure text, hence portable)
is translated to a byte-code representation before execution; static
linking is also done at this stage


The times for compiling the compiler (first pass, 26,000 lines) are as
follows:


- native on a z machine: 0.1 seconds
- emulating z with Hercules, using a very old z operating system:
10 to 12 seconds
- similar (10 to 15 seconds) when running on Windows and emulating
the P-Code; this includes the time to translate the P-Code character
representation ... depending on the power of the laptop used etc.,
of course.

This amounts to a factor of roughly 100 for the two emulation
strategies, compared to native execution.


The times are CPU times as reported by the builtin function CLOCK.

Windows example:

c:\work\pascal\work\src>pp pascal1

PCINT (Build 1.0 Jun 15 2022 08:21:21)

    STANFORD PASCAL COMPILER, OPPOLZER VERSION OF 2023.03 

    Compiler Summary 
     No Errors, no Warnings.
      26058 LINE(S) READ,  243 PROCEDURE(S) COMPILED,
      75130 P_INSTRUCTIONS GENERATED,  13.55 SECONDS IN COMPILATION.

*** EXIT Aufruf mit Parameter = 0 ***

HTH, kind regards

Bernd

http://bernd-oppolzer.de/job9.htm


On 28.04.2023 at 09:20, Adriaan van Os via fpc-other wrote:

Out of curiosity — has anybody compared the speed of

1. interpreting a parsed syntax tree, versus
2. interpreting byte code, versus
3. interpreting a RISC CPU ?

Regards,

Adriaan van Os





Re: [fpc-other] Stanford Pascal Compiler successfully ported to Windows, OS/2 and Linux

2016-12-26 Thread Bernd Oppolzer


On 26.12.2016 at 10:31, Alexander Stohr wrote:



On 2016-12-25 at 21:42, Bernd Oppolzer wrote:

Thank you for your feedback.


Thank you for your kind answers.


You're welcome; I'm happy to meet someone who is interested in my work :-)



BTW, I had to remove a sort of self-check from the
Stanford Pascal runtime; it previously checked that pointers only pointed
into the valid range of heap addresses. Because the "traditional" heap
consisted of one contiguous segment, this was very easy to implement.
But I had to remove this check, because pointers can now point to auto
variables, static variables, "new" alloc areas etc. as well.


It would still work for some applications, but those would be few.
The vast majority of real-world projects probably tend to break those bounds.
So if it were an option behind an enable switch, most users would disable it.
For that reason alone it would probably not be worth keeping at all.


The Pascal documents say that storage management was implemented in a
very basic manner, just sufficient for the needs of the compiler, and
that future implementors are free to replace it with more sophisticated
solutions. The heap elements allocated by the compiler are - for
example - all freed when a certain block has been compiled completely;
that is, the internal lists of definitions are freed because they are
no longer needed. I kept the mark/release logic as it was. The
new/mark/release areas are completely different from the alloc/free
areas. The whole mark/release area is allocated at startup and cannot
be enlarged (4 MB minus stack at the moment; this can be configured).
The alloc/free area is allocated on an as-needed basis in 64 k chunks,
limited to ca. 8 to 10 MB due to address range limitations - it could
grow much larger if 31-bit addressing were possible.

The reason I had to add this to the Pascal runtime was that the
original Stanford Pascal runtime only had the functions new, mark, and
release. Release frees all storage that was requested since the last
mark call - but there is no individual "free" for single areas. I
wanted to port an application to Stanford Pascal which required
individual allocs and frees (like C malloc/free), so I decided to add
alloc/free and to leave new/mark/release untouched for the moment,
because the compiler uses it.


So it was designed more like a stack allocator. The allocations were
local to the function or context they were made in, and at some
waypoint (exit/return) all of them were released. That is not a
generic, universal heap design; a generic heap accumulates gaps from
no-longer-used items over time and, if persistent, needs some garbage
collection (time loss!) every now and then. With the design above,
however, growth and shrinkage are determined by the code path.

Under some conditions the growth may vary considerably, but the
shrinkage is always fixed, while being much rarer.
(I sense a slight similarity to the older stack-unwinding concepts in C
exception/resume features.)


I think you were wise to keep those items out of your new code for the
project, while leaving them untouched for the moment in the existing
code where they don't interfere.


Do you see a good chance of using some larger existing code bases and
test suites for verifying the compiler?
Do you have some heap-tracking functionality built in, so that e.g.
"1234 heap bytes lost" is printed at exit?

Is there a debug option for tracking the maximum stack size?
Is there something to detect out-of-bounds writes/accesses on stack or
heap objects? (I am thinking of magic-word fences in between, and of
heap sanity checking.)


The first test for the compiler is always the compiler itself (first
and second pass); it should compile itself again and again and yield
the same results. Over time I collected some 30 test cases, which cover
different areas, especially the new features that I added. I am a big
fan of test-driven development, so I often added new statements and
features which at first led to a compiler error, and then I implemented
them until they worked as they should. Now these test cases are kept
for regression testing.


For the LE heap management, which is a sort of addendum to the Pascal
runtime (the compiler doesn't need it):

there are functions that give statistics on heap usage, at the end of
the process or at any point in time in between;

there are functions that check the heap for integrity (the same checks
as suggested by the IBM paper - the LE heap management technology is a
product of the IBM Watson Research Center; see the presentation link
from some days ago).

I wrote a program (in ANSI C) to check for memory leaks, which works
with the "normal" LE heap management (as provided by IBM); you call
this program twice at different points in time, and it tells you which
areas have been allocated and not freed in the meantime; this was very
helpful when

Re: [fpc-other] Stanford Pascal Compiler successfully ported to Windows, OS/2 and Linux

2016-12-24 Thread Bernd Oppolzer


Am 24.12.2016 um 12:50 schrieb Mark Morgan Lloyd:

On 24/12/16 11:30, Bernd Oppolzer wrote:


chars in
the (character) P-Code file had to be converted to character constants;
all places where the character A - for example - was represented as the
numeric 193 (which is EBCDIC 'A') had to be found and corrected - even
such places where the reference to 193 was not recognizable at first
sight, that is: offsets in branch tables and bit strings representing
sets.


I think you've made creditable progress in a difficult area. What are 
you doing about PRED() and SUCC() as applied to CHARs?


Anybody with any sort of interest in mainframes is going to have to
consider EBCDIC for quite a while, but unfortunately there are still
people who insist that it's flawless. One of our wiki pages has
somebody confirming that EBCDIC has ^, but he then goes on to admit
that it's not in all codepages...



Thank you.

I think about "portability" in a certain way; to make it clear:

of course it is possible to write programs that are not portable
using my "new" compiler.

You mention SUCC and PRED with CHAR; that is a very good example.
These functions are implemented based on the underlying character set;
that means that SUCC('R') is not 'S' on EBCDIC, because there is a gap
between 'R' and 'S' in the EBCDIC code page (six other characters
between 'R' and 'S' which are not alphabetic).

This is a portability problem which appears at the source code level (!)
and cannot be healed by the compiler. It is the same with the C language,
and the sensible programmer has to deal with it if he or she wants
his or her programs to be really portable.

My problems with the Stanford compiler were different: if the compiler
generates code which will not run on a platform with a different code
page - because the branch tables it generates for case statements imply
a certain code page - that is a big problem and has to be fixed. The
compiler implementor has to find a representation (in the P-Code, in
this case) which will work on every platform, that is, one which is
independent of the code page - and which does not prevent the
optimizations done by the later stages of the compiler.

The same goes for the bit-string representation of sets of char; in
this case, the construction of the bit string has to be deferred until
the code page can be determined (P-Code translation or interpretation
time). On Windows etc., the P-Code interpreter "translates" the P-Code
to an internal representation on startup, and that is when the
"portable" representation of set constants (of char) is translated to
the bit-string representation. See my web site for details.

Regarding ^:

"my" compiler supports different representations for the pointer
symbol, and for other critical symbols, too:

^  @  ->   for the pointer symbol (I use -> most of the time)
[   (.   (/   for arrays
{   (*  /*   for comments  ("comment" is supported, too, for historic
reasons)


No problem with EBCDIC. I do the editing on Windows most of the time
and move the files to Hercules using the socket reader.

BTW: you can also find the compiler sources and a history of the
extensions that I have applied over the last months (or years) on the
web site.

Kind regards

Bernd



[fpc-other] Stanford Pascal Compiler successfully ported to Windows, OS/2 and Linux

2016-12-23 Thread Bernd Oppolzer


Hello FPC list,

I would like to inform you that I have ported an improved version
of the Stanford Pascal compiler (a descendant of the Wirth P4
compiler) to Windows, OS/2 and Linux.

Over the past few months, I improved this compiler, which comes from
the IBM mainframe, by adding several features that I needed and found
useful.

Then I started to write a P-Code interpreter on Windows, which
interprets the generated P-Code; while doing this, I discovered
some severe portability issues, which required additional changes
in the compiler, for example: branch tables which were based on
the EBCDIC char sets; difficulties with sets of char or
sets of subranges of chars and so on. For all those problems,
I found portable solutions, that is: extensions or changes
to the P-Code, that eliminated those dependencies on the
code page of the target machine.

Now the compiler runs with identical results on all the target
platforms, including IBM mainframe; the generated P-Code files
may be transferred freely between the platforms. Running them
on the different platforms will yield the same results.

On the mainframe, the P-Code is translated to machine instructions;
on all the other platforms, the P-Code is interpreted. A true
P-Code translator on the non-mainframe platforms may follow later.
Maybe the P-Code-to-machine-code translator from the mainframe
can be used as a starting point; we will see. It does some
significant optimizations which are IMO independent of the
target machine's machine code.

BTW: the P-Code interpreter is written in ANSI-C.

Because I am working with an emulated mainframe (Hercules),
the interpreted compiler on Windows is in fact faster than
the machine-code-based compiler on Hercules (which would not be true
on a real IBM machine, of course). The compiler compiles itself
in 2 to 3 seconds on both platforms (using a laptop running
Windows 10 or the Hercules emulator; the laptop is not brand new).

You find some stories about my efforts of the last years and
months on my web site:

http://bernd-oppolzer.de/job9.htm

There are still some parts of the language missing on Windows etc.,
for example binary files (files which are not files of char)
and modules (that is, linking of separately compiled program units).
These will be added in the coming weeks and months.

If you want to know more about this project, feel free
to contact me offline.

Kind regards,
merry Christmas and a happy new year

Bernd

