Re: [polyml] How to build a new backend?

2023-11-24 Thread David Matthews
The build system uses PolyML.make, not the usual make system.  There is a
short description of it here:
https://polyml.org/documentation/Reference/PolyMLMake.html .  The
backends are built as a result of this line in
mlsource/MLCompiler/CodeTree/ml_bind:

structure GCode = GCode
This looks for a file called GCode.<ext>, where <ext> is an extension such
as .ML or .sml .  An undocumented feature is that it first looks for
files using the current architecture, as returned from the RTS by
PolyML.architecture(), after converting this to lower case.  So on the
X86_64 it finds GCode.x86_64.ML and uses that.  There are various
GCode.<arch>.ML files for the different architectures.  To add a new
architecture you need to add a new string to poly_dispatch_c and the
appropriate GCode.foo.ML file.
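
As a rough illustration of the lookup order described above, here is a
small Python sketch.  It is not the PolyML.make implementation: the
function name find_source, the extension list and the ordering of the
candidates are assumptions made for the example, and the real logic may
differ in detail.

```python
from pathlib import Path
import tempfile

# Assumed extension list; the description only says "such as .ML or .sml".
EXTENSIONS = [".ML", ".sml"]


def find_source(directory, name, architecture):
    """Return the first matching file name for `name`, or None.

    Architecture-specific names (e.g. GCode.x86_64.ML) are tried before
    the plain ones (e.g. GCode.ML), with the architecture lower-cased.
    """
    arch = architecture.lower()
    candidates = [f"{name}.{arch}{ext}" for ext in EXTENSIONS]
    candidates += [f"{name}{ext}" for ext in EXTENSIONS]
    present = {p.name for p in Path(directory).iterdir()}
    for candidate in candidates:
        if candidate in present:
            return candidate
    return None


# Demonstration in a scratch directory with made-up files.
with tempfile.TemporaryDirectory() as scratch:
    for file_name in ("GCode.ML", "GCode.x86_64.ML"):
        Path(scratch, file_name).touch()
    print(find_source(scratch, "GCode", "X86_64"))  # GCode.x86_64.ML
    print(find_source(scratch, "GCode", "Beam"))    # GCode.ML (fallback)
```

In this sketch an architecture with no file of its own simply falls back
to the plain name, if one exists.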


A further complication is that Poly/ML does not work in the same way as
a more conventional compiler, which compiles a file and writes the object
code for that file to the file system.  Instead the compilation process
builds a data structure, much of which is executable code but which also
includes other values.  At the end of the build process the structure is
"exported" and that writes the object file.  As PolyML.make runs, each
expression is compiled and immediately evaluated, meaning that some of
the code produced by the compiler is run immediately.  Obviously if the
evaluation produces a function, that function is only evaluated when it
is actually called.  This makes conventional cross-compilation difficult
or impossible.  The bootstrap process, which starts with interpreted
code and ends up with machine code, has to work around this.  It does so
by first building a version of the interpreted code that has additional
instructions in the code.  These instructions are machine instructions
on the target architecture that switch to the interpreter but are
treated as no-ops by the interpreter itself.  In this way, during the
next stage of bootstrap, machine code functions can call interpreted code
functions and vice versa.  When the bootstrap is complete all the
interpreted code is discarded.
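
The compile-and-evaluate-immediately behaviour can be modelled with a toy
in Python.  This is only an analogy, not how Poly/ML is implemented: the
names build, declarations and image are invented for the example, and a
Python dictionary stands in for the data structure that would be
exported at the end.

```python
def build(declarations):
    """Compile and immediately evaluate each declaration in order.

    The environment that accumulates plays the role of the structure
    that is exported at the end; no per-file object code is written.
    """
    env = {}
    for source in declarations:
        code = compile(source, "<build>", "exec")  # compile one declaration
        exec(code, env)                            # ...and run it at once
    # Drop the entry that exec() adds so only our own bindings remain.
    return {k: v for k, v in env.items() if k != "__builtins__"}


declarations = [
    "table = [n * n for n in range(4)]",    # evaluated during the build
    "def lookup(i):\n    return table[i]",  # body runs only when called
    "answer = lookup(3)",                   # the build calls code it just built
]

image = build(declarations)
print(sorted(image))    # ['answer', 'lookup', 'table']
print(image["answer"])  # 9
```

The third declaration is the awkward case for cross-compilation: the
build itself has to run code that was compiled a moment earlier, so that
code must be executable on the machine doing the build.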


David

___
polyml mailing list
polyml@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

Re: [polyml] How to build a new backend?

2023-11-23 Thread Andrei Formiga
Hi David,

Thank you for your answer. You're right - I have to understand a lot more
in order to be able to create a new backend. I may have many more
questions.

I guess the first one is: for a rebuild of the compiler (from the last
bootstrap stage, not from scratch), how does the build system find out
which files to compile, and in what order?

-- 
[]s, Andrei Formiga

Re: [polyml] How to build a new backend?

2023-11-22 Thread David Matthews

Hi Andrei,
It would be interesting to have another back-end but I really don't 
think what you are suggesting is feasible.  There are currently three 
back-ends: native code for the X86(32/64), native code for the ARM64 and 
byte code.  The byte code is interpreted by part of the run-time system 
and is used on architectures other than the X86 and ARM64 but it is also 
used during the initial bootstrap on the X86 and ARM64.


Apart from a small amount of architecture-specific code, and of course 
the interpreter in C++ for the byte code, all these back-ends make use 
of the same run-time system support.  The run-time system is intimately 
bound up with the ML part of the system.  They share a common view of 
how values are represented: short integers are tagged, addresses are not 
tagged, strings have a length word followed by byte data etc.  Any new 
back-end has to maintain these representations.  Before you even think 
about writing a new back-end you need to understand how all this works.
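
To make the representation points a little more concrete, here is a
small Python sketch of tagged short integers and a length-prefixed
string.  The details are assumptions for illustration and should be
checked against the run-time system sources: a 64-bit little-endian
word, the low bit as the tag, an untagged length word, and zero padding
up to a word boundary.  The helper names are invented.

```python
import struct

WORD_BYTES = 8  # assumption: 64-bit machine word


def tag_int(n):
    """Encode a short integer as a tagged word (low bit set)."""
    return (n << 1) | 1


def untag_int(word):
    """Recover the integer from a tagged word (arithmetic shift)."""
    return word >> 1


def is_tagged(word):
    """Tagged integers have the low bit set; aligned addresses do not."""
    return word & 1 == 1


def encode_string(text):
    """Lay out a string as a length word followed by its bytes."""
    data = text.encode("latin-1")
    padding = -len(data) % WORD_BYTES  # assumed padding to a whole word
    return struct.pack("<Q", len(data)) + data + b"\x00" * padding


def decode_string(blob):
    """Read the length word, then take that many bytes."""
    (length,) = struct.unpack_from("<Q", blob)
    return blob[WORD_BYTES:WORD_BYTES + length].decode("latin-1")


print(tag_int(5))                 # 11
print(untag_int(tag_int(-3)))     # -3
print(is_tagged(0x1000))          # False: looks like an aligned address
print(len(encode_string("abc")))  # 16: one length word plus one data word
```

Whatever the exact layout, generated code and the run-time system (for
example the garbage collector) have to agree on it, which is why a new
back-end cannot choose its own representations.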


David

On 22/11/2023 18:39, Andrei Formiga wrote:

Hello,

Let's say I want to create a new backend for Poly/ML, generating code for a
virtual machine, for example the Erlang BEAM Virtual Machine. The backend
would not depend on the C++ runtime, because the runtime would be the
virtual machine.


From looking at the source code, the backend should generate target code
from the CodeTree. I could add a new directory to MLCompiler/CodeTree with
the new backend code, and add a GCode.beam.ML file to the MLCompiler
directory that instantiates the GCode structure correctly.

My question is about building the compiler with the new backend enabled.
How would I add the new backend to the build and then build the compiler
with this backend enabled? I have limited experience with autoconf and
automake (mainly as a user), so taking a look at configure.ac and
Makefile.am didn't give me clues on how to tie everything together. I also
see that there is a RootXX.ML file for each target architecture that
compiles the source files in order, but a comment in these files says they
were generated from the make files.

So any tips or pointers are appreciated.

