Andre van Dun wrote:
>Please let the light shine on dynamic recompilation, how does this work in
>practice ?
Ok...
Dynamic Recompilation is a technique which is also called Binary Code
Translation. The idea is to translate a (part of a) program of one system
to the language of the target system. In this case, translate Gameboy code
into MSX code.
The ideal binary translation would be to take for instance a Gameboy ROM
and translate it into an MSX executable (e.g. a COM-file). However, this
kind of binary translation, also called Static Recompilation, is impossible
in practice.
In Dynamic Recompilation you translate a program piece by piece, during
run-time. The translated piece of code is stored in memory so the next time
this particular piece of program is executed there's no need for
translating it. Let's use an example:
1. the Gameboy ROM code calls a routine at address 2000h.
2. the emulator checks if the routine at 2000h has been translated.
2a. if so, execute the translated routine.
2b. if not, translate the routine, store it, and execute it.
The translator usually stops translating when a RET or JP is reached.
Actually GEM stops at RET/CALL/JR/JP conditional or unconditional.
So what you need is a block of memory to store the translated routines, a
table with whether or not a certain routine has been translated and a
pointer to the translated code. Obviously A LOT of memory.
The first implementation of a DynaRec-engine in GEM didn't have very
promising results. In order to minimize memory usage I used a 4kB
cache-table and 12kB code-cache in each memory segment. That meant a game
needed 2x ROM-size and not more.
But for each segment there are 16384 addresses. For each of the addresses I
had to know whether or not the code from this address had been translated
already. In order to fit the table in 4kB I made each table-entry
correspond to 16 bytes. 16384 / 16 = 1024 * 4-bytes per entry = 4kB table.
The first 2 bytes contained the actual address which the entry was
referring to and the second 2 bytes contained a pointer to the translated code.
So if code from address 4000h was translated, but later code from 400Fh was
called, the table-entry would have already been taken so the 400Fh code
would be interpreted.
There was a lot of overhead to calculate the table-offset from the Program
Counter (15 clockcycles each time) and many instructions still needed
interpreting because of the 16-byte range in the cache-table. The speedgain
of this design was close to nothing. :(
The DynaRec-engine used in GEM v0.7 is more memory-hungry as it keeps 16kB
cache-tables and uses additional memory segments for storing the translated
code.
16kB tables mean that for every 4 bytes theres a table-entry. A big
improvement over the 16 bytes in the 1st design. Also the calculations
needed for obtaining the table-offset are much simpler. (twice as fast)
A table-entry looks like this:
+0 segment number of translated code (if 0, the entry is free)
+1 low-byte of the address related to this entry (because each entry
relates to 4 bytes in the gameboy code)
+2/3 pointer to the translated code
There are a couple of small differences between a Dynamic Recompiler
written for PC and the one in GEM.
1. on PC you can use larger cache-tables or use a hash-table with linked-lists.
2. on PC you can add self-modifying-code-detection. GEM avoids this problem
by interpreting all code run from Gameboy's RAM.
3. on PC you can abandon less called code to free up memory. It would mean
too large overhead for GEM to care about that. When memory is full, the
rest is interpreted.
4. on PC you can add loop-detection and other advanced code-analysis to
optimize the translated code.
Actually I was pretty amazed that after the failed 1st design I was able to
make GEM twice as fast! Personally I was not even sure it could be done,
but then again GEM amazed me a couple of times before already :)
In theory some nice code-analysis stuff can be added, but there's only so
much room in a 16kB segment ;) I'm glad Z80 code is so compact!
that's about it I think... I hope I have explained it clearly, but feel
free to ask some more questions :)
Greetz,
Patriek
--
For info, see http://www.stack.nl/~wynke/MSX/listinfo.html