On Sun, 18 Mar 2012, peter green wrote:
Daniël Mantione wrote:
Please use the command line option -sr to check the generated code before
register allocation.
Done and attached.
You can likely find the cause in there.
The code with imaginary registers looks correct to me. It seems to load each
parameter into a separate even-numbered imaginary register (using odd-numbered
imaginary registers as temporaries in the process), allocating them as it
goes. It then copies them to the locations needed for the function call,
deallocating them as it goes.
I agree, both imaginary registers are correctly allocated and freed.
So it seems to me that the problem is in the translation of the form using
imaginary registers to the form using real registers. Can you explain (or
point me to documentation on) how this form is translated into a form using
real registers?
The algorithm used is called "iterated register coalescing", an advanced
form of graph colouring designed by Lal George and Andrew W. Appel. Appel
describes it in detail in his book "Modern Compiler Implementation in C".
Basically, the registers are put into "worklists", which are then processed
by four procedures:
* simplify -> takes registers out of the graph that can safely be coloured
* coalesce -> tries to coalesce registers together to reduce pressure
* freeze -> selects registers that have a chance to be coalesced and
should therefore not be spilled yet
* spill -> selects a register that should move to memory.
These procedures are called iteratively until the worklists are empty.
Then the graph is coloured.
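
To make the simplify/spill/colour cycle concrete, here is a minimal sketch in
Python. It is not the compiler's code: it leaves out the coalesce and freeze
worklists entirely, so it is closer to plain graph colouring with optimistic
spilling than to full iterated register coalescing. The function name, the
dictionary representation of the interference graph and the "highest degree"
spill heuristic are all assumptions made for illustration.

```python
def colour_graph(interference, k):
    """Colour an interference graph with k real registers.

    interference: dict mapping each imaginary register to the set of
    registers it interferes with (assumed symmetric, no self-edges).
    Returns (colouring, spilled).
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {node: set(neigh) for node, neigh in interference.items()}
    stack = []

    while graph:
        # simplify: a node with fewer than k neighbours can always be coloured.
        candidates = [n for n in sorted(graph) if len(graph[n]) < k]
        if candidates:
            node = candidates[0]
        else:
            # spill: no safe node is left, so pick the highest-degree node as
            # a potential spill (a real allocator would use a cost heuristic).
            node = max(sorted(graph), key=lambda n: len(graph[n]))
        stack.append(node)
        for neighbour in graph.pop(node):
            graph[neighbour].discard(node)

    # select: pop nodes and give each the lowest colour its neighbours lack.
    colouring, spilled = {}, []
    while stack:
        node = stack.pop()
        used = {colouring[n] for n in interference[node] if n in colouring}
        free = [c for c in range(k) if c not in used]
        if free:
            colouring[node] = free[0]
        else:
            spilled.append(node)  # would have to live in memory instead
    return colouring, spilled
```

With three registers that all interfere with each other and only two real
registers available, one of them ends up in the spilled list; with three real
registers, all of them get a colour.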
In particular, what exactly happens when there are not enough
real registers free to assign a real register for every imaginary register
that is in use at a given time?
Then a register is spilled, i.e. replaced by a location in memory. This may
be possible without new registers:
mov ireg30d,ireg29d -> mov ireg30d,[ebp-40]
... but in some cases a help register is needed:
mov ireg30d,[ebp+20] -> mov ireg99d,[ebp+20]
                        mov [ebp-40],ireg99d
Should a new register be needed, register allocation is
completely restarted with the new code.
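
The two rewrites above can be sketched as follows. This is only an
illustration in Python, not the compiler's implementation: the function name,
the (dst, src) tuple representation of a mov and the new_temp callback are
hypothetical. It assumes x86-style moves, where a mov cannot have two memory
operands, which is why the second case needs a help register.

```python
def rewrite_spill(instructions, spilled, slot, new_temp):
    """Rewrite mov instructions after `spilled` has been moved to `slot`.

    instructions: list of (dst, src) operand pairs, one per mov.
    slot: memory operand that replaces the spilled register, e.g. "[ebp-40]".
    new_temp: callable returning a fresh imaginary register name.
    Returns (rewritten, needs_restart).
    """
    def is_memory(operand):
        return operand.startswith("[")

    rewritten, needs_restart = [], False
    for dst, src in instructions:
        new_dst = slot if dst == spilled else dst
        new_src = slot if src == spilled else src
        if is_memory(new_dst) and is_memory(new_src):
            # A memory-to-memory mov is not encodable: use a help register.
            temp = new_temp()
            rewritten.append((temp, new_src))
            rewritten.append((new_dst, temp))
            # A new imaginary register exists, so allocation must be redone.
            needs_restart = True
        else:
            rewritten.append((new_dst, new_src))
    return rewritten, needs_restart
```

Calling it with ("ireg30d", "ireg29d") and ireg29d spilled yields a single
mov from the stack slot and no restart; calling it with
("ireg30d", "[ebp+20]") and ireg30d spilled yields the two-instruction
sequence through the help register and asks for a restart.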
My current suspicion is that something is missing regarding handling of
running out of VFP registers and it hasn't been noticed before because no one
has tried to do what I'm doing (implementing a calling convention using VFP
registers and then stress testing it), but I've no idea where to look in the
source code to confirm/refute that idea.
It doesn't look like spilling happens in your example: I don't see moves
from/to the stack frame as a temporary location.
Perhaps the first thing to do is to set a breakpoint in
Tinterferencebitmap.setbitmap.
The "versions" of s20 use superregisters 50 and 70, so setbitmap(50,70)
or setbitmap(70,50) should be called at some point to tell the register
allocator that both registers are active at the same time and cannot be
coalesced.
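
To show why a missing setbitmap call matters, here is a small Python model of
an interference bitmap. It is a hypothetical analogue, not the
Tinterferencebitmap code: the class and method names are invented, and
recording both directions in a single call is a simplification of this
sketch. The point is only that two superregisters without a recorded
interference look free to share one real register.

```python
class InterferenceBitmap:
    """Symmetric record of which superregisters are live at the same time."""

    def __init__(self):
        self._rows = {}  # superregister number -> int used as a bit set

    def set_bitmap(self, u, v):
        # Record the interference in both directions so lookups are symmetric.
        self._rows[u] = self._rows.get(u, 0) | (1 << v)
        self._rows[v] = self._rows.get(v, 0) | (1 << u)

    def interferes(self, u, v):
        # Test bit v in row u; an absent row means no interference recorded.
        return bool((self._rows.get(u, 0) >> v) & 1)


def can_coalesce(bitmap, u, v):
    # Two registers may share a real register only if they never interfere.
    return not bitmap.interferes(u, v)
```

In this model, can_coalesce(bitmap, 50, 70) stays true until
set_bitmap(50, 70) has been called, which is the situation the breakpoint is
meant to detect.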
Daniël
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel