On 10 Oct 2006, at 13:46, Florian Klaempfl wrote:
Practical argument:
the assembler code _is_ better for the code I tested and up to twice
as fast as the original one.
That's indeed true for an extremely small function with so few local
variables/parameters that even on an i386 it doesn't need spilling in
the absence of register allocation optimizations. That's why I said
it *may* currently degrade performance in more complex functions (it
also may not, I really don't know, it was just a remark).
Theoretical argument:
- using the tail goto you've one function with one set of variables
being active across the whole function
- using a recursive call you've at least two sets of variables: the
caller and the callee ones. Though one set is only active at a limited
part of the function, the set of the caller is still in use while the
callee is called, though they (the caller variables) are spilled.
The first point is just as much a downside as an upside in the
current situation, because parameters/variables which are used
without being destroyed by some function call can normally be put in
a (reusable) volatile register. Now they need a non-volatile register
during the entire function.
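
To make the two forms under discussion concrete, here is a minimal sketch in C rather than Pascal. The functions (gcd_recursive, gcd_goto) are made up for illustration and are not the code that was benchmarked in this thread. The first is a plain tail-recursive function, the second is roughly what a "tail goto" transformation would turn it into.

```c
/* Recursive form: the call is in tail position, but without a
   tail-recursion optimization every level sets up and tears down
   its own stack frame. */
unsigned gcd_recursive(unsigned a, unsigned b)
{
    if (b == 0)
        return a;
    return gcd_recursive(b, a % b);
}

/* "Tail goto" form: the parameters are overwritten with the new
   argument values and control jumps back to the start, so a single
   set of variables stays live across the whole function. */
unsigned gcd_goto(unsigned a, unsigned b)
{
    unsigned t;
start:
    if (b == 0)
        return a;
    t = a % b;   /* compute the new arguments first ... */
    a = b;       /* ... then assign them to the parameters */
    b = t;
    goto start;
}
```

In a function this small there are no other calls inside the loop, so the register concern above does not show up; it would only matter if the body also called other functions while a and b were still needed.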
The fact that you get such a speedup is in my view mainly an
indication of the fact that the function barely does anything, and
that almost half the time is spent in setting up and tearing down
stack frames. So it's logical that register allocation has little
influence and that optimizations which remove this stack frame logic
help a lot.
I guess it could be used to change the code above to
leal 1(%esi),%eax
?
Yes.
Or is this slower on some CPUs?
I don't know.
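
For reference, leal 1(%esi),%eax computes %esi + 1 and stores the result in %eax in a single instruction, leaving %esi unchanged. A C-level sketch of the kind of statement it could implement follows; the function name succ and the register assignment in the comment are assumptions for illustration, and the code it would replace is not shown in this message.

```c
/* Hypothetical helper: returns x + 1 in a separate variable. */
int succ(int x)
{
    /* If x lives in %esi and r in %eax, this addition can be done as
       leal 1(%esi),%eax, with no separate register copy needed. */
    int r = x + 1;
    return r;
}
```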
Jonas