On 10 okt 2006, at 13:46, Florian Klaempfl wrote:

Practical argument:
the assembler code _is_ better for the code I tested and up to as twice as fast
as the original one.

That's indeed true for an extremely small function with so few local variables/parameters that even on an i386 it doesn't need spilling in the absence of register allocation optimizations. That's why I said it *may* currently degrade performance in more complex functions (it also may not, I really don't know, it was just a remark).

Theoretical argument:
- using the tail goto you've one function with one set of variables being active
across the the whole function
- using a recursive call you've at least two sets of variables: the caller and the callee ones. Though one set is only active at a limited part of the function, the set of the caller is still in use while the callee is called
though they (the caller variables) are spilled.

The first point is just as much a downside as an upside in the current situation, because parameters/variables which are used without being destroyed by some function call can normally be put in a (reusable) volatile register. Now they need a non-volatile register during the entire function.

The fact that you get such a speedup is in my view mainly an indication of the fact that the function barely does anything, and that almost half the time is spent in setting up and tearing down stack frames. So it's logical that register allocation has little influence and that optimizations which remove this stack frame logic help a lot.

I guess it could be used to change to code above to
leal 1(%esi),%eax
?

Yes.

Or is this slower on some CPUs?

I don't know.


Jonas
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel

Reply via email to