Benjamin Goldberg wrote:
How many "save"s does it take to be slower than one "push"?
This really depends on the architecture, the running core, and so on. Dan estimated a cutoff value of 3, but this test program indicates a cutoff of 2:
For me it was something like 2.37 push = 1 save.
On some architectures (like SPARC), saves also have the advantage of not polluting the cache. Doing the pushes dirties your L1 and L2 caches for both the source and the destination, while the save doesn't, though the registers are likely already in L2 cache, if not L1. SPARC has a cache-bypassing memcpy, which is kind of cool. You might think bypassing the cache is a bad thing, but you'd be wrong.
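For anyone who wants to redo the cutoff arithmetic on their own box, here is a minimal sketch. It assumes you have already timed one push and one save yourself; `time_per_call` and `cutoff_saves` are hypothetical helper names, not anything in the Parrot source, and the numbers at the bottom are made up for illustration rather than taken from this thread.

```python
import math
import time


def time_per_call(op, repeats=100_000):
    """Average wall-clock seconds per call of `op` over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        op()
    return (time.perf_counter() - start) / repeats


def cutoff_saves(push_cost, save_cost):
    """Smallest whole number of saves that costs strictly more than one push."""
    if push_cost <= 0 or save_cost <= 0:
        raise ValueError("costs must be positive")
    return math.floor(push_cost / save_cost) + 1


# Hypothetical costs in arbitrary units, not measurements from this thread.
print(cutoff_saves(push_cost=2.5, save_cost=1.0))  # prints 3
```

Feed `cutoff_saves` the per-operation times you actually measure. As noted above, the answer will shift with the architecture and the running core, and a simple timing loop like this says nothing about the cache effects.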
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk