Dan Sugalski wrote:
> We might want to have one fast and potentially big loop (switch or computed 
> goto) with all the alternate (tracing, Safe, and debugging) loops use the 
> indirect function dispatch so we're not wedging another 250K per loop or 
> something.

Absolutely. There's no gain from doing computed goto for those anyway
because the per-op overhead makes direct threading impossible. Brent Dax
already posted an example of why this is bad.

Function calls are not slow. It's the extra jumps and table lookups
that are slow. If a mode has extra overhead it won't see any advantage
with computed goto over function calls. (At least this is what I've
found testing on Pentium III and Athlon. Most RISC systems should see
similar effects. Older CISC systems with slow function calls may be a
different story.)
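
To make the point concrete, here is a rough sketch of an indirect function
dispatch loop with and without a per-op hook. This is not Parrot code: the
three-op VM, vm_t, op_table, run_plain and run_traced are all made-up names
for illustration. The traced loop has to do work between every pair of ops,
so there is no way for one op to jump straight into the next one.

```c
#include <stddef.h>

/* Hypothetical three-op VM; none of these names come from Parrot. */
enum { OP_INC, OP_DEC, OP_END, OP_COUNT };

typedef struct {
    long acc;     /* the only "register" */
    long traced;  /* number of ops seen by the tracing hook */
} vm_t;

/* Each op function returns the next pc, or NULL to stop the loop. */
typedef const int *(*op_fn)(vm_t *vm, const int *pc);

static const int *op_inc(vm_t *vm, const int *pc) { vm->acc++; return pc + 1; }
static const int *op_dec(vm_t *vm, const int *pc) { vm->acc--; return pc + 1; }
static const int *op_end(vm_t *vm, const int *pc) { (void)vm; (void)pc; return NULL; }

/* Table order must match the opcode enum above. */
static const op_fn op_table[OP_COUNT] = { op_inc, op_dec, op_end };

/* Plain loop: one table lookup and one indirect call per op. */
long run_plain(vm_t *vm, const int *pc)
{
    while (pc)
        pc = op_table[*pc](vm, pc);
    return vm->acc;
}

/* Tracing loop: same dispatch, plus per-op work before every op.
 * Control must return to this loop after each op, which is what
 * rules out direct threading for this mode. */
long run_traced(vm_t *vm, const int *pc)
{
    while (pc) {
        vm->traced++;  /* stand-in for real tracing or Safe checks */
        pc = op_table[*pc](vm, pc);
    }
    return vm->acc;
}
```

Running the program { OP_INC, OP_INC, OP_DEC, OP_END } through run_traced
leaves acc at 1 and traced at 4. The sketch shows only the structure; it is
not a benchmark.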

BTW, 250K for the size of the inlined dispatch loop is way too big. The
goal should be to put the hot ops inline and leave the other ones out.
Ideally the dispatch loop will fit into L1 cache -- maybe 8k or so. IMHO
we'd be a lot better off inlining some of the PMC methods as ops instead
of trig functions. ;)
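
Something like the following is what I mean by hot ops inline and the rest
out of line. Again this is only a sketch with made-up names (machine_t,
cold_table, run_hybrid, and the opcode set are assumptions, not Parrot's):
the switch handles a few hot ops directly and anything else goes through a
function table, so the loop body stays small.

```c
#include <stddef.h>

/* Hypothetical opcode set: three "hot" ops and one "cold" op. */
enum { OP_ADD1, OP_SUB1, OP_HALT, OP_SQUARE, OP_COUNT };

typedef struct { long acc; } machine_t;

/* Cold ops live out of line and return the next pc. */
typedef const int *(*cold_fn)(machine_t *m, const int *pc);

static const int *cold_square(machine_t *m, const int *pc)
{
    m->acc *= m->acc;
    return pc + 1;
}

/* Only cold opcodes have entries; hot ones never reach the table. */
static const cold_fn cold_table[OP_COUNT] = { [OP_SQUARE] = cold_square };

long run_hybrid(machine_t *m, const int *pc)
{
    for (;;) {
        switch (*pc) {
        /* Hot ops: bodies inlined in the loop. */
        case OP_ADD1: m->acc++; pc++; break;
        case OP_SUB1: m->acc--; pc++; break;
        case OP_HALT: return m->acc;
        /* Everything else: indirect call, keeps the loop compact.
         * A real loop would validate the opcode before indexing. */
        default: pc = cold_table[*pc](m, pc); break;
        }
    }
}
```

With acc starting at 0, the program { OP_ADD1, OP_ADD1, OP_ADD1, OP_SQUARE,
OP_SUB1, OP_HALT } returns 8. Which ops count as hot is something to
measure, not guess.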

- Ken
