On 8/18/2010 1:32 PM, Paul Rubin wrote:
Elizabeth D Rather<erat...@forth.com>  writes:
Processors seldom could multitask, so it wasn't recognized that the
stack could be a performance bottleneck
Lol.  Forth supported multitasking on every processor it was
implemented on in the 70's, with blazing speed compared to competitive
techniques. I have never seen stack operations to be a bottleneck.

I think "multitasking" in that post refers to superscalar execution,
which wasn't done in the 1970's except on supercomputers.  That the
stack is a bottleneck is the precise reason that optimizing Forth
compilers do complicated flow analysis to translate stack operations
into register operations.

   Some small FORTH machines had dedicated stack hardware.  On each
CPU cycle, the CPU could do one stack access, one main memory access,
and one return stack access.  This was before cacheing; those CPUs
were slow relative to their memory, so a non-cached one-instruction-per-clock machine made sense.

   In the superscalar era, there's not much of an advantage to avoiding
stack accesses.  x86 superscalar machines have many registers not
visible to the program, as the fastest level of cache.  In practice,
the top of the stack is usually in CPU registers.   The "huge number
of programmer-visible register" machines like SPARCs turned out to be
a dead end.  So did making all the instructions the same width; it
makes the CPU simpler, but not faster, and it bulks up the program
by 2x or so.

                                        John Nagle

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to