On Mon, 2025-02-17 at 09:17 -0500, Paul Koning via cctalk wrote:
> Also multiple functional units, seriously interleaved memory, and a
> bucket full of other tricks. The way loads and stores are requested
> by the programmer naturally makes them background operations, and the
> "stunt box" handles that background process.

I remember the "stunt box" also being called the "traffic cop."

The Denelcor HEP had asynchronous memory access. It had several (sixteen, IIRC) functional units and hardware thread switching. When a thread issued a memory access, its register file was saved (or maybe there were more register files than hardware processors; my memory is foggy here) and another thread was put into a functional unit. When the memory access completed, the waiting thread's register file was put back into the functional unit queue.

Arvind at MIT was a dataflow investigator. Greg Papadopoulos (later Chief Technology Officer at Sun Microsystems) developed the Monsoon tagged-token dataflow computer as his PhD project. Rishiyur Nikhil described a RISC architecture that was augmented with asynchronous memory access and automatic thread switching.

Burton Smith and James Rottsolk founded Tera Computer Company to develop a computer called the Multi-Threaded Architecture or MTA, IIRC based on Nikhil's ideas. They bought the ashes of Cray and promptly changed their name to Cray. But the MTA had a fatal flaw that neither Tera nor the Cray engineers they absorbed were able to resolve: It had a 100 MHz bottleneck. Even so, it was faster than the "supercomputer" that IBM was offering at the time, but only on a sort benchmark.
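
For anyone who wants to see the switch-on-memory-access idea from the HEP description above in concrete form, here is a toy Python sketch. It is not a model of the real HEP or MTA: the single issue slot, the 8-cycle memory latency, the "every fourth operation is a load" workload, the round-robin ready queue, and the names `run` and `MEMORY_LATENCY` are all assumptions made up for illustration.

```python
from collections import deque

MEMORY_LATENCY = 8  # cycles per load; arbitrary illustrative value (assumption)


def run(num_threads, ops_per_thread=20, load_every=4):
    """Simulate one issue slot shared by several hardware threads.

    Every `load_every`-th operation of a thread is treated as a load.
    A thread that issues a load is parked until the load completes,
    and another ready thread (if any) gets the issue slot meanwhile.
    Returns (total_cycles, busy_cycles).
    """
    # Each thread is a mutable pair: [thread_id, operations_remaining].
    ready = deque([tid, ops_per_thread] for tid in range(num_threads))
    waiting = []  # (wake_cycle, thread) for threads parked on a load
    cycle = 0
    busy = 0

    while ready or waiting:
        # Move threads whose loads have completed back to the ready queue.
        still_waiting = []
        for wake_cycle, thread in waiting:
            if wake_cycle <= cycle:
                ready.append(thread)
            else:
                still_waiting.append((wake_cycle, thread))
        waiting = still_waiting

        if ready:
            thread = ready.popleft()
            busy += 1
            thread[1] -= 1
            issued = ops_per_thread - thread[1]
            if thread[1] == 0:
                pass  # thread has finished all its operations
            elif issued % load_every == 0:
                # This operation was a load: park the thread.
                waiting.append((cycle + MEMORY_LATENCY, thread))
            else:
                ready.append(thread)  # plain operation: stay ready
        # If no thread is ready, the issue slot idles this cycle.
        cycle += 1

    return cycle, busy


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        total, used = run(n)
        print(f"{n} thread(s): issue slot busy {used} of {total} cycles")
```

With a single thread the issue slot sits idle for most of every load; as more threads are added, the ready queue usually has something to run while loads are outstanding, which is the whole point of the technique.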
