On Nov 27, 2005, at 1:07 PM, Eugen Leitl wrote:
Absolutely. Multithreading (SMP) approach doesn't scale. Illusion of a
global memory is a difficult one to maintain in a write-intensive
context and won't go beyond 16-64 cores. If you want to have >10^3
parallelism, you have to go message passing. There are no alternatives
to that.
SMP is message passing where the distributed transaction semantics
are implemented in low-latency hardware. You can implement the same
thing in MPI, you just have to do the distributed transactions
explicitly in software and the time required to certify a transaction
is higher. The primary difference is that multithreading APIs are
designed around the assumption that transaction certification is
extremely fast and therefore inexpensive, which lets the programmer,
for the sake of convenience, use the implied transactions far more
often than is strictly necessary.
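To make the contrast above concrete, here is a minimal sketch of my own
(not taken from any real SMP or MPI code). The names
shared_memory_count and message_passing_count are hypothetical, and a
Python lock and a queue merely stand in for hardware coherence and an
interconnect, so this shows the shape of the two programming styles,
not their performance. The shared-memory version synchronizes on every
single update because doing so is convenient; the message-passing
version does its work locally and sends one explicit message.

```python
import queue
import threading


def shared_memory_count(workers, increments):
    """Threading style: every update is a tiny synchronized transaction."""
    counter = 0
    lock = threading.Lock()  # stand-in for the implicit coherence machinery

    def work():
        nonlocal counter
        for _ in range(increments):
            with lock:  # one synchronization per increment
                counter += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


def message_passing_count(workers, increments):
    """Message style: only the owner touches the counter; workers send totals."""
    inbox = queue.Queue()  # stand-in for the interconnect
    result = []

    def owner():
        counter = 0
        while True:
            msg = inbox.get()
            if msg is None:  # sentinel: no more messages will arrive
                break
            counter += msg
        result.append(counter)

    def work():
        local = 0
        for _ in range(increments):
            local += 1  # purely local, no synchronization needed
        inbox.put(local)  # one explicit message per worker

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()
    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    inbox.put(None)
    owner_thread.join()
    return result[0]
```

Both functions return the same total. The difference is that the first
performs workers * increments synchronizations and the second sends
only one message per worker plus a final sentinel.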
This is more obvious on SMP systems with 10^3 processors, where the
performance characteristics are virtually identical to those of an MPI
system of the same size with a high-quality interconnect. The higher
price tag of SMP buys you the same API as small SMP systems, but you
still have to program it as though it were MPI, because the
assumptions of standard threading APIs do not really hold at that
scale. That makes it far from obvious what exactly the value
proposition of extremely large SMP is. SMP encourages moving data
around while MPI encourages moving metadata around, and the
inefficiency of the former becomes more obvious and harder to program
around when you run out of bandwidth. Telling another processor what
you want it to do with its local memory is often much cheaper than
slurping big data structures back and forth across the bus, wrapped in
distributed transaction semantics.
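The data versus metadata point can be illustrated with a toy model.
This is only a sketch under my own assumptions: Node is a hypothetical
class standing in for a remote processor, and the length of a pickled
payload is used as a crude proxy for bytes crossing the bus or
interconnect. It compares pulling a whole data structure across to sum
it locally against sending a small request and getting back a small
reply.

```python
import pickle


class Node:
    """Hypothetical processor that owns a chunk of local memory."""

    def __init__(self, values):
        self.local = list(values)

    def fetch_all(self):
        # "Move the data": serialize the entire local structure.
        return pickle.dumps(self.local)

    def handle(self, request):
        # "Move the metadata": decode a small request, compute locally.
        op = pickle.loads(request)
        if op == "sum":
            return pickle.dumps(sum(self.local))
        raise ValueError(f"unknown operation: {op!r}")


def sum_by_moving_data(node):
    """Return (total, bytes transferred) when the data is pulled across."""
    payload = node.fetch_all()
    return sum(pickle.loads(payload)), len(payload)


def sum_by_moving_metadata(node):
    """Return (total, bytes transferred) when only a request is sent."""
    request = pickle.dumps("sum")
    reply = node.handle(request)
    return pickle.loads(reply), len(request) + len(reply)


node = Node(range(100_000))
total_data, bytes_data = sum_by_moving_data(node)
total_meta, bytes_meta = sum_by_moving_metadata(node)
```

Both approaches compute the same total, but the second transfers a few
dozen bytes where the first transfers the whole list. A real system
adds latency, contention and coherence traffic that this model ignores.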
As a tangent, I find the new Sun Microsystems chip to be an
interesting random step in the right direction: lots of simple 64-bit
cores connected to loads of fast memory and I/O channels. If I were
designing it, I might have put only a single core on the chip and
turned the rest of the silicon into very fast local memory with
plenty of I/O to other processors. Sun obviously has different needs.
J. Andrew Rogers