On Sun, Nov 27, 2005 at 02:20:14PM -0800, J. Andrew Rogers wrote:

> SMP is message passing where the distributed transaction semantics
> are implemented in low-latency hardware. You can implement the same
SMP has implicit memory sharing. It tries to reduce access latency by
introducing caches, which results in a coherency problem: you need to
ensure that multiple writes to the same location don't occur. This
wouldn't be a big problem in a read-only scenario. Both caching and
SMP introduce problems in an attempt to solve a non-problem (multiple
writes to the same location can't occur in a message-passing
environment with separate address spaces).

With current technology, an MPI call in an SMP environment is much
faster than an MPI call over a signalling fabric (roughly 100 ns vs. a
few microseconds). However, with SCI on-die or a switched Hyperchannel
fabric a message-passing call has the same cost as with SMP -- with
the difference that it requires neither caches nor a solution to the
cache coherency problem. I don't know exactly how this scales, but I
strongly suspect O(N^2), and each gate delay adds to each memory
access. This also runs strongly contrary to data locality, which is a
must at current clock rates. Preferably, you want your data sitting
right next to the CPU, or even better, right in the CPU.

> thing in MPI, you just have to do the distributed transactions
> explicitly in software and the time required to certify a transaction

Message passing can be done in hardware. It doesn't have to be the
full MPI monty.

> is higher. The primary difference is that multithreading APIs are
> designed around the assumption that transaction certification is
> extremely fast and therefore inexpensive, allowing the programmer to
> use the implied transactions far more often than may be strictly
> necessary for the sake of convenience.

Cache misses and pipeline stalls are increasingly breaking the
assumptions under which current (frequently hardware-agnostic)
developers operate.

> This is more obvious with SMP systems with 10^3 processors, where the
> performance characteristics are virtually identical to an MPI system
> of the same size with a high-quality interconnect.
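The separate-address-spaces point above can be sketched in a few lines
of Python, with threads standing in for nodes and a queue standing in
for the signalling fabric (an illustrative toy, not anyone's actual
code; the `Owner` class and its names are made up):

```python
# Sketch: per-node ownership means no two writers ever touch the same
# location -- the property that makes cache coherency a non-problem in
# a message-passing environment with separate address spaces.
import threading
import queue

class Owner(threading.Thread):
    """Sole writer of its own counter; others may only send requests."""
    def __init__(self):
        super().__init__()
        self.inbox = queue.Queue()   # stands in for the signalling fabric
        self.counter = 0             # private: written only by this thread

    def run(self):
        while True:
            msg = self.inbox.get()
            if msg is None:          # shutdown sentinel
                break
            # Single writer, so no lock on the data and no coherency
            # traffic is needed for it.
            self.counter += msg

owner = Owner()
owner.start()
for _ in range(1000):
    owner.inbox.put(1)               # send a request, don't touch the word
owner.inbox.put(None)
owner.join()
print(owner.counter)                 # -> 1000
```

The queue itself is of course synchronized internally, which is the
software analogue of the hardware message-passing cost discussed above;
the point is that the *data word* has exactly one writer.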
> The higher price

A good message-passing system should outperform SMP at 32-64 CPU
system size.

> tag of SMP buys you the same API as small SMP systems, but you will
> still have to program it as though it were MPI because the
> assumptions of standard threading APIs are not really true on that
> scale, making it far from obvious what the value proposition of
> extremely large SMP is exactly. SMP encourages moving data around
> while MPI encourages moving metadata around, the inefficiency of the
> former becoming more obvious and harder to program around when you
> run out of bandwidth. Telling another processor what you want it to
> do with its local memory is often much cheaper than slurping big data

If you access a remote location, this results in a message being sent
there (a request) and a result being relayed back to you. Global
memory is an illusion. Everybody expects that the clock is global, and
that the word everybody is writing to at the same time is left in a
consistent state. This is hard work; it takes hardware and time (and
it operates on all memory accesses, so you always pay the penalty).

> structures back and forth across the bus, wrapped in distributed
> transaction semantics.
>
> As a tangent, I find the new Sun Microsystems chip to be an
> interesting random step in the right direction. Lots of simple 64-
> bit cores connected to loads of fast memory and I/O channels. If I
> were designing it, I might have only put a single core on the chip
> and turned the rest of the silicon into very fast local memory with
> plenty of I/O to other processors. Sun obviously has different needs.

I'd rather have only on-die memory and multiple serial lines with a
packet-switched fabric, too.
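The metadata-vs-data point -- telling the remote processor what to do
with its local memory instead of slurping the structure across the
bus -- can be sketched the same way (again an illustrative toy with
made-up names; a queue between threads stands in for the bus):

```python
# Sketch: ship the operation (a few bytes of metadata) to the node
# that owns the data, rather than shipping the data to the operation.
import threading
import queue

local_data = list(range(100_000))    # lives on the "remote node"
requests = queue.Queue()
replies = queue.Queue()

def node():
    op = requests.get()              # receives metadata: an operation name
    if op == "sum":
        replies.put(sum(local_data)) # the work happens next to the data

t = threading.Thread(target=node)
t.start()
requests.put("sum")                  # one small message across the "bus"...
result = replies.get()               # ...instead of copying 100k elements
t.join()
print(result)
```

Only the request and the scalar result cross the fabric; the 100k-element
structure never moves, which is the asymmetry the paragraph above is
describing.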
--
Eugen* Leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
