On Wed, 2 Dec 1998, Emil Briggs wrote:
> >
> >more in price/performance advantage over e.g. an SP2. If the
> >calculations are fine grained (e.g. hydrodynamics, galactic evolution,
> >things with long range forces and hence IPC's) then a beowulf still
> >might be the way to go but the cost/benefit is less and you'll have to
> >work harder to get the benefit. There are definitely problems for which
> >the T3E is a better solution than a beowulf, although I personally
> >believe that this will change drastically in the next two years.
> >
>
> The T3E is still an order of magnitude or so better with respect to
> communications bandwidth and latency than Myrinet or Gigabit Ethernet.
> I don't think that you'll be able to get that sort of performance
> in a Beowulf and still preserve the price/performance ratio in a
> two year time frame. (You may be able to get the performance but I
> imagine it will cost a bundle).

You could be right. However, I don't think that you will be, because I
expect new hardware to be introduced fairly shortly that will shrink the
margin. Bear in mind that a lot of manufacturers are seeing an
opportunity for substantial profit in being the first to bring a "real"
commodity intersystem communication channel to the market. Right now
beowulves are built on commodity networks (where Myricom itself claims
that its product is a commodity) because they are cheap and plentiful,
not because they are ideal for the purpose. However, beowulves have
proven to be a tremendous market and have even established price points
that any manufacturer can look at and say, "Hmmm, there's gold in them
thar hills". Myricom alone has proven that there are plenty of people
who will pay (almost) as much for an INCH (Internode Channel) as they
will for a node, and for good reason.

So I'd look for next-generation commodity INCHes to become available in
less than a year, and to become "affordable and debugged" in less than
two. Myricom even has a schedule for delivering at least 2.5 Gbps on
64-bit/66 MHz PCI adapters by sometime next year, and they are very
unlikely to remain the only player in this particular game. I'm not
certain, but I don't think the biggest obstacles at this point are in
the actual communications technology (which can be nicely parallelized
to nearly any total bandwidth you like) -- rather, they are limitations
of, e.g., the PCI bus or memory bus on the host systems. Look for the
major CPU/MoBo players to develop a new CPU/memory bus interconnect
that talks directly to an INCH and/or permits an INCH to read and/or
write memory without any CPU involvement or interrupts at all.

Of course, by the time all this plays out the T3-whatever may be out
and may preserve the current landscape, but I think that even then the
current performance margin will be cut by at least a factor of three
(half an order of magnitude, if you like) at constant cost, and it
could be more.
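For the record, the arithmetic behind that aside: an order of magnitude
is a factor of 10, so "half an order of magnitude" means half of one on
the log scale, i.e. 10**0.5. A trivial check (illustrative Python, my
own addition, not anything from the original discussion):

```python
# "Half an order of magnitude": one order of magnitude is a factor of
# 10, so half of one (on the log scale) is 10**0.5.
import math

half_order = math.sqrt(10)
print(round(half_order, 2))  # 3.16 -- hence "a factor of three" or so
```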

This is indeed relevant to SMP and scaling issues within Linux (in case
it sounds like I'm on a beowulfish off-topic chant). There is now, and
will continue to be, a certain (healthy) tension between shared memory
multiprocessor systems and message-passing multinode systems. There is
an interesting talk, presented by Charles Seitz (president of Myricom)
this summer at the COTS Symposium, that lays out Myricom's view --
message passing on commodity interconnects is good and will define the
future of high-performance computing, because of the scaling
limitations on shared memory/CC-NUMA and the inherent parallelism of
message passing.

Now, I'm not about to get into a religious war -- I just love to take
advantage of the superior technology that emerges after all the high
priests have finished duking it out. What I will say is that at some
time in the next couple of years I am HOPING that linux-smp evolves
into linux-dsmp, where the "d" stands for "distributed". As INCH
technology progresses up to native bus speed in raw bandwidth (probably
still a factor of 3 worse in latency, but what the hell) it will make
increasing sense to incorporate the channels directly into the kernel
and build a "true" distributed symmetric multiprocessing kernel. This
will become overwhelmingly true if vendors make the local IPC channels
conform in some way to an INCH -- building an SMP system, even in a
single box, out of "nodes" connected by INCHes conforming to a
commodity standard. This would be the ultimate in scalability -- an
"SMP system" and a "Beowulf" will have merged, and we will really need
an operating system that scales transparently from a single node/CPU to
1000 nodes/CPUs.

I'm NOT saying that just any software will run in parallel on such a
system, but it may one day be possible for an ISP whose "server" is
getting loaded to literally buy a box, connect it as a node, and reboot
with one box's worth of additional multitasking capacity. And those of
us who DO have parallel software will find the job of writing it
greatly simplified by, e.g., an MPI implementation layered directly on
the INCH itself.
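To make the message-passing model concrete, here is a minimal sketch of
the MPI-style send/receive pattern, using Python's multiprocessing
pipes to stand in for an internode channel. This is purely my own
illustration -- the `worker` and `parallel_sum` names, and the
partial-sum workload, are assumptions for the example, not anything an
INCH or MPI actually ships:

```python
# Message-passing parallelism sketch: each "node" is a process, and all
# coordination happens over explicit channels (no shared memory), in
# the style of MPI send/recv. Pipes play the role of the interconnect.
from multiprocessing import Pipe, Process


def worker(conn):
    # Each "node" receives its slice over the channel, computes its
    # partial result locally, and sends it back over the same channel.
    lo, hi = conn.recv()
    conn.send(sum(range(lo, hi)))
    conn.close()


def parallel_sum(n, nodes=4):
    """Sum 0..n-1 by message passing across `nodes` worker processes."""
    step = n // nodes
    conns, procs = [], []
    for i in range(nodes):
        parent, child = Pipe()
        p = Process(target=worker, args=(child,))
        p.start()
        lo = i * step
        hi = n if i == nodes - 1 else lo + step
        parent.send((lo, hi))  # "scatter" a slice to this node
        conns.append(parent)
        procs.append(p)
    total = sum(c.recv() for c in conns)  # "gather" the partial sums
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(parallel_sum(1000))  # 499500, same as sum(range(1000))
```

The point of the pattern is the one Seitz makes: nothing here depends
on the nodes sharing an address space, so the same code scales whether
the pipes are local IPC or a wire between boxes.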

Dreams? Crazy? We'll see....

rgb
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:[EMAIL PROTECTED]
-
Linux SMP list: FIRST see FAQ at http://www.irisa.fr/prive/mentre/smp-faq/
To Unsubscribe: send "unsubscribe linux-smp" to [EMAIL PROTECTED]