On Tue, 11 Aug 2009 at 9:50pm, Robert G. Brown wrote:

> In a nutshell, the "cost of going cheap" isn't linear, with or without
> student/cheap labor.  For small clusters installed by somebody who knows
> what they are doing and e.g. operated and used by the owner or the
> owner's lab including students, operated by departmental sysadmins with
> cluster experience and enough warm bodies to have some opportunity cost
> labor handy -- sure, go cheap -- if a node or two is DOA or fails, so
> what?  It takes you an extra day or two to get the cluster going, but
> most of that time is waiting for parts -- OC time is much smaller, and
> everybody has other things to do while waiting.  But as clusters get
> larger, the marginal cost of the differential failure rate between cheap
> and expensive scales up badly and can easily exceed the OC labor pool's
> capacity, especially if by bad luck you get a cheap node and it turns
> out to be a "lemon" and the faraway dot com that sold it to you refuses
> to fix or replace it.  The turnover from cheap to much more expensive
> than just getting good nodes from a reputable vendor (which don't
> usually cost THAT much more than cheap) can happen real fast, and the
> time wasted can go from a few days to months equally fast.

One thing I haven't seen addressed is the proposed usage of the cluster. If most of the code to be run on the cluster is embarrassingly parallel, then the cost of a node going down or the network being less than optimal is fairly low, since only the work assigned to that node has to be rerun. In this case, IMO, it's pretty easy to make the argument for the DIY route (depending on size and available labor pool, of course, as others have mentioned). If, OTOH, you intend to run tightly coupled MPI code across the entire cluster, then it becomes very valuable to ensure that everything is working together just so. There, a turn-key vendor (and/or a highly skilled third party) can make more sense.
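
To put rough numbers on that difference, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration, not a measurement: each node is taken to fail independently with probability p_fail during one job, the tightly coupled job is assumed to have no checkpointing (so any single node failure kills the whole run), and the function names are hypothetical.

```python
def ep_expected_loss(p_fail: float) -> float:
    """Expected fraction of work lost for an embarrassingly parallel job.

    Assumption: work is spread evenly and only the tasks on a failed
    node need to be rerun, so the expected loss equals p_fail no matter
    how many nodes the cluster has.
    """
    return p_fail


def coupled_job_failure(p_fail: float, nodes: int) -> float:
    """Probability that a tightly coupled job is lost.

    Assumption: the job spans all nodes, has no checkpointing, and dies
    if any one node fails. Failures are treated as independent.
    """
    return 1.0 - (1.0 - p_fail) ** nodes


if __name__ == "__main__":
    p_fail = 0.01  # assumed per-node failure probability per job
    for nodes in (8, 64, 512):
        print(
            f"{nodes:4d} nodes: "
            f"EP expected loss {ep_expected_loss(p_fail):.3f}, "
            f"coupled job failure {coupled_job_failure(p_fail, nodes):.3f}"
        )
```

Under these assumptions the embarrassingly parallel loss stays flat as the cluster grows, while the chance of losing a cluster-wide MPI run climbs quickly with node count. Checkpointing would soften the second number, so treat it as a worst case.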

In other words, the answer, as always, is "It depends."

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
