On Tue, 11 Aug 2009 at 9:50pm, Robert G. Brown wrote:
> In a nutshell, the "cost of going cheap" isn't linear, with or without student/cheap labor. For small clusters installed by somebody who knows what they are doing and e.g. operated and used by the owner or the owner's lab including students, operated by departmental sysadmins with cluster experience and enough warm bodies to have some opportunity cost labor handy -- sure, go cheap -- if a node or two is DOA or fails, so what? It takes you an extra day or two to get the cluster going, but most of that time is waiting for parts -- OC time is much smaller, and everybody has other things to do while waiting. But as clusters get larger, the marginal cost of the differential failure rate between cheap and expensive scales up badly and can easily exceed the OC labor pool's capacity, especially if by bad luck you get a cheap node and it turns out to be a "lemon" and the faraway dot com that sold it to you refuses to fix or replace it. The turnover from cheap to much more expensive than just getting good nodes from a reputable vendor (which don't usually cost THAT much more than cheap) can happen real fast, and the time wasted can go from a few days to months equally fast.
One thing I haven't seen addressed is the proposed usage of the cluster. If most of the code to be run on it is embarrassingly parallel, then the cost of a node going down or the network being less than optimal is fairly low: you lose that node's share of the work and little else. In this case, IMO, it's pretty easy to make the argument to go the DIY route (depending on size and available labor pool, of course, as others have mentioned). If, OTOH, you intend to run tightly coupled MPI code across the entire cluster, then a single flaky node or link can stall or kill the whole job, and it becomes very valuable to ensure that everything is working together just so. There, a turn-key vendor (and/or a highly skilled third party) can make more sense.
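To put rough numbers on the difference, here is a toy model in Python. It is only a sketch under loud assumptions that are not from this thread: every node fails independently with the same probability p_fail during a run, an embarrassingly parallel workload loses only the failed node's share, and a tightly coupled job dies if any one node fails (no checkpointing or restart). The function names and the 1% figure are hypothetical.

```python
def lost_fraction_independent(p_fail):
    """Embarrassingly parallel: the expected fraction of work lost equals
    the per-node failure probability, regardless of cluster size
    (assumes work is spread evenly and failures are independent)."""
    return p_fail


def prob_run_killed_coupled(n_nodes, p_fail):
    """Tightly coupled: the run is lost if at least one of n_nodes fails
    (assumes independent failures and no checkpoint/restart)."""
    return 1.0 - (1.0 - p_fail) ** n_nodes


if __name__ == "__main__":
    p = 0.01  # hypothetical chance that a given node fails during one run
    for n in (8, 64, 512):
        print(
            f"{n:4d} nodes: "
            f"independent jobs lose ~{lost_fraction_independent(p):.1%} of work, "
            f"coupled run is killed with probability "
            f"{prob_run_killed_coupled(n, p):.1%}"
        )
```

Under these assumptions the embarrassingly parallel loss stays flat as the cluster grows, while the chance of losing a whole-cluster MPI run climbs quickly with node count. Real clusters have correlated failures and checkpointing, so treat the output as an illustration of the shape of the tradeoff, not a prediction.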
In other words, the answer, as always, is "It depends."

-- 
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF
_______________________________________________
Beowulf mailing list, sponsored by Penguin Computing
