If you are going to be buying new machines, I think the following blurb from jcole's blog says it best (for datanodes at least: namenodes are a whole different ballgame):
> What does it mean for the machine to be “commodity”? It means that the
> components are standardized, common, and the price is set by the market,
> not by a single corporation. Use commodity machines configured with a good
> balance of price vs. performance.

http://jcole.us/blog/archives/2007/06/10/scaling-out-and-up-a-compromise/

Thanks,
Stu

-----Original Message-----
From: Ted Dunning <[EMAIL PROTECTED]>
Sent: Wednesday, November 7, 2007 4:20pm
To: [email protected]
Subject: Re: commodity vs. high perf machines: which would you rather

My mileage computation came out essentially the same as what Doug says.

I use some cast-off machines that were not reliable enough for other applications. They originally cost us about 2/3 what our normal production boxes cost and achieve almost exactly 1/2 as much. Our production boxes are typically dual CPUs with dual cores.

On 11/7/07 12:27 PM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:

> So I'd hazard that moderately high-end commodity hardware is the most
> cost-effective for Hadoop today. YMMV.
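
For anyone who wants to redo the mileage computation with their own numbers, here is a rough sketch of the arithmetic behind the cast-off figures above (about 2/3 the cost, about 1/2 the work). The baseline normalized to 1 and the function name are assumptions for illustration only; they are not figures or code from the thread.

```python
from fractions import Fraction


def cost_per_unit_work(relative_cost, relative_throughput):
    """Relative cost paid per unit of work done, versus a baseline machine."""
    return relative_cost / relative_throughput


# Baseline (assumption): a normal production box, normalized to
# cost 1 and throughput 1.
production = cost_per_unit_work(Fraction(1), Fraction(1))

# Cast-off machine, using the approximate ratios quoted in the thread:
# about 2/3 of the cost, about 1/2 of the work.
cast_off = cost_per_unit_work(Fraction(2, 3), Fraction(1, 2))

print(production)  # 1
print(cast_off)    # 4/3
```

Under those ratios the cast-off boxes cost roughly a third more per unit of work than the production boxes, which is consistent with the moderately high-end hardware coming out ahead.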
