On Thu, May 28, 2009 at 6:02 AM, Steve Loughran <ste...@apache.org> wrote:
> That really depends on the work you are doing...the bytes in/out to CPU
> work, and the size of any memory structures that are built up over the run.
>
> With 1 core per physical disk, you get the bandwidth of a single disk per
> CPU; for some IO-intensive work you can make the case for two disks/CPU -one
> in, one out, but then you are using more power, and if/when you want to add
> more storage, you have to pull out the disks to stick in new ones. If you go
> for more CPUs, you will probably need more RAM to go with it.
>

Just to throw a wrench in the works, Intel's Nehalem architecture takes DDR3
memory, which is installed in sets of three. So for a dual quad-core rig, you
can get either 6 x 2GB (12GB) or, for an extra $500, 6 x 4GB (24GB). That's a
big step up in price for extra memory in a slave node. 12GB probably won't be
enough, because the mid-range Nehalems support hyper-threading, so you
actually get up to 16 threads running on a dual quad-core setup.

> Then there is the question of where your electricity comes from, what the
> limits for the room are, whether you are billed on power drawn or quoted PSU
> draw, what the HVAC limits are, what the maximum allowed weight per rack is,
> etc, etc.

We're going to start with cabinets in a co-location facility. Most can provide
40 amps per cabinet (usable up to 80% load), so you could fit around 30
single-socket servers, or 15 dual-socket servers, in a single rack.

>
> I'm a fan of low Joule work, though we don't have any benchmarks yet of the
> power efficiency of different clusters; the number of MJ used to do a
> terasort. I'm debating doing some single-cpu tests for this on my laptop, as
> the battery knows how much gets used up by some work.
>
>> 4. In planning storage capacity, how much spare disk space should I take
>> into account for 'scratch'? For now, I'm assuming 1x the input data
>> size.
>>
>
> That you should probably be able to determine on experimental work on
> smaller datasets. Some maps can throw out a lot of data, most reduces do
> actually reduce the final amount.
>
>
> -Steve
>
> (Disclaimer: I'm not making any official recommendations for hardware here,
> just making my opinions known. If you do want an official recommendation
> from HP, talk to your reseller or account manager, someone will look at your
> problem in more detail and make some suggestions. If you have any code/data
> that could be shared for benchmarking, that would help validate those
> suggestions)
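
For anyone who wants to redo the memory and power arithmetic above with their
own numbers, here is a rough back-of-envelope sketch in Python. The function
names are made up for illustration, and it assumes every hardware thread runs
a task at once and that the cabinet's usable current is split evenly across
servers. Neither assumption is exact, so treat the results as a starting point
only.

```python
# Back-of-envelope sizing using the figures quoted in this thread.
# Function names and the "even split" assumptions are illustrative only.

def gb_per_thread(total_gb: float, sockets: int = 2,
                  cores_per_socket: int = 4,
                  threads_per_core: int = 2) -> float:
    """RAM per hardware thread, assuming all threads are busy at once."""
    threads = sockets * cores_per_socket * threads_per_core
    return total_gb / threads


def amps_per_server(circuit_amps: float, max_load: float,
                    servers: int) -> float:
    """Average current budget per server within the usable circuit load."""
    usable_amps = circuit_amps * max_load
    return usable_amps / servers


if __name__ == "__main__":
    # Dual quad-core with hyper-threading: 2 * 4 * 2 = 16 threads.
    for gb in (12, 24):
        print(f"{gb} GB across 16 threads: {gb_per_thread(gb):.2f} GB each")

    # 40 A cabinet at 80% load, split across 30 or 15 servers.
    for count in (30, 15):
        budget = amps_per_server(40, 0.8, count)
        print(f"{count} servers per cabinet: {budget:.2f} A each")
```

With the 12GB option that works out to 0.75GB per thread, and 1.5GB with
24GB, which is the reason the smaller configuration looks tight. The amps
figure only turns into watts once you know the supply voltage at the
facility, which isn't stated here.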