Get bigger disks. Data only grows, and having extra capacity is always good. You can get 2 TB drives for under $100 and 1 TB drives for under $75.
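For a rough sense of why the extra capacity matters, here is a back-of-the-envelope sketch. The 7 nodes and roughly 1 TB per node come from the question quoted below. The replication factor of 3 is the HDFS default. The 25% reserve for intermediate (non-HDFS) data and the function name are purely my assumptions for illustration, so plug in your own numbers.

```python
def usable_hdfs_tb(nodes, raw_tb_per_node, replication=3, temp_reserve=0.25):
    """Estimate how many TB of unique data a cluster can hold.

    replication:  copies HDFS keeps of each block (3 is the HDFS default).
    temp_reserve: fraction of raw disk set aside for intermediate job
                  output and other non-HDFS use (assumed value, tune it).
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    if not 0 <= temp_reserve < 1:
        raise ValueError("temp_reserve must be in [0, 1)")
    raw_tb = nodes * raw_tb_per_node
    hdfs_tb = raw_tb * (1 - temp_reserve)  # space left for HDFS blocks
    return hdfs_tb / replication           # each block is stored `replication` times


if __name__ == "__main__":
    # Compare the planned 1 TB per node against larger per-node capacities.
    for per_node in (1, 4, 8):
        print(f"{per_node} TB/node -> {usable_hdfs_tb(7, per_node):.2f} TB of unique data")
```

Under those assumptions, 7 nodes with 1 TB each hold well under 2 TB of actual data, which is why I would not size the disks to exactly what you think you need today.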
As far as transfer rates are concerned, any 3 Gb/s SATA drive is going to be about the same (ish). Seek times will vary a bit with rotation speed, but with Hadoop you will be doing long sequential reads and writes, so seek time matters less. Your controller and backplane will have a MUCH bigger vote in getting acceptable performance. With only 4 or 5 drives, you don't have to worry about a super-duper backplane, but you can still kill performance with a lousy controller.

On Thu, Feb 10, 2011 at 12:26 PM, Shrinivas Joshi <[email protected]> wrote:

> What would be a good hard drive for a 7 node cluster which is targeted to
> run a mix of IO and CPU intensive Hadoop workloads? We are looking for
> around 1 TB of storage on each node distributed amongst 4 or 5 disks. So
> either 250GB * 4 disks or 160GB * 5 disks. Also it should be less than 100$
> each ;)
>
> I looked at HDD benchmark comparisons on tomshardware, storagereview etc.
> Got overwhelmed with the # of benchmarks and different aspects of HDD
> performance.
>
> Appreciate your help on this.
>
> -Shrinivas
>
