Of late we have had serious issues with Seagate drives in our Hadoop cluster. These were purchased over several purchasing cycles, so we're pretty sure it wasn't just a single "bad batch". Because of this we switched to buying 2TB Hitachi drives, which seem to have been considerably more reliable.
Best
C

On Feb 10, 2011, at 12:43 PM, Ted Dunning wrote:

> Get bigger disks. Data only grows and having extra is always good.
>
> You can get 2TB drives for <$100 and 1TB for < $75.
>
> As far as transfer rates are concerned, any 3GB/s SATA drive is going to be
> about the same (ish). Seek times will vary a bit with rotation speed, but
> with Hadoop, you will be doing long reads and writes.
>
> Your controller and backplane will have a MUCH bigger vote in getting
> acceptable performance. With only 4 or 5 drives, you don't have to worry
> about super-duper backplane, but you can still kill performance with a lousy
> controller.
>
> On Thu, Feb 10, 2011 at 12:26 PM, Shrinivas Joshi <[email protected]> wrote:
>
>> What would be a good hard drive for a 7 node cluster which is targeted to
>> run a mix of IO and CPU intensive Hadoop workloads? We are looking for
>> around 1 TB of storage on each node distributed amongst 4 or 5 disks. So
>> either 250GB * 4 disks or 160GB * 5 disks. Also it should be less than 100$
>> each ;)
>>
>> I looked at HDD benchmark comparisons on tomshardware, storagereview etc.
>> Got overwhelmed with the # of benchmarks and different aspects of HDD
>> performance.
>>
>> Appreciate your help on this.
>>
>> -Shrinivas
>>
