Hi Ted, Chris,

Much appreciate your quick replies. The reason we are looking for smaller
capacity drives is that we are not anticipating huge growth in our data
footprint. We also read somewhere that the larger a drive's capacity, the
more platters it has, and that this could affect drive performance. But it
looks like you can get 1TB drives with only 2 platters. Large capacity drives
should be OK for us as long as they perform equally well.

Also, the systems that we have can each host up to 8 SATA drives. In that
case, would backplanes offer additional advantages?

Any suggestions on 5400 vs. 7200 vs. 10000 RPM disks? I guess 10K RPM disks
would be overkill, given their cost relative to the performance advantage?
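
For reference, here is a rough sketch of how rotation speed maps to average
rotational latency, taken as the time for half a revolution. It ignores seek
time, caching and transfer rate, so it is only a partial picture, and the
function name is just something made up for illustration:

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds (time for half a revolution)."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


# Compare the three spindle speeds mentioned above.
for rpm in (5400, 7200, 10000):
    print(f"{rpm:>5} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

The gap between speeds is a couple of milliseconds per random access, which
would matter less if the workload is mostly long sequential reads and writes.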

Thanks for your input.

-Shrinivas

On Thu, Feb 10, 2011 at 2:48 PM, Chris Collins <[email protected]> wrote:

> Of late we have had serious issues with Seagate drives in our Hadoop
> cluster.  These were purchased over several purchasing cycles, and I am
> pretty sure it wasn't just a single "bad batch".  Because of this we switched
> to buying 2TB Hitachi drives, which seem to have been considerably more
> reliable.
>
> Best
>
> C
> On Feb 10, 2011, at 12:43 PM, Ted Dunning wrote:
>
> > Get bigger disks.  Data only grows and having extra is always good.
> >
> > You can get 2TB drives for <$100 and 1TB for < $75.
> >
> > As far as transfer rates are concerned, any 3Gb/s SATA drive is going to be
> > about the same (ish).  Seek times will vary a bit with rotation speed, but
> > with Hadoop, you will be doing long reads and writes.
> >
> > Your controller and backplane will have a MUCH bigger vote in getting
> > acceptable performance.  With only 4 or 5 drives, you don't have to worry
> > about a super-duper backplane, but you can still kill performance with a
> > lousy controller.
> >
> > On Thu, Feb 10, 2011 at 12:26 PM, Shrinivas Joshi <[email protected]> wrote:
> >
> >> What would be a good hard drive for a 7 node cluster which is targeted to
> >> run a mix of IO and CPU intensive Hadoop workloads? We are looking for
> >> around 1 TB of storage on each node distributed amongst 4 or 5 disks. So
> >> either 250GB * 4 disks or 160GB * 5 disks. Also, each should cost less
> >> than $100 ;)
> >>
> >> I looked at HDD benchmark comparisons on tomshardware, storagereview,
> >> etc. Got overwhelmed by the number of benchmarks and the different
> >> aspects of HDD performance.
> >>
> >> Appreciate your help on this.
> >>
> >> -Shrinivas
> >>
>
>
>
