If those are your actual specs, I would definitely go with 16 of the smaller ones. 128GB heaps are not going to work well in a JVM; you're better off running more nodes with a more common configuration.
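As a rough back-of-the-envelope check on options A and B from the original question quoted below, the sketch here totals each cluster's resources and the share of capacity lost when one node fails. The `Cluster` class and its field names are made up for illustration, and it assumes capacity scales linearly with node count, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical description of a homogeneous cluster (illustrative only)."""
    nodes: int
    cores_per_node: int
    ram_gb_per_node: int
    disk_tb_per_node: int

    def totals(self):
        """Return (total cores, total RAM in GB, total raw disk in TB)."""
        return (
            self.nodes * self.cores_per_node,
            self.nodes * self.ram_gb_per_node,
            self.nodes * self.disk_tb_per_node,
        )

    def loss_fraction(self, failed=1):
        """Fraction of cluster capacity lost when `failed` nodes go down."""
        return failed / self.nodes


# Option A: 8 servers, 4 CPUs x 4 cores, 128GB RAM, 16 x 1TB disks
option_a = Cluster(nodes=8, cores_per_node=16, ram_gb_per_node=128, disk_tb_per_node=16)
# Option B: 16 servers, 2 CPUs x 4 cores, 64GB RAM, 8 x 1TB disks
option_b = Cluster(nodes=16, cores_per_node=8, ram_gb_per_node=64, disk_tb_per_node=8)

for name, cluster in (("A", option_a), ("B", option_b)):
    cores, ram_gb, disk_tb = cluster.totals()
    print(f"Option {name}: {cores} cores, {ram_gb} GB RAM, {disk_tb} TB disk, "
          f"{cluster.loss_fraction():.2%} of capacity lost per failed node")
```

Both options add up to the same raw totals, so the difference is in how they fail: one dead node takes out 12.5% of option A but only 6.25% of option B.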
-Todd

On Mon, Jun 7, 2010 at 1:46 PM, Jean-Daniel Cryans <[email protected]> wrote:

> It really depends on your usage pattern, but there's a balance wrt
> cost VS hardware you must achieve. At StumbleUpon we run with 2xi7,
> 24GB, 4x 1TB and it works like a charm. The only thing I would change
> is maybe more disks/node but that's pretty much it. Some relevant
> questions:
>
> - Do you have any mem-intensive jobs? If so, figure how many tasks
>   you'll run per node and make the RAM fit the load.
> - Do you plan to serve data out of HBase or will you just use it for
>   MapReduce? Or will it be a mix (not recommended)?
>
> Also, keep in mind that losing 1 machine over 8 compared to 1 over 16
> drastically changes the performance of your system at the time of the
> failure.
>
> About virtualization, it doesn't make sense. Also your disks should be in
> JBOD.
>
> J-D
>
> On Wed, Jun 2, 2010 at 11:12 PM, Sean Bigdatafun
> <[email protected]> wrote:
> > I am thinking of the following problem lately. I started thinking of this
> > problem in the following context.
> >
> > I have a predefined budget and I can either
> > -- A) purchase 8 more powerful servers (4cpu x 4 cores/cpu + 128GB mem +
> > 16 x 1TB disk) or
> > -- B) purchase 16 less powerful servers(2cpu x 4 cores/cpu + 64GB mem + 8
> > x 1TB disk)
> > NOTE: I am basically making up a half housepower scenario
> > -- Let's say I am going to use 10Gbps network switch and each machine has
> > a 10Gbps network card
> >
> > In the above scenario, does A or B perform better or relatively same? -- I
> > guess this really depends on Hadoop's map/reduce's scheduler.
> >
> > And then I have a following question: does it make sense to virtualize a
> > Hadoop datanode at all? (if the answer to above question is "relatively
> > same", I'd say it does not make sense)
> >
> > Thanks,
> > Sean

--
Todd Lipcon
Software Engineer, Cloudera
