Ryan Rawson wrote:
> we are using dell 1950s, 8cpu 16gb ram, dual 1tb disk. you can get
> machines in this range for in the $2k range. I run hbase on 1tb of
> data on 20 of these. You can probably look at doing 15+ machines.
>
> The master machine doesnt do much work, but it has to be reliable.
> Raid, dual power supply, etc. If it goes down, namenode takes your
> entire system down. I run them on a standard node, but with some of
> the dual power features enabled. The regionservers do way more, so in
> theory you could have a smaller master, but not too small. Probably
> best to stick to 1 node time, keep it cheap.

I'm actually surprised that, in a production cluster with hardware like this, you'd want to make a strong (i.e. hardware) differentiation between your namenode / datanodes, job / task trackers, etc. They're all probably similar enough, and the cost difference between a namenode 1950 and a datanode 1950 is probably only in the extra memory and redundancy on the namenode; it might not be all that different at all, really. (It's possible that's what you're saying here and I'm reading it wrong, in which case disregard this.)

Ideally, your namenode never goes down / away, but should it happen, you have a lot to gain from knowing that any machine could replace the namenode in terms of hardware capacity. If the fs image / edit logs are made available for recovery, you could recover much more quickly than if the namenode requires a different hardware configuration (again, using the namenode as an example).

I've worked in a large number of situations with hundreds of machines like this (1850s, 1950s, 2800 / 2900 series) and found having a small number of hardware configurations to be a huge benefit to rapid replacement in HA situations. Of course, you're trading a bit of specialization for consistency and "swappability", and that's a choice that might not apply in all cases, although it paid off in mine.

Just to clarify, this wasn't specifically an hbase / hadoop cluster, but the idea of limiting variability in a data center (I think) still applies here.

Thanks!
--
Eric Sammer
[email protected]
http://esammer.blogspot.com
