We're planning out our first HBase cluster, and we'd like some feedback on our proposed hardware configuration. We intend to use this cluster purely for HBase: it will not generally be running MapReduce jobs, nor will we be using HDFS for other storage tasks. Our projected total dataset size is <1 TB. Our workload is still unclear, but will likely be roughly a 1:1 read:write ratio, with cell sizes <1 KB and significant use of increment().
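For context, even with generous assumptions the raw-capacity requirement is modest. Here's the quick sketch we've been using; the replication factor and the 2x headroom are our guesses, not measured requirements:

    # Rough raw-disk requirement for the projected dataset.
    DATASET_TB = 1.0  # projected total dataset size (upper bound)
    REPLICATION = 3   # assumed default HDFS replication factor
    HEADROOM = 2.0    # assumed slack for compactions, logs, and growth

    raw_tb = DATASET_TB * REPLICATION * HEADROOM
    print(f"~{raw_tb:.0f} TB of raw disk should be ample")  # ~6 TB

Every configuration we describe below clears that bar by a wide margin, which is part of why we suspect we may be overspending.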
Here's our current front-runner:

  - 2U, 2-socket, 12-core (Hyper-Threading for 24 OS-visible threads), probably E5645 (2.4~2.67 GHz) or X5675 (3.06~3.46 GHz)
  - 48 GB RAM
  - 2x 300 GB 10k SAS in RAID-1 for the OS
  - 12x 600 GB 15k SAS as JBOD for the DataNode

We're thinking of putting in 4 of these as DataNode/HRegionServer machines, with another pair (minus the 600 GB drives) as head nodes.

The motivation behind the high-end disks and capacious RAM is that we anticipate being I/O bound, but we're concerned that we may be overspending and/or selling ourselves short on total capacity. This is also a long way from the "commodity hardware" mantra, so we're considering whether we should go with 7200 RPM drives for more capacity at lower cost. Each 2U box is also a big unit of failure when the unexpected happens and takes down a node.

What's the current thinking on disk vs. CPU for pure HBase usage on modern hardware? How much disk can one core comfortably service: 1x 7200 RPM? 2x 7200? 2x 15k? Should we lean towards more, cheaper nodes? That would also give us more network throughput per disk, which would help speed up re-replication after a node failure.

One possibility is to use the same chassis but leave it half-populated: 1 socket, 6 cores, 24 GB RAM, 6x data drives. The question of fast disks vs. big disks, and how many, still applies.

Another possibility is to go with 1U units with 4x 1 TB drives each, although this would likely mean giving up RAID-1 for the OS. These would probably be 6-core E5645 machines with 24 GB RAM, and we'd be able to get 10 or so of them. I'm concerned that 4x 7200 RPM drives would not be able to keep a 6-core CPU fed, especially with OS load on one drive effectively reducing the data spindles to ~3.5 (rough throughput math in the sketch below).

I expect we won't really understand our workload until we have the cluster deployed and loaded, but we'd like our first pass to be more than a shot in the dark. Any feedback you may have is most appreciated.
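Here's the back-of-the-envelope throughput and re-replication arithmetic behind the questions above. It's a rough sketch: the per-drive MB/s figures, the usable GigE bandwidth, and replication factor 3 are all assumptions on our part, not measurements:

    # Back-of-the-envelope disk-per-core and re-replication math.
    # All MB/s figures are assumed ballpark numbers, not benchmarks.
    SEQ_15K = 150.0   # assumed sequential MB/s per 15k SAS drive
    SEQ_7200 = 100.0  # assumed sequential MB/s per 7200 RPM drive
    GIGE = 110.0      # assumed usable MB/s on one GigE link
    STORED_TB = 3.0   # <1 TB dataset x assumed HDFS replication factor 3

    # name: (cores, effective data spindles, MB/s per spindle, node count)
    options = {
        "4x 2U, 12x 15k":     (12, 12.0, SEQ_15K, 4),
        "10x 1U, ~3.5x 7200": (6, 3.5, SEQ_7200, 10),
    }

    for name, (cores, spindles, per_disk, nodes) in options.items():
        disk_mbps = spindles * per_disk
        per_node_gb = STORED_TB * 1024 / nodes
        # When a node dies, the survivors re-replicate its share; each
        # contributes roughly the lesser of its NIC and disk bandwidth.
        cluster_mbps = (nodes - 1) * min(disk_mbps, GIGE)
        minutes = per_node_gb * 1024 / cluster_mbps / 60
        print(f"{name}: {disk_mbps / cores:.0f} MB/s of disk per core, "
              f"~{per_node_gb:.0f} GB/node, "
              f"re-replication in roughly {minutes:.0f} min")

By this math the 1U option gives up per-core disk bandwidth but re-replicates a lost node several times faster, since the same data is spread over more NICs.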
-- Miles Spielberg