We're planning out our first HBase cluster, and we'd like to get some feedback 
on our proposed hardware configuration. We intend to use this cluster 
purely for HBase; it will not generally be running MapReduce jobs, nor will we 
be using HDFS for other storage tasks. In addition, our projected total dataset 
size is <1 TB. Our workload is still unclear, but will likely have a roughly 
1:1 read:write ratio, with cell sizes <1 KB and significant use of increment().
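
In case it helps characterize the write side, here's a minimal sketch of the 
counter pattern we have in mind, using the HBase Java client (the exact client 
API varies by HBase version, and the table/family/qualifier names here are just 
placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CounterSketch {
        public static void main(String[] args) throws Exception {
            // Picks up hbase-site.xml from the classpath.
            Configuration conf = HBaseConfiguration.create();

            // "counters", "c", and "hits" are placeholder names.
            HTable table = new HTable(conf, "counters");

            // Server-side atomic read-modify-write; this is the increment()
            // usage that makes up a large share of our writes.
            long newValue = table.incrementColumnValue(
                    Bytes.toBytes("some-row-key"),
                    Bytes.toBytes("c"),
                    Bytes.toBytes("hits"),
                    1L);

            System.out.println("counter is now " + newValue);
            table.close();
        }
    }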

Here's our current front-runner:
2U, 2-socket, 12-core (with HyperThreading for 24 OS-visible threads), probably 
E5645 (2.4~2.67 GHz) or X5675 (3.06~3.46 GHz)
48 GB RAM
2x 300 GB 10k SAS in RAID-1 for OS
12x 600 GB 15k SAS as JBOD for DataNode

We are thinking of putting in 4 of these as DataNode/HRegionServer machines, 
with another pair minus the 600 GB drives as head nodes. The motivation behind 
the high-end disks and capacious RAM is that we anticipate being I/O bound, but 
we're concerned that we may be overspending, and/or selling ourselves short on 
total capacity. Still, this is a long way from the "commodity hardware" mantra, 
and we're considering whether we should go with 7200 RPM drives for more 
capacity and lower cost. Each of these boxes is also a large unit of failure 
when the unexpected happens and takes down a node.
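
For what it's worth, our rough back-of-envelope on raw capacity for the 
front-runner, assuming the HDFS default of 3x replication:

    4 nodes x 12 drives x 600 GB = 28.8 TB raw
    28.8 TB / 3 (replication)    ~= 9.6 TB usable

which leaves a large margin over our <1 TB projection, so raw capacity itself 
doesn't look like the constraint.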

What's the current thinking on disk vs. CPU for pure HBase usage on modern 
hardware? How much disk can one core comfortably service? 1x 7200? 2x 7200? 2x 
15k?
Do we want to lean towards more, cheaper nodes? It would also give us more 
network throughput per disk, which would be nice to speed up re-replication on 
node failure.

One possibility is to use the same chassis, but leave it half-populated: 
1-socket, 6-core, 24 GB RAM, 6x data drives. The question of fast disks vs. big 
disks and how many still applies.

Another possibility is to go with 1U units with 4x 1 TB drives each, although 
this would likely mean giving up on RAID-1 for the OS. These would probably be 
single-socket 6-core E5645s with 24 GB RAM, and we'd be able to get 10 or so of 
them. I'm concerned that four 7200 RPM drives would not be able to keep a 
6-core CPU fed, especially with OS load on one of the drives effectively 
reducing the data spindles to ~3.5.
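
To put the spindles-per-core question in numbers across the three options 
(rough, and ignoring the 15k vs. 7200 RPM difference in per-spindle speed):

    front-runner (2U, 12x 15k):        12 drives / 12 cores ~= 1.0 spindles/core
    half-populated 2U (6x 15k):         6 drives /  6 cores ~= 1.0 spindles/core
    1U, 4x 1 TB (OS sharing a drive): ~3.5 drives /  6 cores ~= 0.6 spindles/core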

I expect that we won't really understand our workload until we have the cluster 
deployed and loaded, but we'd like to make our first pass more than a shot in 
the dark. Any feedback you may have is most appreciated.

-- 
Miles Spielberg
