We're planning out our first HBase cluster, and we'd like some feedback on
our proposed hardware configuration. We intend to use this cluster purely
for HBase; it will not generally be running MapReduce jobs, nor will we be
using HDFS for other storage tasks. In addition, our projected total dataset
size is <1 TB. Our workload is still unclear, but it will likely have a
roughly 1:1 read:write ratio, with cell sizes <1 KB and significant use of
increment().
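
For concreteness, by increment() we mean ordinary HBase counter updates,
roughly like the sketch below (table, family, and qualifier names are
placeholders, not our real schema):

    // Minimal sketch of the kind of counter update we mean; names are placeholders.
    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CounterSketch {
        public static void main(String[] args) throws IOException {
            HTable table = new HTable(HBaseConfiguration.create(), "events");
            // Atomically bump a per-row counter; a large share of our writes look like this.
            table.incrementColumnValue(Bytes.toBytes("some-row-key"),
                    Bytes.toBytes("d"), Bytes.toBytes("count"), 1L);
            table.close();
        }
    }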

Here's our current front-runner:
2U, 2-socket, 12-core (with HyperThreading for 24 OS-visible threads),
probably E5645 (2.4~2.67 GHz) or X5675 (3.06~3.46 GHz)
48 GB RAM
2x 300 GB 10k SAS in RAID-1 for OS
12x 600 GB 15k SAS as JBOD for DataNode

We are thinking of putting in four of these as DataNode/HRegionServer
machines, with another pair, minus the 600 GB drives, as head nodes. The
motivation behind the high-end disks and capacious RAM is that we anticipate
being I/O bound, but we're concerned that we may be overspending and/or
selling ourselves short on total capacity. This is also a long way from the
"commodity hardware" mantra, so we're considering whether we should go with
7200 RPM drives for more capacity at lower cost. Each of these machines is
also a big unit of failure when the unexpected happens and takes down a node.
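
For scale, a quick back-of-envelope on capacity (assuming the HDFS default
3x replication factor and ignoring compression):

    // Back-of-envelope capacity check for the 4-node front-runner.
    // Assumes the HDFS default of 3x replication and no compression.
    public class CapacityCheck {
        public static void main(String[] args) {
            double perNodeRawTb = 12 * 0.6;                  // 12x 600 GB data drives per node
            int dataNodes = 4;
            double clusterRawTb = perNodeRawTb * dataNodes;  // ~28.8 TB raw
            double usableTb = clusterRawTb / 3.0;            // ~9.6 TB after replication
            System.out.printf("raw: %.1f TB, usable: %.1f TB (projected dataset < 1 TB)%n",
                    clusterRawTb, usableTb);
        }
    }

By that math, usable capacity is roughly 10x our projected dataset, so raw
capacity is probably not the constraint; the question is really about I/O.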

What's the current thinking on disk vs. CPU for pure HBase usage on modern
hardware? How much disk can one core comfortably service? 1x 7200? 2x 7200?
2x 15k?
Should we lean towards more, cheaper nodes? That would also give us more
network throughput per disk, which would help speed up re-replication after
a node failure.
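
To put made-up but plausible numbers on the re-replication point (assuming
~1 TB of user data at 3x replication, one GigE NIC per node at ~100 MB/s
usable, and the lost blocks re-written evenly across the survivors):

    // Rough sketch of how node count affects re-replication time after a failure.
    // Assumptions (ours, not measured): ~3 TB of blocks cluster-wide, one GigE NIC
    // per node at ~100 MB/s usable, lost blocks spread evenly across survivors.
    public class ReReplicationSketch {
        static double hoursToRecover(int dataNodes) {
            double lostGB = 3000.0 / dataNodes;               // dead node's share of blocks
            double aggregateMBps = (dataNodes - 1) * 100.0;   // survivors' combined ingest rate
            return (lostGB * 1024.0 / aggregateMBps) / 3600.0;
        }
        public static void main(String[] args) {
            System.out.printf("4 nodes: ~%.1f h, 10 nodes: ~%.1f h%n",
                    hoursToRecover(4), hoursToRecover(10));
        }
    }

More, smaller nodes shrink both the data lost per failure and the time to
re-protect it, which is part of the appeal of the smaller-node options below.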

One possibility is to use the same chassis but leave it half-populated:
1-socket, 6-core, 24 GB RAM, 6x data drives. The question of fast disks vs.
big disks, and how many, still applies.

Another possibility is to go with 1U units with 4x 1 TB drives each,
although this would likely mean giving up RAID-1 for the OS. These would
probably be single-socket 6-core E5645s with 24 GB RAM, and we'd be able to
get 10 or so of them. I'm concerned that four 7200 RPM drives would not be
able to keep a 6-core CPU fed, especially with OS load on one of the drives
effectively reducing the data spindles to ~3.5.
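
For comparison, the effective data spindles per physical core across the
three configurations we're weighing (counting the 1U boxes at ~3.5 spindles
as above):

    // Spindles-per-core across the three candidate configurations.
    // The 1U figure uses ~3.5 effective data spindles, per the OS-load caveat above.
    public class SpindlesPerCore {
        public static void main(String[] args) {
            System.out.printf("2U front-runner:   %.2f spindles/core%n", 12 / 12.0);
            System.out.printf("2U half-populated: %.2f spindles/core%n", 6 / 6.0);
            System.out.printf("1U option:         %.2f spindles/core%n", 3.5 / 6.0);
        }
    }

So the 1U boxes give up roughly 40% of the spindle-to-core ratio, with
7200 RPM rather than 15k disks on top of that.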

I expect that we won't really understand our workload until we have the
cluster deployed and loaded, but we'd like to make our first pass more than
a shot in the dark. Any feedback you may have is most appreciated.

--
Miles Spielberg
