Hi Ninad,
I think the answer depends on the anticipated scale of the deployment.
For small clusters (up to a few racks, ~40 servers per rack) I don't think
there is any significant performance hit from separating storage and
computation. Presumably all servers will share the same large GigE switch --
or maybe a redundant L2 pair via bonded interfaces for failover -- or a few of
them stacked with high-speed interconnects. This would relieve the storage
nodes of the RAM and CPU burden of the computational tasks, as you are
thinking, providing more headroom in exchange for a quite modest performance
penalty.
(However, if your computational load is high enough that the nodes are
overburdened and unstable, there is no alternative...) In the future this
consideration might change if DFS clients are given some capability to find
blocks on local disk via an optimized I/O path.
In a large cluster there might well be a significant performance impact. In a
common deployment scenario there are rack-local switched fabrics and another
switched fabric for uplinks from the racks. So a rack would have a switched
GigE backplane or similar, but inter-rack connections might be single GigE
uplinks -- a ~40-to-1 reduction in capacity in the worst case -- or maybe
10 GigE uplinks, still a ~4-to-1 reduction. Therefore it would be desirable to
distribute the computation into the racks where the data is located.

When a region is deployed to a region server, the underlying blocks on DFS are
not immediately migrated, but after a compaction -- a rewrite -- the
underlying blocks will be available on rack-local data nodes, according to my
understanding of how DFS places replicas upon write. So, after a split,
daughter regions will have their blocks appropriately located in a timely
manner.

Beyond that, I wonder if it would be beneficial to schedule major compaction
more frequently than the 24 hour default for datacenter-scale deployments --
something like every 8 hours -- and you might also consider triggering a major
compaction on important tables after cluster (re)init. Region deployment in a
system at steady state should see relatively little churn, so this will have
the effect of optimizing block placement for region store access.
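To make the uplink arithmetic above concrete, here is a back-of-envelope
sketch. The rack size and link speeds are assumptions chosen to match the
figures above (40 servers per rack, 1 Gb/s NICs), not measurements from any
particular deployment:

```java
// Worst-case uplink oversubscription: every server in the rack tries to
// push its full NIC bandwidth across the rack uplink at the same time.
public class Oversubscription {
    public static double ratio(int serversPerRack, double nicGbps, double uplinkGbps) {
        // Aggregate demand from the rack divided by uplink capacity.
        return (serversPerRack * nicGbps) / uplinkGbps;
    }

    public static void main(String[] args) {
        System.out.println("1 GigE uplink:  " + ratio(40, 1.0, 1.0) + ":1");  // 40.0:1
        System.out.println("10 GigE uplink: " + ratio(40, 1.0, 10.0) + ":1"); // 4.0:1
    }
}
```

Real fabrics rarely see the worst case, but the ratio is a useful bound when
deciding whether computation should follow the data into the racks.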
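On the compaction interval: the period is governed by the
hbase.hregion.majorcompaction property, in milliseconds. A sketch of what an
8 hour setting might look like in hbase-site.xml (check the default for your
version before copying this):

```xml
<!-- hbase-site.xml: run major compactions every 8 hours instead of the
     24 hour default. Value is in milliseconds: 8 * 60 * 60 * 1000. -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>28800000</value>
</property>
```

For the one-off case after cluster (re)init, a major compaction of a specific
table can be kicked off from the HBase shell with major_compact 'tablename',
if your version's shell supports that command.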
Submitted for your consideration,
- Andy
________________________________
From: Ninad Raut <[email protected]>
To: hbase-user <[email protected]>
Cc: Ranjit Nair <[email protected]>
Sent: Thursday, May 14, 2009 2:56:04 AM
Subject: Keeping Compute Nodes separate from the region server node -- pros
and cons
Hi,
I want to get a design perspective here as to what the advantages would be of
separating region servers and compute nodes (to run MapReduce tasks).
Will separating data nodes from compute nodes reduce the load on the servers
and avoid swapping problems?
Will this separation make MapReduce tasks less efficient, since we are doing
away with data locality?
Regards,
Ninad