Hi all,
We are evaluating HBase for storing metadata at a very large scale. Our current
architecture looks like this:
Machine 1:
Runs Client 1
Runs Region Server 1
Runs Data Node 1
Machine n:
Runs Client n
Runs Region Server n
Runs Data Node n
Now, say we have only one region for the data set at the moment, hosted on
Region Server 1, and it's maxing out. If a flood of new requests comes in to
Machine n and it tries to store the data, will Region Server n store it locally
on its Data Node n, or will the requests be routed to Region Server 1, with a
new region created there after it splits?
The reason I ask is that I want to see whether a client can be made sticky to a
region server. That way, if a user with id 1111 comes in, he will always be sent
to Client 1, because we know Region Server 1 holds his region; we can determine
that upfront from his id. We're just trying to minimize latency further. (Of
course, I understand that if nodes are down, there will need to be a way to
route traffic to another host for the users that fall in that bucket.)
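To make the sticky-routing idea concrete, here is a minimal, self-contained sketch of the lookup we have in mind. It assumes we know each region's start row key and its current hosting server (in practice the client would learn this from the META table); the class name, region boundaries, and server names below are all made up for illustration:

```java
import java.util.TreeMap;

public class RegionRouter {
    // Maps each region's start row key to the server currently hosting it.
    private final TreeMap<String, String> startKeyToServer = new TreeMap<>();

    public void addRegion(String startKey, String server) {
        startKeyToServer.put(startKey, server);
    }

    // The region holding a row is the one with the greatest start key
    // <= the row key, so a floor lookup on the sorted start keys finds it.
    public String serverFor(String rowKey) {
        return startKeyToServer.floorEntry(rowKey).getValue();
    }

    public static void main(String[] args) {
        RegionRouter router = new RegionRouter();
        router.addRegion("", "regionserver-1");     // first region: empty start key
        router.addRegion("5000", "regionserver-2"); // hypothetical split at id 5000
        System.out.println(router.serverFor("1111")); // regionserver-1
        System.out.println(router.serverFor("7777")); // regionserver-2
    }
}
```

The idea would be to run this mapping in front of the clients, so a request for user 1111 lands on the machine whose region server already hosts that key range. The mapping goes stale whenever a region splits or moves, so it would need to be refreshed, which is part of what we're asking about.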
Thanks in advance.