Newly added regionserver is not serving requests

Thanasis Naskos Fri, 04 Oct 2013 02:08:51 -0700

I'm setting up an HBase cluster on a cloud infrastructure.
HBase version: 0.94.11
Hadoop version: 1.0.4

Currently I have 4 nodes in my cluster (1 master, 3 regionservers) and I'm using YCSB (Yahoo benchmarks) to create a table (500,000 rows) and send requests (asynchronous requests). Everything works fine with this setup (I'm monitoring the whole process with ganglia and I'm getting lambda, throughput and latency, combined with YCSB's output), but the problem occurs when I add a new regionserver on-the-fly: it doesn't get any requests.


What "on-the-fly" means:

While YCSB is sending requests to the cluster, I'm adding new regionservers using Python scripts.


Addition Process (while the cluster is serving requests):

1. Creating a new VM which will act as the new regionserver and
   configuring every needed aspect (hbase, hadoop, /etc/hosts, connection
   to the private network, etc)
2. Stopping the **hbase** balancer
3. Configuring every node in the cluster with the new node's information
     * adding the hostname to the regionservers files
     * adding the hostname to hadoop's slaves file
     * adding the hostname and IP to the /etc/hosts file of every node
     * etc
4. Executing on the master node:
     * `hadoop/bin/start-dfs.sh`
     * `hadoop/bin/start-mapred.sh`
     * `hbase/bin/start-hbase.sh`
       (I've also tried running `hbase start regionserver` on the newly
       added node; it does exactly the same as the last command, i.e. it
       starts the regionserver)
5. Once the newly added node is up and running, executing the **hadoop**
   load balancer
6. When the hadoop load balancer finishes, starting the **hbase**
   load balancer again
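
For illustration, steps 2 and 4-6 above could be scripted roughly as follows. This is only a sketch, not the actual scripts used here: `HBASE_HOME` is a hypothetical install path, `addition_commands` and `hbase_shell` are made-up helper names, and using the HBase shell's `balance_switch` command to stop and restart the hbase balancer is an assumption. The sketch only builds the command list (a dry run) and does not execute anything.

```python
from typing import List

# Hadoop path as it appears in the dfsadmin command quoted in this post.
HADOOP_HOME = "/opt/hadoop-1.0.4"
# Hypothetical HBase install path (an assumption, not taken from the post).
HBASE_HOME = "/opt/hbase-0.94.11"


def hbase_shell(command: str) -> str:
    """Return a shell line that pipes one command into the HBase shell."""
    return f'echo "{command}" | {HBASE_HOME}/bin/hbase shell'


def addition_commands(balancer_threshold: int = 2) -> List[str]:
    """Commands to run on the master, in order, once the new VM is configured.

    Step 3 (editing the regionservers, slaves and /etc/hosts files) is left out.
    """
    return [
        # Step 2: stop the hbase balancer (balance_switch is an assumption).
        hbase_shell("balance_switch false"),
        # Step 4: start the daemons; the new node is picked up from the
        # slaves and regionservers files.
        f"{HADOOP_HOME}/bin/start-dfs.sh",
        f"{HADOOP_HOME}/bin/start-mapred.sh",
        f"{HBASE_HOME}/bin/start-hbase.sh",
        # Step 5: run the hadoop balancer.
        f"{HADOOP_HOME}/bin/hadoop balancer -threshold {balancer_threshold}",
        # Step 6: start the hbase balancer again.
        hbase_shell("balance_switch true"),
    ]


if __name__ == "__main__":
    for cmd in addition_commands():
        print(cmd)
```

Keeping the sequence as plain data makes it easy to log or review the exact order before running it over ssh on the master.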

I'm connecting over ssh to the master node and checking that the load balancers (hbase/hadoop) did their job: both the blocks and the regions are uniformly spread across all the regionservers/slaves, including the new ones.

But when I run status 'simple' in the hbase shell, I see that the new regionservers are not getting any requests. (Below is the output of the command after adding 2 new regionservers, "okeanos-nodes-4/5".)


hbase(main):008:0> status 'simple'
5 live servers
    okeanos-nodes-1:60020 1380865800330
        requestsPerSecond=5379, numberOfOnlineRegions=4, usedHeapMB=175, maxHeapMB=3067
    okeanos-nodes-2:60020 1380865800738
        requestsPerSecond=5674, numberOfOnlineRegions=4, usedHeapMB=161, maxHeapMB=3067
    okeanos-nodes-5:60020 1380867725605
        requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=27, maxHeapMB=3067
    okeanos-nodes-3:60020 1380865800162
        requestsPerSecond=3871, numberOfOnlineRegions=5, usedHeapMB=162, maxHeapMB=3067
    okeanos-nodes-4:60020 1380866702216
        requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=29, maxHeapMB=3067
0 dead servers
Aggregate load: 14924, regions: 19

The fact that they don't serve any requests is also evidenced by the CPU usage: on a serving regionserver it is about 70%, while on these 2 regionservers it is about 2%.
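
As a rough illustration of the check above, the status 'simple' output can be scanned for servers that hold regions but report zero requests. This is a sketch only: `parse_status_simple` and `idle_servers` are hypothetical helper names, and the parsing assumes the 0.94-style layout shown above (a `host:port startcode` line followed by a line of `key=value` metrics).

```python
import re
from typing import Dict, List

# "host:port startcode" line, e.g. "okeanos-nodes-1:60020 1380865800330"
SERVER_RE = re.compile(r"^\s*(\S+):(\d+)\s+\d+\s*$")
REQUESTS_RE = re.compile(r"requestsPerSecond=(\d+)")
REGIONS_RE = re.compile(r"numberOfOnlineRegions=(\d+)")


def parse_status_simple(text: str) -> Dict[str, Dict[str, int]]:
    """Map each server name to its requests/regions counters."""
    servers: Dict[str, Dict[str, int]] = {}
    current = None
    for line in text.splitlines():
        match = SERVER_RE.match(line)
        if match:
            current = match.group(1)
            servers[current] = {}
            continue
        if current is None:
            continue
        req = REQUESTS_RE.search(line)
        reg = REGIONS_RE.search(line)
        if req:
            servers[current]["requests"] = int(req.group(1))
        if reg:
            servers[current]["regions"] = int(reg.group(1))
    return servers


def idle_servers(servers: Dict[str, Dict[str, int]]) -> List[str]:
    """Servers that host at least one region but serve no requests."""
    return sorted(
        name for name, s in servers.items()
        if s.get("regions", 0) > 0 and s.get("requests", 0) == 0
    )


SAMPLE = """\
5 live servers
    okeanos-nodes-1:60020 1380865800330
        requestsPerSecond=5379, numberOfOnlineRegions=4, usedHeapMB=175, maxHeapMB=3067
    okeanos-nodes-2:60020 1380865800738
        requestsPerSecond=5674, numberOfOnlineRegions=4, usedHeapMB=161, maxHeapMB=3067
    okeanos-nodes-5:60020 1380867725605
        requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=27, maxHeapMB=3067
    okeanos-nodes-3:60020 1380865800162
        requestsPerSecond=3871, numberOfOnlineRegions=5, usedHeapMB=162, maxHeapMB=3067
    okeanos-nodes-4:60020 1380866702216
        requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=29, maxHeapMB=3067
0 dead servers
Aggregate load: 14924, regions: 19
"""

if __name__ == "__main__":
    print(idle_servers(parse_status_simple(SAMPLE)))
```

On the output above this singles out okeanos-nodes-4 and okeanos-nodes-5, the two newly added regionservers.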

Below is the output of `hadoop dfsadmin -report`; as you can see, the blocks are evenly distributed (according to `hadoop balancer -threshold 2`).


root@okeanos-nodes-master:~# /opt/hadoop-1.0.4/bin/hadoop dfsadmin -report
Configured Capacity: 105701683200 (98.44 GB)
Present Capacity: 86440648704 (80.5 GB)
DFS Remaining: 84188446720 (78.41 GB)
DFS Used: 2252201984 (2.1 GB)
DFS Used%: 2.61%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 5 (5 total, 0 dead)

Name: 10.0.0.11:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 309166080 (294.84 MB)
Non DFS Used: 3851579392 (3.59 GB)
DFS Remaining: 16979591168(15.81 GB)
DFS Used%: 1.46%
DFS Remaining%: 80.32%
Last contact: Fri Oct 04 11:30:31 EEST 2013


Name: 10.0.0.3:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 531652608 (507.02 MB)
Non DFS Used: 3852300288 (3.59 GB)
DFS Remaining: 16756383744(15.61 GB)
DFS Used%: 2.51%
DFS Remaining%: 79.26%
Last contact: Fri Oct 04 11:30:32 EEST 2013


Name: 10.0.0.5:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 502910976 (479.61 MB)
Non DFS Used: 3853029376 (3.59 GB)
DFS Remaining: 16784396288(15.63 GB)
DFS Used%: 2.38%
DFS Remaining%: 79.4%
Last contact: Fri Oct 04 11:30:32 EEST 2013


Name: 10.0.0.4:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 421974016 (402.43 MB)
Non DFS Used: 3852365824 (3.59 GB)
DFS Remaining: 16865996800(15.71 GB)
DFS Used%: 2%
DFS Remaining%: 79.78%
Last contact: Fri Oct 04 11:30:29 EEST 2013


Name: 10.0.0.10:50010
Decommission Status : Normal
Configured Capacity: 21140336640 (19.69 GB)
DFS Used: 486498304 (463.96 MB)
Non DFS Used: 3851759616 (3.59 GB)
DFS Remaining: 16802078720(15.65 GB)
DFS Used%: 2.3%
DFS Remaining%: 79.48%
Last contact: Fri Oct 04 11:30:29 EEST 2013
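
A rough way to double-check the distribution from this report is to pull out the per-datanode "DFS Used%" values and look at their spread. The sketch below is illustrative only: `dfs_used_percent` is a hypothetical helper name, and `SAMPLE` is an abbreviated copy of the report above that keeps just the lines the parser reads.

```python
from typing import Dict


def dfs_used_percent(report: str) -> Dict[str, float]:
    """Map each datanode ("Name:" line) to its "DFS Used%" value."""
    usage: Dict[str, float] = {}
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith("Name:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("DFS Used%:") and current is not None:
            # The cluster-wide "DFS Used%" line comes before any "Name:"
            # line, so it is skipped by the `current is not None` check.
            usage[current] = float(line.split(":", 1)[1].strip().rstrip("%"))
    return usage


SAMPLE = """\
DFS Used%: 2.61%
Datanodes available: 5 (5 total, 0 dead)
Name: 10.0.0.11:50010
DFS Used%: 1.46%
Name: 10.0.0.3:50010
DFS Used%: 2.51%
Name: 10.0.0.5:50010
DFS Used%: 2.38%
Name: 10.0.0.4:50010
DFS Used%: 2%
Name: 10.0.0.10:50010
DFS Used%: 2.3%
"""

if __name__ == "__main__":
    usage = dfs_used_percent(SAMPLE)
    spread = max(usage.values()) - min(usage.values())
    print(f"{len(usage)} datanodes, spread of DFS Used%: {spread:.2f} points")
```

For the report above, the gap between the most and least used datanode is about one percentage point, which fits the `-threshold 2` balancer run.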

I've tried stopping YCSB, restarting the hbase master and restarting YCSB, but with no luck: these 2 nodes don't serve any requests!

As there are many log and conf files, I have created a zip file with the logs and confs (both hbase and hadoop) of the master, a healthy regionserver serving requests and a regionserver not serving requests:
https://dl.dropboxusercontent.com/u/13480502/hbase_hadoop_logs__conf.zip


Thank you in advance!!
