So one region of usertable got lost?
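
A rough way to put the two fsck summaries quoted below side by side is the following Python sketch. The function names are made up for illustration, and it assumes every summary line has the "Label: number" shape seen in the output below; decimal values such as the average replication are skipped.

```python
import re

# Matches lines such as "Total files:    34 (Files currently being written: 3)".
# Only whole-number values are captured; "2.0" style values are ignored.
SUMMARY_LINE = re.compile(r"([A-Za-z][A-Za-z \-()]*?):\s+(\d+)(?![\d.])")


def parse_fsck_summary(text):
    """Return {label: integer value} for each 'Label: number' line in text."""
    summary = {}
    for line in text.splitlines():
        # Drop e-mail quote markers and surrounding whitespace.
        line = line.lstrip("> ").strip()
        match = SUMMARY_LINE.match(line)
        if match:
            summary[match.group(1).strip()] = int(match.group(2))
    return summary


def diff_summaries(before, after):
    """Return {label: after - before} for labels whose value changed."""
    return {
        label: after[label] - before[label]
        for label in before
        if label in after and after[label] != before[label]
    }
```

With the numbers quoted below, the difference is 1173979061 B less total size, 4 fewer dirs, 14 fewer files and 30 fewer blocks, while fsck still reports HEALTHY in both runs.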

Can you pastebin the master server log from around the time you killed the 
region server?
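
In case the log is large, something like this Python sketch could cut out the relevant window. It assumes the log4j layout where each entry starts with a timestamp such as "2013-11-28 10:00:01,123"; `lines_around` and the five-minute default are illustrative choices, not part of HBase.

```python
from datetime import datetime, timedelta

# Assumption: entries start with a timestamp like "2013-11-28 10:00:01,123".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = len("2013-11-28 10:00:01,123")


def lines_around(lines, center, minutes=5):
    """Yield log lines stamped within +/- `minutes` of `center`.

    Lines without a leading timestamp (stack traces, wrapped messages)
    are kept when the most recent timestamped line was inside the window.
    """
    window = timedelta(minutes=minutes)
    inside = False
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        except ValueError:
            # Continuation line: follow the decision made for the last entry.
            if inside:
                yield line
            continue
        inside = abs(stamp - center) <= window
        if inside:
            yield line
```

It can be fed an open log file and the approximate kill time, for example `lines_around(open(path), datetime(2013, 11, 28, 10, 5), minutes=10)`, where `path` and the time are placeholders to fill in.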

Thanks

On Nov 28, 2013, at 2:13 AM, Andrea <[email protected]> wrote:

> Hi, I'm using HBase 0.94.12 on top of Hadoop 1.2.1, with one node for 
> ZooKeeper, one node for the NameNode/HMaster and three DataNode/RegionServer 
> nodes. All the machines are Amazon EC2 m2.xlarge instances.
> 
> I set the replication factor to two, so I expect that if I kill a 
> HRegionServer/DataNode (for example by killing all Java processes), all the 
> regions on that node will be recovered on one of the other two live 
> HRegionServers.
> 
> But when I kill the node, I lose the regions on it and, worst of all, if 
> that node hosts the .META. or -ROOT- table, the entire cluster stops working 
> altogether!
> 
> In case it is helpful, I loaded 500000 rows into the 'usertable' table with 
> the YCSB tool, and here are the outputs of status 'simple' and 
> hadoop fsck /hbase before and after killing the node:
> 
> before:
> 
> hbase(main):001:0> status 'simple'
> 3 live servers
>    ip-10-235-11-139:60020 1385632293907
>        requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=57, maxHeapMB=14983
>    ip-10-253-29-220:60020 1385632293955
>        requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=74, maxHeapMB=14983
>    ip-10-253-29-249:60020 1385632294162
>        requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=1935, maxHeapMB=14983
> 0 dead servers
> Aggregate load: 0, regions: 4
> 
> 
> FSCK started by ubuntu from /10.253.91.250 for path /hbase at Thu Nov 28 09:57:20 UTC 2013
> ..................................Status: HEALTHY
> Total size:    2122147158 B
> Total dirs:    31
> Total files:    34 (Files currently being written: 3)
> Total blocks (validated):    59 (avg. block size 35968595 B) (Total open file blocks (not validated): 2)
> Minimally replicated blocks:    59 (100.0 %)
> Over-replicated blocks:    0 (0.0 %)
> Under-replicated blocks:    0 (0.0 %)
> Mis-replicated blocks:        0 (0.0 %)
> Default replication factor:    2
> Average block replication:    2.0
> Corrupt blocks:        0
> Missing replicas:        0 (0.0 %)
> Number of data-nodes:        3
> Number of racks:        1
> FSCK ended at Thu Nov 28 09:57:20 UTC 2013 in 23 milliseconds
> 
> 
> The filesystem under path '/hbase' is HEALTHY
> 
> -------------------------------------------------------------------------
> -------------------------------------------------------------------------
> 
> and after (about 15 minutes later):
> 
> hbase(main):001:0> status 'simple'
> 2 live servers
>    ip-10-235-11-139:60020 1385632293907
>        requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=63, maxHeapMB=14983
>    ip-10-253-29-220:60020 1385632293955
>        requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=117, maxHeapMB=14983
> 1 dead servers
>    ip-10-253-29-249,60020,1385632294162
> Aggregate load: 0, regions: 3
> 
> 
> FSCK started by ubuntu from /10.253.91.250 for path /hbase at Thu Nov 28 10:13:29 UTC 2013
> ....................Status: HEALTHY
> Total size:    948168097 B
> Total dirs:    27
> Total files:    20 (Files currently being written: 3)
> Total blocks (validated):    29 (avg. block size 32695451 B) (Total open file blocks (not validated): 2)
> Minimally replicated blocks:    29 (100.0 %)
> Over-replicated blocks:    0 (0.0 %)
> Under-replicated blocks:    0 (0.0 %)
> Mis-replicated blocks:        0 (0.0 %)
> Default replication factor:    2
> Average block replication:    2.0
> Corrupt blocks:        0
> Missing replicas:        0 (0.0 %)
> Number of data-nodes:        2
> Number of racks:        1
> FSCK ended at Thu Nov 28 10:13:29 UTC 2013 in 7 milliseconds
> 
> 
> The filesystem under path '/hbase' is HEALTHY
> 
> 
> I hope I have been clear and have provided sufficient information; if not, 
> I can post the hbase-site.xml and hdfs-site.xml configuration.
> 
> Thank you for your help!
> 
> Andrea
> 
