Hello everybody,
I've run into a strange problem. We run a cluster with 6 region servers (RS), and suddenly the client application started reporting "region not online" errors. In the web console all region servers appeared to be up. I ran hbck and got strange results:

Number of Tables: 2
Number of live region servers: 6
Number of dead region servers: 12

The cluster was in an inconsistent state. Running status 'detailed' in the hbase shell gave me the list of dead machines:

12 dead servers
    search-hadoop-eu006.v300.gmx.net,60020,1305025929461
    search-hadoop-eu002.v300.gmx.net,60020,1305019508570
    search-hadoop-eu004.v300.gmx.net,60020,1305019551236
    search-hadoop-eu003.v300.gmx.net,60020,1305025688666
    search-hadoop-eu005.v300.gmx.net,60020,1305025841017
    search-hadoop-eu006.v300.gmx.net,60020,1306156842070
    search-hadoop-eu005.v300.gmx.net,60020,1305019568146
    search-hadoop-eu001.v300.gmx.net,60020,1305025543786
    search-hadoop-eu004.v300.gmx.net,60020,1305025761173
    search-hadoop-eu002.v300.gmx.net,60020,1305025611163
    search-hadoop-eu006.v300.gmx.net,60020,1305019572576
    search-hadoop-eu003.v300.gmx.net,60020,1305019547053


It appears that all live region servers are also listed as dead. I tried hbck -fix and the cluster is now in an OK state, but it still reports the same 12 machines as dead.
I've checked the logs but found nothing obvious. Any ideas? We use CDH3u0.
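
To make the dead-server list easier to read, here is a minimal sketch that splits each entry into host, port and start code and groups the start codes per host. It assumes the usual hostname,port,startcode layout of these names and that the start code is the server's start time in epoch milliseconds; the helper names (parse_server_name, group_by_host, to_utc) are made up for this illustration and are not part of HBase.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Dead-server entries copied from the status 'detailed' output above.
DEAD_SERVERS = """
search-hadoop-eu006.v300.gmx.net,60020,1305025929461
search-hadoop-eu002.v300.gmx.net,60020,1305019508570
search-hadoop-eu004.v300.gmx.net,60020,1305019551236
search-hadoop-eu003.v300.gmx.net,60020,1305025688666
search-hadoop-eu005.v300.gmx.net,60020,1305025841017
search-hadoop-eu006.v300.gmx.net,60020,1306156842070
search-hadoop-eu005.v300.gmx.net,60020,1305019568146
search-hadoop-eu001.v300.gmx.net,60020,1305025543786
search-hadoop-eu004.v300.gmx.net,60020,1305025761173
search-hadoop-eu002.v300.gmx.net,60020,1305025611163
search-hadoop-eu006.v300.gmx.net,60020,1305019572576
search-hadoop-eu003.v300.gmx.net,60020,1305019547053
"""


def parse_server_name(entry):
    """Split 'host,port,startcode' into (host, port, startcode)."""
    host, port, start_code = entry.strip().split(",")
    return host, int(port), int(start_code)


def group_by_host(entries):
    """Map each host to the sorted list of start codes listed for it."""
    grouped = defaultdict(list)
    for entry in entries:
        host, _port, start_code = parse_server_name(entry)
        grouped[host].append(start_code)
    return {host: sorted(codes) for host, codes in grouped.items()}


def to_utc(start_code_ms):
    """Assumption: the start code is epoch milliseconds."""
    return datetime.fromtimestamp(start_code_ms / 1000, tz=timezone.utc)


if __name__ == "__main__":
    for host, codes in sorted(group_by_host(DEAD_SERVERS.split()).items()):
        print(host)
        for code in codes:
            print("   ", code, to_utc(code).isoformat())
```

Comparing these start codes with the ones shown for the live servers would tell whether a dead entry is really the same instance as a live one or just the same host and port with a different start code.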


Thanks
Daniel


