On 21.10.2013 12:17, Samir Ahmic wrote:
Hi, Boris
Did you check the RS logs? There should be an exception explaining why
the assignment failed. Can you paste that exception?
Cheers :)
On Mon, Oct 21, 2013 at 9:53 AM, Boris Emelyanov wrote:
>Boris, what does hbck say?
>
>We have had this issue a couple times before. To fix it I had to stop
>the cluster, run offline meta repair tool,
>delete zk-store on each zk quorum node
>Offline Meta repair tool will not work if there are inconsistencies in
>HBase - you better try hbase hbck
>-fixAll first.
>
>Best regards,
>Vladimir Rodionov
>Principal Platform Engineer
>Carrier IQ, www.carrieriq.com
>e-mail: vrodionov@...
Hbck says "0 inconsistencies detected".
I stopped the HBase cluster, deleted the zk-database on all quorum nodes, ran "hbase
org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair",
and got "INFO util.HBaseFsck: Success! .META. table rebuilt.".
After that, the cluster kept crashing during automatic load balancing.
--
Best regards,
Boris Emelyanov.
Hi, Samir! Thank you for your answers!
Actually, as far as I can tell, the assignment did not fail.
Here are my logs (the clocks on the two hosts may be slightly out of sync):
on master:
2013-10-21 12:27:51,541 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan
for
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
destination server is testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:51,541 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing
plan for region
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.;
plan=hri=mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.,
src=, dest=testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:51,541 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Assigning region
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
to testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:51,576 FATAL org.apache.hadoop.hbase.master.HMaster:
Master server abort: loaded coprocessors are: []
2013-10-21 12:27:51,577 FATAL org.apache.hadoop.hbase.master.HMaster:
Unexpected state :
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
state=PENDING_OPEN, ts=1382344071576,
server=testhadoop-102.example.com,60020,1382339032897 .. Cannot transit
it to OFFLINE.
java.lang.IllegalStateException: Unexpected state :
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
state=PENDING_OPEN, ts=1382344071576,
server=testhadoop-102.example.com,60020,1382339032897 .. Cannot transit
it to OFFLINE.
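
For anyone comparing the two logs: the ts= value in the "Unexpected state" message looks like a Unix timestamp in milliseconds (that is an assumption on my part, not something the log states). A minimal Python sketch that pulls the state, timestamp and server out of such a message follows; the helper name parse_region_state and the regular expression are only illustrative and are not part of HBase.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the "state=..., ts=..., server=..." part of the message above.
STATE_RE = re.compile(
    r"state=(?P<state>\w+),\s*ts=(?P<ts>\d+),\s*server=(?P<server>\S+)"
)

def parse_region_state(message):
    """Return (state, utc_time, server) from a region-state message, or None."""
    # The log is line-wrapped in the email, so collapse all whitespace first.
    flat = " ".join(message.split())
    m = STATE_RE.search(flat)
    if m is None:
        return None
    # Assumption: ts is milliseconds since the Unix epoch.
    secs, millis = divmod(int(m.group("ts")), 1000)
    when = datetime.fromtimestamp(secs, tz=timezone.utc) + timedelta(milliseconds=millis)
    return m.group("state"), when, m.group("server")

sample = """Unexpected state :
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
state=PENDING_OPEN, ts=1382344071576,
server=testhadoop-102.example.com,60020,1382339032897 .. Cannot transit
it to OFFLINE."""

state, when, server = parse_region_state(sample)
print(state, when.isoformat(), server)
```

Under that assumption the timestamp decodes to 08:27:51.576 UTC, which would line up with the 12:27:51,576 FATAL entry if the master's clock runs at UTC+4. In other words, the region seems to have been put into PENDING_OPEN at the same moment the master aborted.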
on affected regionserver:
2013-10-21 12:27:52,561 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Received close
region:
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b..
Version of ZK closing node:0
2013-10-21 12:27:52,562 DEBUG
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler:
Processing close of
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
2013-10-21 12:27:52,563 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Closing
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.:
disabling compactions & flushes
2013-10-21 12:27:52,563 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for
region
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
2013-10-21 12:27:52,564 INFO org.apache.hadoop.hbase.regionserver.Store:
Closed a
2013-10-21 12:27:52,564 INFO org.apache.hadoop.hbase.regionserver.Store:
Closed b
2013-10-21 12:27:52,565 INFO org.apache.hadoop.hbase.regionserver.Store:
Closed c
2013-10-21 12:27:52,566 INFO org.apache.hadoop.hbase.regionserver.Store:
Closed d
2013-10-21 12:27:52,566 INFO org.apache.hadoop.hbase.regionserver.Store:
Closed e
2013-10-21 12:27:52,567 INFO org.apache.hadoop.hbase.regionserver.Store:
Closed f
2013-10-21 12:27:52,567 INFO org.apache.hadoop.hbase.regionserver.Store:
Closed g
2013-10-21 12:27:52,567 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
2013-10-21 12:27:52,567 DEBUG
org.apache.hadoop.hbase.zookeeper.ZKAssign:
regionserver:60020-0x241bc934d55039b Attempting to transition node
45e518b477eeac50872de5a73d74f05b from M_ZK_REGION_CLOSING to
RS_ZK_REGION_CLOSED
2013-10-21 12:27:52,600 DEBUG
org.apache.hadoop.hbase.zookeeper.ZKAssign:
regionserver:60020-0x241bc934d55039b Successfully transitioned node
45e518b477eeac50872de5a73d74f05b from M_ZK_REGION_CLOSING to
RS_ZK_REGION_CLOSED
2013-10-21 12:27:52,600 DEBUG
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: set
region closed state in zk successfully for region
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
sn name: testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:52,600 DEBUG
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed
region
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
.
.
.
2013-10-21 12:27:53,626 DEBUG
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region
transitioned to opened in zookeeper: {NAME =>
'mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.',
STARTKEY => 'fd27d27d27d27d27d27d27d27d27d27d27d27d18', ENDKEY => '',
ENCODED => 45e518b477eeac50872de5a73d74f05b,}, server:
testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:53,626 DEBUG
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened
mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
on server:testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:53,826 DEBUG
org.apache.hadoop.hbase.regionserver.HRegionServer: No master found; retry
2013-10-21 12:27:56,840 DEBUG
org.apache.hadoop.hbase.regionserver.HRegionServer: No master found; retry
--
Best regards,
Boris Emelyanov.
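
A note for readers of the logs above: the mail client wrapped every log entry across several lines. A small Python sketch for stitching such wrapped entries back into one line per record is given here; it assumes each record begins with a timestamp of the form "2013-10-21 12:27:51,541", and the function name unwrap_log is hypothetical. It joins continuation lines with a single space, so a token that was split in the middle by the wrapping would still need to be repaired by hand.

```python
import re

# A record starts with a timestamp such as "2013-10-21 12:27:51,541".
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\b")

def unwrap_log(text):
    """Join line-wrapped log entries into one string per record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the lone "." lines used to mark elided output.
        if not line or line == ".":
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation of the previous record.
            records[-1] += " " + line
    return records

wrapped = """2013-10-21 12:27:53,826 DEBUG
org.apache.hadoop.hbase.regionserver.HRegionServer: No master found; retry
.
2013-10-21 12:27:56,840 DEBUG
org.apache.hadoop.hbase.regionserver.HRegionServer: No master found; retry"""

records = unwrap_log(wrapped)
for record in records:
    print(record)
```

With the entries on one line each, the master and regionserver logs can be sorted by timestamp and read side by side, keeping in mind that the two clocks may differ slightly.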