On 21.10.2013 12:17, Samir Ahmic wrote:
Hi, Boris

Did you check the RS logs? There should be an exception explaining why the assignment failed. Can you paste that exception?

Cheers :)


On Mon, Oct 21, 2013 at 9:53 AM, Boris Emelyanov wrote:

    > Boris, what does hbck say?
    >
    > We have had this issue a couple times before. To fix it I had to stop
    > the cluster, run offline meta repair tool, delete zk-store on each zk
    > quorum node.
    > Offline Meta repair tool will not work if there are inconsistencies in
    > HBase - you better try hbase hbck -fixAll first.
    >
    > Best regards,
    > Vladimir Rodionov
    > Principal Platform Engineer
    > Carrier IQ, www.carrieriq.com
    > e-mail: vrodionov@...

    Hbck says "0 inconsistencies detected".
    I stopped the HBase cluster, deleted the zk-database on all quorum nodes, ran
    "hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair",
    and got "INFO util.HBaseFsck: Success! .META. table rebuilt.".
    After that, the cluster continued crashing during auto-loadbalancing.


--
Best regards,

    Boris Emelyanov.


Hi, Samir! Thank you for your answers!

Actually, as far as I can tell, the assignment did not fail.
Here are my logs (the clocks may be slightly out of sync):

on master:

2013-10-21 12:27:51,541 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b. destination server is testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:51,541 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for region mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.; plan=hri=mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b., src=, dest=testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:51,541 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b. to testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:51,576 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
2013-10-21 12:27:51,577 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state : mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b. state=PENDING_OPEN, ts=1382344071576, server=testhadoop-102.example.com,60020,1382339032897 .. Cannot transit it to OFFLINE.
java.lang.IllegalStateException: Unexpected state : mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b. state=PENDING_OPEN, ts=1382344071576, server=testhadoop-102.example.com,60020,1382339032897 .. Cannot transit it to OFFLINE.

on affected regionserver:

2013-10-21 12:27:52,561 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received close region: mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.. Version of ZK closing node:0
2013-10-21 12:27:52,562 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
2013-10-21 12:27:52,563 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.: disabling compactions & flushes
2013-10-21 12:27:52,563 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
2013-10-21 12:27:52,564 INFO org.apache.hadoop.hbase.regionserver.Store: Closed a
2013-10-21 12:27:52,564 INFO org.apache.hadoop.hbase.regionserver.Store: Closed b
2013-10-21 12:27:52,565 INFO org.apache.hadoop.hbase.regionserver.Store: Closed c
2013-10-21 12:27:52,566 INFO org.apache.hadoop.hbase.regionserver.Store: Closed d
2013-10-21 12:27:52,566 INFO org.apache.hadoop.hbase.regionserver.Store: Closed e
2013-10-21 12:27:52,567 INFO org.apache.hadoop.hbase.regionserver.Store: Closed f
2013-10-21 12:27:52,567 INFO org.apache.hadoop.hbase.regionserver.Store: Closed g
2013-10-21 12:27:52,567 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
2013-10-21 12:27:52,567 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x241bc934d55039b Attempting to transition node 45e518b477eeac50872de5a73d74f05b from M_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
2013-10-21 12:27:52,600 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x241bc934d55039b Successfully transitioned node 45e518b477eeac50872de5a73d74f05b from M_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED
2013-10-21 12:27:52,600 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: set region closed state in zk successfully for region mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b. sn name: testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:52,600 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Closed region mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.
.
.
.
2013-10-21 12:27:53,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b.', STARTKEY => 'fd27d27d27d27d27d27d27d27d27d27d27d27d18', ENDKEY => '', ENCODED => 45e518b477eeac50872de5a73d74f05b,}, server: testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:53,626 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened mytable,fd27d27d27d27d27d27d27d27d27d27d27d27d18,1380545986996.45e518b477eeac50872de5a73d74f05b. on server:testhadoop-102.example.com,60020,1382339032897
2013-10-21 12:27:53,826 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: No master found; retry
2013-10-21 12:27:56,840 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: No master found; retry
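
To read the master and regionserver logs as one sequence of events for the affected region, something like the following might help. It is only a minimal Python sketch, assuming the log4j-style prefix seen in the logs above (timestamp, level, logger name, colon, message); the names `parse_entries` and `timeline` are made up for illustration and are not part of HBase. Since the clocks on the two hosts may be slightly out of sync, the ordering across hosts is approximate.

```python
import re
from datetime import datetime

# Prefix assumed from the logs above:
#   "2013-10-21 12:27:51,541 DEBUG some.logger.Name: message"
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (?=[A-Z]+ )")
ENTRY = re.compile(r"(\S+ \S+) ([A-Z]+) (\S+): (.*)", re.S)


def parse_entries(text, source):
    """Split raw log text into (timestamp, source, level, logger, message) tuples.

    Entries are located by their timestamp prefix, so several entries pasted
    onto one line are still separated. Lines without a timestamp (for example
    an exception line) stay attached to the entry before them.
    """
    starts = [m.start() for m in STAMP.finditer(text)]
    entries = []
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        m = ENTRY.match(text[begin:end].strip())
        if m is None:
            continue  # skip anything that does not look like a log entry
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        entries.append((when, source, m.group(2), m.group(3), m.group(4)))
    return entries


def timeline(master_text, rs_text, region):
    """Merge both logs and keep only entries whose message mentions `region`."""
    merged = parse_entries(master_text, "master") + parse_entries(rs_text, "rs")
    return sorted((e for e in merged if region in e[4]), key=lambda e: e[0])
```

Calling `timeline(master_log, rs_log, "45e518b477eeac50872de5a73d74f05b")` with the two log texts as strings would return the entries for that region in timestamp order, each tagged with the host role it came from.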


--
Best regards,

Boris Emelyanov.
