[
https://issues.apache.org/jira/browse/HBASE-13194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356854#comment-14356854
]
zhangduo commented on HBASE-13194:
----------------------------------
The problem seems to be here.
{noformat}
2015-03-10 22:42:01,337 INFO [MASTER_SERVER_OPERATIONS-hemera:48616-0]
handler.ServerShutdownHandler(186): Mark regions in recovery for crashed server
hemera.apache.org,36185,1426027305449 before assignment; regions=[{ENCODED =>
969aa3ccca0a77c1d68f296b93b2d064, NAME =>
'hbase:namespace,,1426027307874.969aa3ccca0a77c1d68f296b93b2d064.', STARTKEY =>
'', ENDKEY => ''}]
2015-03-10 22:42:01,338 DEBUG [MASTER_SERVER_OPERATIONS-hemera:48616-0]
zookeeper.ZKUtil(745): master:48616-0x14c05d9d745000b, quorum=localhost:63193,
baseZNode=/hbase Unable to get data of znode
/hbase/recovering-regions/969aa3ccca0a77c1d68f296b93b2d064 because node does
not exist (not an error)
2015-03-10 22:42:01,351 INFO [hemera:48616.activeMasterManager]
master.AssignmentManager(416): Joined the cluster in 69ms, failover=true
2015-03-10 22:42:01,360 DEBUG [MASTER_SERVER_OPERATIONS-hemera:48616-0]
coordination.ZKSplitLogManagerCoordination(650): Marked
969aa3ccca0a77c1d68f296b93b2d064 as recovering from
hemera.apache.org,36185,1426027305449:
/hbase/recovering-regions/969aa3ccca0a77c1d68f296b93b2d064/hemera.apache.org,36185,1426027305449
2015-03-10 22:42:01,360 DEBUG [MASTER_SERVER_OPERATIONS-hemera:48616-0]
master.RegionStates(492): Adding to processed servers
hemera.apache.org,36185,1426027305449
2015-03-10 22:42:01,360 INFO [MASTER_SERVER_OPERATIONS-hemera:48616-0]
master.RegionStates(1074): Transition {969aa3ccca0a77c1d68f296b93b2d064
state=OPEN, ts=1426027321326, server=hemera.apache.org,36185,1426027305449} to
{969aa3ccca0a77c1d68f296b93b2d064 state=OFFLINE, ts=1426027321360,
server=hemera.apache.org,36185,1426027305449}
2015-03-10 22:42:01,361 INFO [MASTER_SERVER_OPERATIONS-hemera:48616-0]
master.RegionStateStore(207): Updating row
hbase:namespace,,1426027307874.969aa3ccca0a77c1d68f296b93b2d064. with
state=OFFLINE
2015-03-10 22:42:01,369 INFO [MASTER_SERVER_OPERATIONS-hemera:48616-0]
handler.ServerShutdownHandler(218): Reassigning 1 region(s) that
hemera.apache.org,36185,1426027305449 was carrying (and 0 regions(s) that were
opening on this server)
{noformat}
HMaster is also a RegionServer which carries the system table regions. When it
restarts, the system region's state apparently stays OPEN until we begin to
recover it. So we pass the isTableAssigned check in TableNamespaceManager.start,
but the following calls to isTableAvailableAndInitialized all fail, because we
have just begun to recover the region and its state has been transitioned to
OFFLINE. I am not sure why this happens; I think the state should not be OPEN
when HMaster starts. Will continue tomorrow.
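To make the suspected race concrete, here is a minimal, self-contained toy
model of what the log above suggests. This is NOT HBase code: the thread and
check names only mirror TableNamespaceManager.start / isTableAssigned /
isTableAvailableAndInitialized from the analysis above, and NamespaceInitRace
and its helpers are hypothetical.
{code:java}
// Toy model of the suspected race; names mirror the log but this is
// NOT HBase code. The hbase:namespace region state starts out OPEN
// because the crashed server (the master's own RegionServer) left a
// stale OPEN entry behind.
import java.util.concurrent.atomic.AtomicReference;

public class NamespaceInitRace {
  enum RegionState { OPEN, OFFLINE }

  static final AtomicReference<RegionState> nsRegion =
      new AtomicReference<>(RegionState.OPEN);

  public static void main(String[] args) throws Exception {
    // activeMasterManager thread: TableNamespaceManager.start().
    Thread masterInit = new Thread(() -> {
      // The isTableAssigned-style check passes on the stale OPEN state...
      if (nsRegion.get() == RegionState.OPEN) {
        sleep(50); // meanwhile the ServerShutdownHandler thread runs
        // ...but the later isTableAvailableAndInitialized-style check
        // fails, and since there is no retry it fails for good.
        if (nsRegion.get() != RegionState.OPEN) {
          System.out.println(
              "Table Namespace Manager not ready yet, try again later");
        }
      }
    });

    // MASTER_SERVER_OPERATIONS thread: ServerShutdownHandler transitions
    // the region OPEN -> OFFLINE to reassign it from the crashed server.
    Thread shutdownHandler = new Thread(() -> {
      sleep(10);
      nsRegion.set(RegionState.OFFLINE);
    });

    masterInit.start();
    shutdownHandler.start();
    masterInit.join();
    shutdownHandler.join();
  }

  static void sleep(long ms) {
    try { Thread.sleep(ms); } catch (InterruptedException ignored) {}
  }
}
{code}
Run as written, the model prints the same "not ready yet" message as the real
failure: the first check sees the stale OPEN state and passes, then the
shutdown-handler thread flips the region to OFFLINE before the second check.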
> TableNamespaceManager not ready cause MasterQuotaManager initialization fail
> -----------------------------------------------------------------------------
>
> Key: HBASE-13194
> URL: https://issues.apache.org/jira/browse/HBASE-13194
> Project: HBase
> Issue Type: Bug
> Components: master
> Reporter: zhangduo
>
> This causes TestNamespaceAuditor to fail.
> https://builds.apache.org/job/HBase-TRUNK/6237/testReport/junit/org.apache.hadoop.hbase.namespace/TestNamespaceAuditor/testRegionOperations/
> {noformat}
> 2015-03-10 22:42:01,372 ERROR [hemera:48616.activeMasterManager]
> namespace.NamespaceStateManager(204): Error while update namespace state.
> java.io.IOException: Table Namespace Manager not ready yet, try again later
> at
> org.apache.hadoop.hbase.master.HMaster.checkNamespaceManagerReady(HMaster.java:1912)
> at
> org.apache.hadoop.hbase.master.HMaster.listNamespaceDescriptors(HMaster.java:2131)
> at
> org.apache.hadoop.hbase.namespace.NamespaceStateManager.initialize(NamespaceStateManager.java:188)
> at
> org.apache.hadoop.hbase.namespace.NamespaceStateManager.start(NamespaceStateManager.java:63)
> at
> org.apache.hadoop.hbase.namespace.NamespaceAuditor.start(NamespaceAuditor.java:57)
> at
> org.apache.hadoop.hbase.quotas.MasterQuotaManager.start(MasterQuotaManager.java:88)
> at
> org.apache.hadoop.hbase.master.HMaster.initQuotaManager(HMaster.java:902)
> at
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:756)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:161)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1455)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
> The direct reason is that we do not have a retry here: if init fails once, it
> fails permanently. But I skimmed the code, and there seem to be no async init
> operations when calling finishActiveMasterInitialization, so this is very
> strange. Need to dig more. (See the retry sketch after this description.)
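The quoted description notes that a single init failure is permanent because
there is no retry. As a rough illustration only (not a patch: RetryingInit and
initWithRetry are hypothetical names, and a real change would sit around
HMaster.initQuotaManager in finishActiveMasterInitialization), a bounded retry
might look like this:
{code:java}
// Hypothetical bounded-retry helper, shown only to illustrate the
// "no retry" observation above; it is not actual HMaster code.
import java.io.IOException;

public class RetryingInit {
  interface Init { void run() throws IOException; }

  // Retry a failed init a bounded number of times with linear backoff
  // instead of failing once and staying failed.
  static void initWithRetry(Init init, int maxAttempts, long backoffMs)
      throws IOException, InterruptedException {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        init.run();
        return;
      } catch (IOException e) {
        last = e; // e.g. "Table Namespace Manager not ready yet"
        Thread.sleep(backoffMs * attempt);
      }
    }
    throw last;
  }

  public static void main(String[] args) throws Exception {
    // Demo: fail twice, then succeed on the third attempt.
    int[] calls = {0};
    initWithRetry(() -> {
      if (++calls[0] < 3) {
        throw new IOException(
            "Table Namespace Manager not ready yet, try again later");
      }
      System.out.println("init succeeded on attempt " + calls[0]);
    }, 5, 100);
  }
}
{code}
Whether a retry is the right fix, or the stale OPEN state should instead be
corrected before TableNamespaceManager.start runs, is exactly the open
question in the comment above.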