Okay I read the logs again and we're in a weird failure state. 1) Master comes up 2) Master schedules SCP for all RS 3) Master recovers meta 4) SCP for every server claims AM currently thinks 0 regions were assigned to each server. 5) Master successfully finishes WAL splitting from dead RS and works through prior split attempts that died? 6) WAL recovery from every RS says there are no edits for any region 7) No Assignments are scheduled out of the SCP because each believes there were no regions hosted on the server that's being processed 6) Master reports all SCP have completed successfully 7) Master times out at initializing
Could you link to a scan of meta? it'll include server names, table names, and region information, so I'm not sure if any of those are too sensitive? On Thu, Mar 14, 2019 at 11:36 AM Jean-Marc Spaggiari <[email protected]> wrote: > > Updated logs are there: https://pastebin.com/1UrTA8JS > > They really look like exactly the same as the previous version :-/ > > There is no warning, no error, nothing :( > > JMS > > Le jeu. 14 mars 2019 à 13:38, Sean Busbey <[email protected]> a écrit : > > > We still need to find out why hbase:namespace is not online. Did the > > logs complaining about being unable to assign regions not include any > > thing about the region(s) for the namespace table? > > > > Can you upload updated logs? > > > > If there's no mention of it then that sounds like we need an hbck2 > > command to output the current assignment state of a region. > > > > On Thu, Mar 14, 2019 at 11:57 AM Jean-Marc Spaggiari > > <[email protected]> wrote: > > > > > > I stopped all the region servers, started the master. It was complaining > > > about not being able to assign regions. Then started region servers, but > > > after 5 minutes got the same error :-/ > > > > > > 2019-03-14 12:46:38,586 ERROR [master/node2:60000:becomeActiveMaster] > > > master.HMaster: Failed to become active master > > > java.lang.IllegalStateException: Expected the service > > > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has > > FAILED > > > at > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345) > > > at > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291) > > > at > > > > > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1341) > > > at > > > > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1119) > > > at > > > > > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2347) > > > at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:595) > > > at java.lang.Thread.run(Thread.java:748) > > > Caused by: java.io.IOException: Timedout 300000ms waiting for namespace > > > table to be assigned and enabled: tableName=hbase:namespace, > > state=ENABLED > > > at > > > > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108) > > > at > > > > > org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63) > > > at > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226) > > > at > > > > > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1339) > > > ... 4 more > > > > > > Then stopped all, configured the maintenance mode, started all, get the > > > same error. I tried to bounce the RS within those 5 minutes without any > > > difference. I still get the same exception after 5 minutes: > > > 2019-03-14 12:55:35,167 ERROR [master/node2:60000:becomeActiveMaster] > > > master.HMaster: ***** ABORTING master node2.distparser.com > > ,60000,1552582220013: > > > Unhandled exception. Starting shutdown. ***** > > > java.lang.IllegalStateException: Expected the service > > > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has > > FAILED > > > at > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345) > > > at > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291) > > > at > > > > > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1341) > > > at > > > > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1119) > > > at > > > > > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2347) > > > at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:595) > > > at java.lang.Thread.run(Thread.java:748) > > > Caused by: java.io.IOException: Timedout 300000ms waiting for namespace > > > table to be assigned and enabled: tableName=hbase:namespace, > > state=ENABLED > > > at > > > > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108) > > > at > > > > > org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63) > > > at > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226) > > > at > > > > > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1339) > > > ... 4 more > > > > > > I validated the maintenance mode: > > > 2019-03-14 12:50:25,421 INFO [main] master.HMaster: Detected > > > hbase.master.maintenance_mode=true via configuration. > > > > > > And now removed it. > > > > > > What can I try next? > > > > > > I know I can always rename /hbase to /hbase_old and bulkload the HFiles > > > back to the table, that's not a big deal, but I'm curious to see if we > > can > > > get that working... > > > > > > > > > > > > BTW, when we try to access the Master webUI while it starts we are > > getting > > > an exception: > > > 2019-03-14 12:50:40,782 WARN [qtp196717412-78] servlet.ServletHandler: > > > /master-status > > > java.lang.NullPointerException > > > at > > > > > org.apache.hadoop.hbase.tmpl.master.MasterStatusTmplImpl.renderNoFlush(MasterStatusTmplImpl.java:326) > > > at > > > > > org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.renderNoFlush(MasterStatusTmpl.java:397) > > > at > > > > > org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.render(MasterStatusTmpl.java:388) > > > at > > > > > org.apache.hadoop.hbase.master.MasterStatusServlet.doGet(MasterStatusServlet.java:79) > > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) > > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > > > at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > > > at > > > > > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1780) > > > at > > > > > org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:112) > > > at > > > > > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > > > at > > > > > org.apache.hadoop.hbase.http.ClickjackingPreventionFilter.doFilter(ClickjackingPreventionFilter.java:48) > > > > > > > > > Le jeu. 14 mars 2019 à 08:07, Wellington Chevreuil < > > > [email protected]> a écrit : > > > > > > > Yeah, as I suspected in my previous comment, for this type of > > timeouts, the > > > > maintenance mode wouldn't give any help. It's weird that AM starts but > > > > apparently does nothing until the namespace 5 mins timeout is reached: > > > > ... > > > > 2019-03-12 20:53:45,942 INFO [master/node2:60000:becomeActiveMaster] > > > > assignment.AssignmentManager: Joined the cluster in 308msec > > > > 2019-03-12 20:54:45,725 INFO > > > > [ReadOnlyZKClient-latitude.distparser.com:2181@0x7ea9b2c0] > > > > zookeeper.ZooKeeper: Session: 0x16911bd542a02a2 closed > > > > 2019-03-12 20:54:45,725 INFO > > > > [ReadOnlyZKClient-latitude.distparser.com:2181@0x7ea9b2c0-EventThread] > > > > zookeeper.ClientCnxn: EventThread shut down for session: > > 0x16911bd542a02a2 > > > > 2019-03-12 20:58:46,603 ERROR [master/node2:60000:becomeActiveMaster] > > > > master.HMaster: Failed to become active master > > > > java.lang.IllegalStateException: Expected the service > > > > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has > > FAILED > > > > ... > > > > > > > > I would expect namespace region to be processed by this call > > > > < > > > > > > https://github.com/apache/hbase/blob/2.2.0-RC0/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1088 > > > > >, > > > > if > > > > namespace region is offline, as the timeout suggests. Also odd is that > > we > > > > don;'t see any logs suggesting offlined regions are getting assigned. > > Maybe > > > > all regions are already online on RSes? But then master should had > > figured > > > > that out. Have you already tried restart all RSes? That could kick some > > > > reassignments. > > > > > > > > Em qui, 14 de mar de 2019 às 02:21, Jean-Marc Spaggiari < > > > > [email protected]> escreveu: > > > > > > > > > Hi Wellington, > > > > > > > > > > Indeed, the META is now deployed. I found the namespace region > > encoded > > > > > name using hdfs dfs -ls -R /hbase/data/hbase/namespace and it gives > > > > > me 7f4a480f47f98300185d1ae2ff663295. But here again, HBCK doesn't > > want to > > > > > do anything because the master is initializing :( I tried with ad > > without > > > > > the maintenant flag and I get the same result. > > > > > > > > > > On HBCK2 side: PleaseHoldException: Master is initializing > > > > > On the master side, it just stoped after 5 minutes trying to assign > > > > > namespace :( > > > > > > > > > > JMS > > > > > > > > > > > > > > > Le mer. 13 mars 2019 à 12:04, Wellington Chevreuil < > > > > > [email protected]> a écrit : > > > > > > > > > > > "1588230740" would be the meta region name, not namespace. It seems > > > > meta > > > > > is > > > > > > already online, per below log: > > > > > > ... > > > > > > 2019-03-12 20:53:41,037 INFO > > [master/node2:60000:becomeActiveMaster] > > > > > > master.HMaster: hbase:meta {1588230740 state=OPEN, > > ts=1552438420570, > > > > > > server= > > > > > > node7.distparser.com,16020,1552421510124} > > > > > > ... > > > > > > > > > > > > The maintenance mode I suggested before was to have master doing > > > > minimum > > > > > > required stuff while attempting to getting meta/namespace online, > > but I > > > > > > guess it wouldn't be able to avoid such timeouts. Below message > > also > > > > > means > > > > > > AM could read meta table, giving another indication meta is fine: > > > > > > ... > > > > > > 2019-03-12 20:53:45,942 INFO > > [master/node2:60000:becomeActiveMaster] > > > > > > assignment.AssignmentManager: Joined the cluster in 308msec > > > > > > ... > > > > > > > > > > > > Now issue is namespace table. For some reason, AM is not able to > > kick > > > > APs > > > > > > before the 5 minutes timeout exceeds, and that's probably why > > namespace > > > > > > table never comes available: > > > > > > ... > > > > > > 2019-03-12 20:53:45,942 INFO > > [master/node2:60000:becomeActiveMaster] > > > > > > assignment.AssignmentManager: Joined the cluster in 308msec > > > > > > 2019-03-12 20:54:45,725 INFO > > > > > > [ReadOnlyZKClient-latitude.distparser.com:2181@0x7ea9b2c0] > > > > > > zookeeper.ZooKeeper: Session: 0x16911bd542a02a2 closed > > > > > > 2019-03-12 20:54:45,725 INFO > > > > > > [ReadOnlyZKClient-latitude.distparser.com:2181 > > @0x7ea9b2c0-EventThread] > > > > > > zookeeper.ClientCnxn: EventThread shut down for session: > > > > > 0x16911bd542a02a2 > > > > > > 2019-03-12 20:58:46,603 ERROR > > [master/node2:60000:becomeActiveMaster] > > > > > > master.HMaster: Failed to become active master > > > > > > java.lang.IllegalStateException: Expected the service > > > > > > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service > > has > > > > > FAILED > > > > > > ... > > > > > > > > > > > > You may be able to force namespace region coming online with hbck2 > > > > > assigns > > > > > > command. You would need to find out the namespace region name > > first, > > > > you > > > > > > can either scan meta table or check the region dir name in hdfs > > with > > > > > "hdfs > > > > > > dfs -ls -R /hbase | grep namespace", in order to pass it as a > > param for > > > > > > > > > > > > Em qua, 13 de mar de 2019 às 13:00, Jean-Marc Spaggiari < > > > > > > [email protected]> escreveu: > > > > > > > > > > > > > Hi Sean, > > > > > > > > > > > > > > I tried. I looked-up the region name for base:namespace like > > this: > > > > > > > > > > > > > > hdfs dfs -ls /hbase/data/hbase/meta/ > > > > > > > > > > > > > > And found the region to be 1588230740. > > > > > > > > > > > > > > The master dies after 5 minutes, so I start the master, wait 2 > > > > minutes > > > > > to > > > > > > > be sure it's up, and run the following command: > > > > > > > > > > > > > > bin/hbase hbck -j > > > > > > > > > > > > > > > > > > > > > > > > test/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar > > > > > > > assigns 1588230740 > > > > > > > > > > > > > > But HBCK2 doesn't like it: > > > > > > > 08:57:35.273 [main] INFO > > > > > > > org.apache.hadoop.hbase.client.RpcRetryingCallerImpl - Call > > > > exception, > > > > > > > tries=9, retries=16, started=29322 ms ago, cancelled=false, > > > > > > > msg=org.apache.hadoop.hbase.PleaseHoldException: Master is > > > > initializing > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:3057) > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.MasterRpcServices.getClusterStatus(MasterRpcServices.java:942) > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) > > > > > > > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413) > > > > > > > at > > org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130) > > > > > > > at > > > > > > > > > > > > > > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > > > > > > > at > > > > > > > > > > > > > > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > > > > > > > > > > > > > > > > > > > > > It keeps retrying and after 16 times it stopped saying the > > master is > > > > > not > > > > > > > initialized. > > > > > > > > > > > > > > On the WebUI I can see that there is a single region assigned, > > the > > > > META > > > > > > > region. > > > > > > > > > > > > > > Also, here is the HDFS structure of my META table. Sounds like > > some > > > > > parts > > > > > > > got lost in the process (The info content). > > > > > > > > > > > > > > hbase@node2:~/hbase-2.2.0$ hdfs dfs -ls -R > > /hbase/data/hbase/meta/ > > > > > > > drwxr-xr-x - hbase supergroup 0 2019-03-12 15:42 > > > > > > > /hbase/data/hbase/meta/.tabledesc > > > > > > > -rw-r--r-- 3 hbase supergroup 1447 2019-03-12 15:42 > > > > > > > /hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001 > > > > > > > drwxr-xr-x - hbase supergroup 0 2019-03-12 15:42 > > > > > > > /hbase/data/hbase/meta/.tmp > > > > > > > drwxr-xr-x - hbase supergroup 0 2019-03-12 15:49 > > > > > > > /hbase/data/hbase/meta/1588230740 > > > > > > > -rw-r--r-- 3 hbase supergroup 32 2019-03-12 15:40 > > > > > > > /hbase/data/hbase/meta/1588230740/.regioninfo > > > > > > > drwxr-xr-x - hbase supergroup 0 2019-03-12 15:40 > > > > > > > /hbase/data/hbase/meta/1588230740/info > > > > > > > drwxr-xr-x - hbase supergroup 0 2019-03-12 15:40 > > > > > > > /hbase/data/hbase/meta/1588230740/recovered.edits > > > > > > > -rw-r--r-- 3 hbase supergroup 0 2019-03-12 15:40 > > > > > > > /hbase/data/hbase/meta/1588230740/recovered.edits/2.seqid > > > > > > > drwxr-xr-x - hbase supergroup 0 2019-03-12 15:42 > > > > > > > /hbase/data/hbase/meta/1588230740/rep_barrier > > > > > > > drwxr-xr-x - hbase supergroup 0 2019-03-12 15:47 > > > > > > > /hbase/data/hbase/meta/1588230740/table > > > > > > > -rw-r--r-- 3 hbase supergroup 5454 2019-03-12 15:47 > > > > > > > > > > > > > > /hbase/data/hbase/meta/1588230740/table/b65e8774ff284e77bf22641de36110cc > > > > > > > > > > > > > > What will be the next best step? > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > JMS > > > > > > > > > > > > > > > > > > > > > > > > > > > > Le mer. 13 mars 2019 à 08:45, Sean Busbey <[email protected]> a > > > > écrit > > > > > : > > > > > > > > > > > > > > > Okay so master thinks hbase:namespace is already enabled, but > > no RS > > > > > > > > believes it should be hosting the regions. > > > > > > > > > > > > > > > > Can you find the region name for the hbase:namespace region and > > > > issue > > > > > > > > an hbck2 assigns command for it? > > > > > > > > > > > > > > > > On Tue, Mar 12, 2019 at 8:26 PM Jean-Marc Spaggiari > > > > > > > > <[email protected]> wrote: > > > > > > > > > > > > > > > > > > It doesn't say that much :( > > > > > > > > > > > > > > > > > > hbase@node2:~/hbase-2.2.0$ cat > > logs/hbase-hbase-master-node2.log > > > > > | > > > > > > > > grep -i > > > > > > > > > namespace > > > > > > > > > Caused by: java.io.IOException: Timedout 300000ms waiting for > > > > > > namespace > > > > > > > > > table to be assigned and enabled: tableName=hbase:namespace, > > > > > > > > state=ENABLED > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108) > > > > > > > > > Caused by: java.io.IOException: Timedout 300000ms waiting for > > > > > > namespace > > > > > > > > > table to be assigned and enabled: tableName=hbase:namespace, > > > > > > > > state=ENABLED > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108) > > > > > > > > > > > > > > > > > > I cleared the logs before restarting the instance. That all > > what > > > > it > > > > > > > says > > > > > > > > > about namespace. > > > > > > > > > > > > > > > > > > Full logs are available there: https://pastebin.com/9j2Rzdcg > > > > > > > > > > > > > > > > > > Le mar. 12 mars 2019 à 20:47, Sean Busbey < > > [email protected] > > > > > > > > > > a > > > > > > > > écrit : > > > > > > > > > > > > > > > > > > > okay so the master spent ~5 minutes waiting to see if it > > could > > > > > get > > > > > > > the > > > > > > > > > > namespace table working. when it couldn't it aborted. > > > > > > > > > > > > > > > > > > > > can you look back over that 5 minutes and see what the > > master > > > > had > > > > > > to > > > > > > > > > > say about the namespace table? did the master think some > > > > > particular > > > > > > > > > > server should have it open already? was it waiting for > > someone > > > > to > > > > > > > > > > finish opening or closing it? > > > > > > > > > > > > > > > > > > > > On Tue, Mar 12, 2019 at 6:39 PM Jean-Marc Spaggiari > > > > > > > > > > <[email protected]> wrote: > > > > > > > > > > > > > > > > > > > > > > Le mar. 12 mars 2019 à 19:25, Sean Busbey < > > [email protected] > > > > > > > > > > a > > > > > > > > écrit : > > > > > > > > > > > > > > > > > > > > > > > your command above points at the wrong jar from the > > hbck2 > > > > > repo. > > > > > > > > it's > > > > > > > > > > > > > > > > > > > > > > > pointing at the one where you need to manually assemble > > all > > > > the > > > > > > > > > > > > dependencies it has. > > > > > > > > > > > > > > > > > > > > > > > > You want the one that does not say "original" in the > > name. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ha!!! That's why! Way easier ;) > > > > > > > > > > > > > > > > > > > > > > indeed, this works even without removing all environment > > > > > > variables: > > > > > > > > > > > bin/hbase hbck -j > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > test/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Can you confirm it's the one in > > > > > > > > > > > > > > the bin tarball? what does the version command > > output? > > > > > What > > > > > > > > does > > > > > > > > > > the > > > > > > > > > > > > > > mapredcp command output? What does the cli help > > for the > > > > > > hbase > > > > > > > > > > command > > > > > > > > > > > > > > show? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > hbase@node2:~/hbase-2.2.0$ hbase mapredcp > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > /home/hbase/hbase-2.2.0/bin/../lib/shaded-clients/hbase-shaded-mapreduce-2.2.0.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/audience-annotations-0.5.0.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/commons-logging-1.2.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/findbugs-annotations-1.3.9-1.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/htrace-core4-4.2.0-incubating.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/log4j-1.2.17.jar:/home/hbase/hbase-2.2.0/bin/../lib/client-facing-thirdparty/slf4j-api-1.7.25.jar > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > that looks great now. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Once you correct the hbck2 jar above I think you'll be > > good > > > > > for > > > > > > > > > > invoking > > > > > > > > > > > > HBCK2. > > > > > > > > > > > > > > > > > > > > > > > > Next, what does the initializing master say it's > > doing? It > > > > > > should > > > > > > > > be > > > > > > > > > > > > on the master UI near the bottom. If it hasn't made > > > > progress > > > > > > > since > > > > > > > > > > > > your last update it'll be waiting for the > > hbase:namespace > > > > > > table. > > > > > > > > If it > > > > > > > > > > > > is, find the region and see what the last few messages > > in > > > > the > > > > > > > > master > > > > > > > > > > > > log are about that region. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The master died some times ago. It dies after 5 minutes. > > > > > > > > > > > > > > > > > > > > > > 2019-03-12 19:35:58,568 ERROR > > > > > > > [master/node2:60000:becomeActiveMaster] > > > > > > > > > > > master.HMaster: Failed to become active master > > > > > > > > > > > java.lang.IllegalStateException: Expected the service > > > > > > > > > > > ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the > > > > > service > > > > > > > has > > > > > > > > > > FAILED > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345) > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291) > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1341) > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1119) > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2347) > > > > > > > > > > > at > > > > > > > > > > > > org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:595) > > > > > > > > > > > at java.lang.Thread.run(Thread.java:748) > > > > > > > > > > > Caused by: java.io.IOException: Timedout 300000ms > > waiting for > > > > > > > > namespace > > > > > > > > > > > table to be assigned and enabled: > > tableName=hbase:namespace, > > > > > > > > > > state=ENABLED > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:108) > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63) > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226) > > > > > > > > > > > at > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1339) > > > > > > > > > > > ... 4 more > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I just restarted it. I can see the meta table being > > > > assigned. I > > > > > > can > > > > > > > > > > access > > > > > > > > > > > the WebUI and I don't see any initializing information. > > On > > > > the > > > > > > > table > > > > > > > > > > > section, I don't see anything, in any tab. However, when > > > > doing > > > > > > > > "list" on > > > > > > > > > > > the shell, I can see my tables. But I can not scan them. > > > > > Scanning > > > > > > > any > > > > > > > > > > table > > > > > > > > > > > gives : > > > > > > > > > > > hbase(main):001:0> scan 'hbase:namespace' > > > > > > > > > > > ROW COLUMN+CELL > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ERROR: Unknown table hbase:namespace! > > > > > > > > > > > > > > > > > > > > > > For usage try 'help "scan"' > > > > > > > > > > > > > > > > > > > > > > Took 1.0395 seconds > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > JMS > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Sean > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
