I know it. It is not that case. Look this logs: Region server: 2011-08-26 07:53:52,743 DEBUG org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue: Create PriorityCompactionQueue 2011-08-26 07:53:52,743 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Runs every 10000000ms 2011-08-26 07:53:52,814 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Attempting connect to Master server at linux-kqm5:20000 2011-08-26 07:53:52,902 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Connected to master at linux-kqm5:20000
//told master that we are up 2011-08-26 07:53:52,931 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at linux-kqm5:20000 that we are up 2011-08-26 07:53:52,949 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us address to use. Was=linux-kqm6:20020, Now=linux-kqm6:20020 //init the service 2011-08-26 07:53:52,949 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: fs.default.name=hdfs://linux-kqm4:9000 2011-08-26 07:53:52,949 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=hdfs://linux-kqm4:9000/hbase 2011-08-26 07:53:52,989 INFO com.huawei.isap.ump.ha.client.ProxyRPCConfigurationUtil: Following server values are configured for the property:namenode.servers values:[hdfs://linux-kqm4:9000, hdfs://linux-kqm5:9000] 2011-08-26 07:53:53,061 INFO com.huawei.isap.ump.rpc.AbstractProxyFactory: DFSClient is in HA mode .Server Details:[linux-kqm4/192.200.10.11:9000, linux-kqm5/192.200.10.12:9000] 2011-08-26 07:53:53,069 WARN com.huawei.isap.ump.hdfs.nodestatus.NamenodeStatusListenerProvider: Unable to create NameNodeStatusListener with class name:[null] configured. 2011-08-26 07:53:53,077 INFO org.apache.hadoop.ipc.Client: The value of 'ipc.client.connect.max.socket.retries' is : 5 2011-08-26 07:53:53,077 INFO org.apache.hadoop.ipc.Client: The value of 'max.ping.retries.on.socket.timeout' is :1 2011-08-26 07:53:54,106 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 0 time(s). 2011-08-26 07:53:55,106 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 1 time(s). 2011-08-26 07:53:56,107 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 2 time(s). 2011-08-26 07:53:57,108 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 3 time(s). 2011-08-26 07:53:58,108 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 4 time(s). 2011-08-26 07:53:59,109 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 5 time(s). 2011-08-26 07:54:00,110 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 6 time(s). 2011-08-26 07:54:01,111 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 7 time(s). 2011-08-26 07:54:02,111 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 8 time(s). 2011-08-26 07:54:03,112 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: linux-kqm4/192.200.10.11:9000. Already tried 9 time(s). 2011-08-26 07:54:03,113 ERROR com.huawei.isap.ump.ha.client.HARPCConnector: Exception while getting the proxy for server: linux-kqm4/192.200.10.11:9000. Thread Name: regionserver20020, Reason: java.net.ConnectException: Connection refused 2011-08-26 07:54:03,132 INFO com.huawei.isap.ump.ha.client.HARPCConnector: New connection established with the configured server : ActiveServerConnectionInfo: Metadata:linux-kqm5/192.200.10.12:9000 Version:1678831321169382011-08-26 07:54:03,133 INFO org.apache.hadoop.hdfs.DFSClient: DFSClient has connected to the active namenode: linux-kqm5/192.200.10.12:9000 2011-08-26 07:54:03,137 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: logdir=hdfs://linux-kqm4:9000/hbase/.logs/linux-kqm6,20020,1314316432465 2011-08-26 07:54:03,329 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: HLog configuration: blocksize=256 MB, rollsize=243.2 MB, enabled=true, optionallogflushinternal=1000ms 2011-08-26 07:54:03,388 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs �C HDFS-200 2011-08-26 07:54:03,409 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: New hlog /hbase/.logs/linux-kqm6,20020,1314316432465/linux-kqm6%3A20020.1314316443329 2011-08-26 07:54:03,409 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Using getNumCurrentReplicas--HDFS-826 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: revision 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsUser 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsDate 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsUrl 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: date 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsRevision 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: user 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: hdfsVersion 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: url 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: MetricsString added: version 2011-08-26 07:54:03,418 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo 2011-08-26 07:54:03,419 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo 2011-08-26 07:54:03,419 INFO org.apache.hadoop.hbase.regionserver.metrics.RegionServerMetrics: Initialized 2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_REGION-linux-kqm6,20020,1314316432465, corePoolSize=32, maxPoolSize=32 2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_ROOT-linux-kqm6,20020,1314316432465, corePoolSize=1, maxPoolSize=1 2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_OPEN_META-linux-kqm6,20020,1314316432465, corePoolSize=1, maxPoolSize=1 2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_CLOSE_REGION-linux-kqm6,20020,1314316432465, corePoolSize=3, maxPoolSize=3 2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_CLOSE_ROOT-linux-kqm6,20020,1314316432465, corePoolSize=1, maxPoolSize=1 2011-08-26 07:54:03,422 DEBUG org.apache.hadoop.hbase.executor.ExecutorService: Starting executor service name=RS_CLOSE_META-linux-kqm6,20020,1314316432465, corePoolSize=1, maxPoolSize=1 2011-08-26 07:54:03,505 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 2011-08-26 07:54:03,563 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 20030 2011-08-26 07:54:03,564 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 20030 webServer.getConnectors()[0].getLocalPort() returned 20030 2011-08-26 07:54:03,564 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 20030 2011-08-26 07:54:03,564 INFO org.mortbay.log: jetty-6.1.26 2011-08-26 07:54:03,856 INFO org.mortbay.log: Started [email protected]:20030 //init rpc finished and started to provide services 2011-08-26 07:54:03,858 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 20020: starting 2011-08-26 07:54:03,858 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: starting -----邮件原件----- 发件人: Ramkrishna S Vasudevan [mailto:[email protected]] 发送时间: 2011年9月6日 21:08 收件人: [email protected] 主题: RE: A issue about region server startup. Hi Gao That is because of the timeout.period that is set to default value of 30 mins. But why was the initialization of the service slow ? May be any exception while trying to open the root region? If that is the case you see HBASE-4287 is one solution. Without HBASE-4287 surely it will take 30 mins. Regards Ram -----Original Message----- From: Gaojinchao [mailto:[email protected]] Sent: Tuesday, September 06, 2011 5:27 PM To: [email protected] Subject: A issue about region server startup. In my cluster , I found a issue that root was assigned after 30 minutes when the cluster start up, because region server startup slowly. Region server start up : Step 1: report to master it has started up. Step 2: init the service Hmaster start up: Step 1:Wait for the region server. Step 2:Assign the root and meta. Because the preceding section, if the region server initializes service slowly. Master may throw exception "Server is not running yet". The root can't be assigned and need wait for 30 minutes. Is there any good way to modify this bug? Who can give me some suggestions? The logs: 2011-08-26 07:53:26,065 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for -ROOT-,,0.70236052 so generated a random one; hri=-ROOT-,,0.70236052, src=, dest=linux-kqm6,20020,1314316432465; 1 (online=1, exclude=null) available servers 2011-08-26 07:53:26,065 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region -ROOT-,,0.70236052 to linux-kqm6,20020,1314316432465 2011-08-26 07:53:26,065 DEBUG org.apache.hadoop.hbase.master.ServerManager: New connection to linux-kqm6,20020,1314316432465 2011-08-26 07:53:33,251 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771) at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257) at $Proxy6.getProtocolVersion(Unknown Source) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393) at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444) at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation. getHRegionConnection(HConnectionManager.java:969) at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManag er.java:620) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.ja va:555) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja va:1043) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja va:858) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja va:838) at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManage r.java:1304) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:431) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388 ) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) 2011-08-26 07:53:33,254 WARN org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to serverName=linux-kqm6,20020,1314316432465, load=(requests=0, regions=0, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0 org.apache.hadoop.hbase.ipc.ServerNotRunningException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAcces sorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstruc torAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteE xceptionHandler.java:96) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation. getHRegionConnection(HConnectionManager.java:977) at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManag er.java:620) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.ja va:555) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja va:1043) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja va:858) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.ja va:838) at org.apache.hadoop.hbase.master.AssignmentManager.assignRoot(AssignmentManage r.java:1304) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:431) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388 ) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282) 2011-08-26 07:53:33,254 WARN org.apache.hadoop.hbase.master.AssignmentManager: Unable to find a viable location to assign region -ROOT-,,0.70236052 2011-08-26 07:55:41,226 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=linux-kqm5,20020,1314316540832, regionCount=0, userLoad=false 2011-08-26 07:56:13,037 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=linux-kqm4,20020,1314316645788, regionCount=0, userLoad=false 2011-08-26 08:23:36,528 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: -ROOT-,,0.70236052 state=OFFLINE, ts=1314316413254 2011-08-26 08:23:36,529 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been OFFLINE for too long, reassigning -ROOT-,,0.70236052 to a random server 2011-08-26 08:23:36,529 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=-ROOT-,,0.70236052 state=OFFLINE, ts=1314316413254
