I don't know how else to describe my situation; I just want to restart
HBase successfully and get my data back.
1. bin/start-hbase.sh reports everything as running.
2. bin/stop-hbase.sh can't stop it normally.
3. Sometimes the region servers can't be seen. After killing the master
process and rerunning bin/start-hbase.sh, everything looks OK, but the
master doesn't work.
4. Hadoop HDFS runs fine, and on port 50070 I can see the /hbase folders.
5. Here is my hbase-site.xml. test1 and s1.idfs.cn are the same IP,
192.168.1.122. At first I put s1.idfs.cn in hbase.zookeeper.quorum, but
only the hostname test1 was recognized; s1.idfs.cn comes from my DNS.
(A quick name-resolution check is sketched at the end of this item.)
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://s1.idfs.cn:9000/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description></description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://s1.idfs.cn:9000</value>
    <description></description>
  </property>
  <property>
    <name>hbase.zookeeper.dns.nameserver</name>
    <value>192.168.1.122</value>
    <description></description>
  </property>
  <property>
    <name>hbase.regionserver.dns.interface</name>
    <value>192.168.1.122</value>
    <description></description>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2222</value>
    <description>Property from ZooKeeper's config zoo.cfg.
    The port at which the clients will connect.</description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>test1</value>
  </property>
</configuration>
My regionservers file is:
s1.idfs.cn
s2.idfs.cn
HBase ran fine the first time; I created tables and inserted data.
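In case it helps, this is the kind of check I can run on each node to
confirm that both names resolve to the same address (just a sketch; the
hostnames and addresses are from my own setup):

  hostname                         # the name HBase/ZooKeeper will report for this node
  getent hosts test1 s1.idfs.cn    # both should resolve to 192.168.1.122
  grep -E 'test1|idfs' /etc/hosts  # watch for a stale 127.0.0.1 / 127.0.1.1 mapping

If test1 happened to map to a loopback address on some node, I guess the
addresses registered in ZooKeeper could be unreachable from the other
machines, but I'm not sure that's what is happening here.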
6. I used bin/zkCli.sh -server 192.168.1.122:2222 to look at /hbase in
ZooKeeper; maybe there is some useful information for you here. Thanks.
(A possible cleanup is sketched after the output.)
[zk: 192.168.1.122:2222(CONNECTED) 0] ls /
[hbase, zookeeper]
[zk: 192.168.1.122:2222(CONNECTED) 16] ls /hbase
[safe-mode, root-region-server, rs, master, shutdown]
So hbase is there under /.
[zk: 192.168.1.122:2222(CONNECTED) 10] get /hbase/master
192.168.1.122:60000
cZxid = 0x1c
ctime = Thu Jun 24 14:39:21 CST 2010
mZxid = 0x1c
mtime = Thu Jun 24 14:39:21 CST 2010
pZxid = 0x1c
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x12968ae99ca0000
dataLength = 19
numChildren = 0
That's my master, 192.168.1.122.
[zk: 192.168.1.122:2222(CONNECTED) 14] get /hbase/root-region-server
192.168.1.123:60020
cZxid = 0xa
ctime = Thu Jun 24 10:38:00 CST 2010
mZxid = 0x25
mtime = Thu Jun 24 14:39:31 CST 2010
pZxid = 0xa
cversion = 0
dataVersion = 1
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 19
numChildren = 0
I configured two region servers, but only one appears here.
[zk: 192.168.1.122:2222(CONNECTED) 11] get /hbase/shutdown
up
cZxid = 0x1d
ctime = Thu Jun 24 14:39:21 CST 2010
mZxid = 0x1d
mtime = Thu Jun 24 14:39:21 CST 2010
pZxid = 0x1d
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 2
numChildren = 0
[zk: 192.168.1.122:2222(CONNECTED) 12] get /hbase/rs
cZxid = 0x6
ctime = Thu Jun 24 10:37:28 CST 2010
mZxid = 0x6
mtime = Thu Jun 24 10:37:28 CST 2010
pZxid = 0x21
cversion = 6
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 2
[zk: 192.168.1.122:2222(CONNECTED) 19] ls /hbase/safe-mode
[]
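The root-region-server znode above still points at 192.168.1.123:60020,
which matches the address the master keeps failing to reach in the error
below. Would it be safe to remove just that znode while HBase is stopped,
and then start again? Something like this (only my guess, I'm not sure it
is the right fix):

  (with HBase stopped, inside bin/zkCli.sh -server 192.168.1.122:2222)
  delete /hbase/root-region-server

root-region-server has no children, so a plain delete should work; or,
more drastically, remove everything under /hbase so the master recreates
it on the next start. Is that safe for the table data in HDFS?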
2010/6/24 梁景明 <[email protected]>
> Some more details: when I kill the HBase processes and restart them,
> the region server UI on 60030 is reachable and looks like it started OK,
> but the master UI on 60010 shows the error below, and the /hbase data is
> still in Hadoop HDFS. That's my point: the /hbase data is still there,
> but I can't find any way to start HBase again.
>
>
> HTTP ERROR: 500
>
> Trying to contact region server null for region , row '', but failed after 3
> attempts.
> Exceptions:
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region because: Failed setting up proxy to
> /192.168.1.123:60020 after attempts=1
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region because: Failed setting up proxy to
> /192.168.1.123:60020 after attempts=1
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region because: Failed setting up proxy to
> /192.168.1.123:60020 after attempts=1
>
> RequestURI=/master.jsp
> Caused by:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> region server null for region , row '', but failed after 3 attempts.
> Exceptions:
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region because: Failed setting up proxy to
> /192.168.1.123:60020 after attempts=1
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region because: Failed setting up proxy to
> /192.168.1.123:60020 after attempts=1
>
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region because: Failed setting up proxy to
> /192.168.1.123:60020 after attempts=1
>
> at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1055)
>
> at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:75)
> at
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:48)
> at
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.listTables(HConnectionManager.java:454)
>
> at
> org.apache.hadoop.hbase.client.HBaseAdmin.listTables(HBaseAdmin.java:127)
> at
> org.apache.hadoop.hbase.generated.master.master_jsp._jspService(master_jsp.java:132)
> at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
>
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> at
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
> at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
>
> at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> at
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
> at
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> at
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>
> at org.mortbay.jetty.Server.handle(Server.java:324)
> at
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
> at
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
>
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
> at
> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>
> at
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
>
> *Powered by Jetty:// <http://jetty.mortbay.org/>*
>
> 2010/6/24 梁景明 <[email protected]>
>
>> Exactly like this. It's some problem with ZooKeeper; I'm not sure what
>> happened to ZooKeeper. Everything reports as started, but ports 60030
>> and 60010 are not OK.
>>
>> ---------------------------------------------------------------------------
>> futur...@test1:~/hbase$ bin/start-hbase.sh
>> test1: zookeeper running as process 18596. Stop it first.
>> master running as process 20047. Stop it first.
>> s1.idfs.cn: regionserver running as process 18829. Stop it first.
>> s2.idfs.cn: regionserver running as process 18763. Stop it first.
>>
>> ------------------------------------------------------------------------------------------
>>
>> And the HBase logs give me the following; I don't know how to deal with
>> it. If ZooKeeper is dead or has some problem, what should I do?
>> stop-hbase.sh and start-hbase.sh don't work at all.
>>
>>
>> ------------------------------------------------------------------------------------------------------------
>> 2010-06-24 11:33:29,713 WARN
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to create /hbase
>> -- check quorum servers, currently=test1:2222
>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss for /hbase
>> at
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>> at
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
>> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:808)
>> at
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureExists(ZooKeeperWrapper.java:405)
>> at
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.ensureParentExists(ZooKeeperWrapper.java:432)
>> at
>> org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.writeMasterAddress(ZooKeeperWrapper.java:520)
>> at
>> org.apache.hadoop.hbase.master.HMaster.writeAddressToZooKeeper(HMaster.java:260)
>> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:242)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> Method)
>> at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>> at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>> at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1230)
>> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1271)
>> 2010-06-24 11:33:31,202 INFO org.apache.zookeeper.ClientCnxn: Attempting
>> connection to server test1/192.168.1.122:2222
>> 2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Priming
>> connection to java.nio.channels.SocketChannel[connected local=/
>> 192.168.1.122:52706 remote=test1/192.168.1.122:2222]
>> 2010-06-24 11:33:31,203 INFO org.apache.zookeeper.ClientCnxn: Server
>> connection successful
>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Exception
>> closing session 0x0 to sun.nio.ch.selectionkeyi...@163f7a1
>> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0
>> lim=4 cap=4]
>> at
>> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>> exception during shutdown input
>> java.net.SocketException: Transport endpoint is not connected
>> at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>> at
>> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
>> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>> at
>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>> 2010-06-24 11:33:31,204 WARN org.apache.zookeeper.ClientCnxn: Ignoring
>> exception during shutdown output
>> java.net.SocketException: Transport endpoint is not connected
>> at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>> at
>> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>> at
>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>>
>> 2010/6/22 Jean-Daniel Cryans <[email protected]>
>>
>> I'm not sure I understand what you describe, and since you didn't post
>>> any output from your logs then it's really hard to help you debug.
>>>
>>> What's the problem exactly and do you see any exception in the logs?
>>>
>>> J-D
>>>
>>> On Mon, Jun 21, 2010 at 2:48 AM, 梁景明 <[email protected]> wrote:
>>> > After reading "Description of how HBase uses ZooKeeper", I think my
>>> > problem may be that the region server session in ZK was lost!
>>> >
>>> > And bin/start-hbase.sh can't start HBase successfully.
>>> >
>>> > Is that because something was lost in their connection to ZooKeeper?
>>> >
>>> > To start it, one idea: start ZooKeeper alone, delete "/hbase" in it,
>>> > and then run the start-hbase.sh script again?
>>> >
>>> > Will that be OK?
>>> >
>>> > 2010/6/19 Jean-Daniel Cryans <[email protected]>
>>> >
>>> >> > Do you mean that if ZooKeeper is dead, the data will be lost?
>>> >>
>>> >> If your Zookeeper ensemble is dead, then HBase will be unavailable but
>>> >> you won't lose any data. And even if your zookeeper data is wiped out,
>>> >> like I said it's only runtime data so it doesn't matter.
>>> >>
>>> >> >
>>> >> > In that case, if ZooKeeper lost .META. or -ROOT-, can the data in
>>> >> > Hadoop never be recovered, even though there are some table folders
>>> >> > in Hadoop?
>>> >>
>>> >> HBase stores the location of -ROOT- in ZooKeeper, and that's changed
>>> >> every time the region moves. Losing that won't make -ROOT- disappear
>>> >> forever; it's still in HDFS.
>>> >>
>>> >> Does it answer the question? (I'm not sure I fully understand you)
>>> >>
>>> >> J-D
>>> >>
>>> >
>>>
>>
>>
>