Hello,
    Thanks, J-D.
    We ran the same test again 3 days ago and got the same result: the master
killed itself after running for 2 days. Now we have 2 questions.
    1. Is it normal for the master to kill itself so quickly? If not,
what can we do to improve this?
    2 "Starting a Master on any node should be ok to recover, HBase is built
for that."
       Did you mean that a master should be started automatically, or that we
should start one ourselves? By the way, what does ZK do? We thought ZK was
responsible for restarting a master when the old one dies. Is that right?

    Thank you,
    LvZheng.

2009/8/16 Zheng Lv <[email protected]>

> Hello,
>     Thank you for your suggestions.
>     A few days ago we found that our routing table had some problems;
> after fixing it, we are now sure that the bandwidth is OK.
>     We are also using LZO compression.
>     So we started the test program again, but after running normally for 23
> hours, the master killed itself. Part of the log follows.
>     By the way, this time we inserted only 10 webpages per second.
> 2009-08-14 13:36:31,840 INFO org.apache.hadoop.hbase.master.ServerManager: 4 region servers, 0 dead, average load 48.75
> 2009-08-14 13:36:32,016 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {server: 192.168.33.5:60020, regionname: .META.,,1, startKey: <>}
> 2009-08-14 13:36:32,076 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {server: 192.168.33.6:60020, regionname: -ROOT-,,0, startKey: <>}
> 2009-08-14 13:36:32,084 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of 1 row(s) of meta region {server: 192.168.33.6:60020, regionname: -ROOT-,,0, startKey: <>} complete
> 2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of 193 row(s) of meta region {server: 192.168.33.5:60020, regionname: .META.,,1, startKey: <>} complete
> 2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
> 2009-08-14 13:37:00,366 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80001 to sun.nio.ch.selectionkeyi...@4a407c9f
> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>         at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:00,881 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu3/192.168.33.8:2222
> 2009-08-14 13:37:04,366 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80000 to sun.nio.ch.selectionkeyi...@4ac6ee33
> java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>         at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:04,721 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu2/192.168.33.9:2222
> 2009-08-14 13:37:08,872 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80001 to sun.nio.ch.selectionkeyi...@2e93ebe0
> java.io.IOException: TIMED OUT
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:08,873 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>         at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>         at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>         at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:09,486 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu2/192.168.33.9:2222
> 2009-08-14 13:37:12,712 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80000 to sun.nio.ch.selectionkeyi...@7162d703
> java.io.IOException: TIMED OUT
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:12,713 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>         at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>         at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>         at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:13,032 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu3/192.168.33.8:2222
> 2009-08-14 13:37:17,482 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80001 to sun.nio.ch.selectionkeyi...@1012401d
> java.io.IOException: TIMED OUT
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:17,483 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>         at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>         at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>         at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:17,856 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu7/192.168.33.6:2222
> 2009-08-14 13:37:19,445 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/192.168.33.7:40923 remote=ubuntu7/192.168.33.6:2222]
> 2009-08-14 13:37:19,445 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
> 2009-08-14 13:37:21,022 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80000 to sun.nio.ch.selectionkeyi...@2e101b3a
> java.io.IOException: TIMED OUT
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
> 2009-08-14 13:37:21,023 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
> java.net.SocketException: Transport endpoint is not connected
>         at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
>         at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>         at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>         at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:956)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:922)
> 2009-08-14 13:37:21,908 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server ubuntu7/192.168.33.6:2222
> 2009-08-14 13:37:21,908 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/192.168.33.7:40926 remote=ubuntu7/192.168.33.6:2222]
> 2009-08-14 13:37:21,909 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
> 2009-08-14 13:37:21,911 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x22313002be80000 to sun.nio.ch.selectionkeyi...@6bdfe124
> java.io.IOException: Session Expired
>         at org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
>         at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
> 2009-08-14 13:37:21,912 ERROR org.apache.hadoop.hbase.master.HMaster: Master lost its znode, killing itself now
> Regards,
> LvZheng
>
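
For reference, the timing in the quoted log can be checked with a few lines of Python. This is only a minimal sketch, assuming the log4j timestamp layout seen in the log; `parse_ts` is a hypothetical helper, and the three sample lines are copied (shortened) from the log.

```python
from datetime import datetime

def parse_ts(line):
    """Hypothetical helper: read the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")

# Shortened copies of three lines from the quoted master log.
last_normal = "2009-08-14 13:36:32,316 INFO org.apache.hadoop.hbase.master.BaseScanner"
first_warn = "2009-08-14 13:37:00,366 WARN org.apache.zookeeper.ClientCnxn"
expired = "2009-08-14 13:37:21,911 WARN org.apache.zookeeper.ClientCnxn"

# Seconds with no log output before the first ZooKeeper read error.
quiet_gap = (parse_ts(first_warn) - parse_ts(last_normal)).total_seconds()
# Seconds from the first read error to the "Session Expired" warning.
expiry_gap = (parse_ts(expired) - parse_ts(first_warn)).total_seconds()

print("quiet before first error: %.3f s" % quiet_gap)
print("first error to session expired: %.3f s" % expiry_gap)
```

On these lines the master logged nothing for about 28 seconds before the first read error, and the session was reported as expired about 21.5 seconds after that.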
