I found this. http://issues.apache.org/jira/browse/ZOOKEEPER-86
It looks like the same error I had. Is it a zookeeper bug? When will HBase take the zookeeper version 3.3.0? thanks zhenyu On Sat, Nov 14, 2009 at 11:31 AM, Zhenyu Zhong <[email protected]>wrote: > Now I really doubt about the zookeeper, from the log I saw errors like > IOException Read Error, while the zookeeper client (regionserver ) tried to > read. But it got disconneted status from zookeeper. > > I don't see any load on any zookeeper quorum servers. DiskIO is fine. > > Also when the regionserver decides to exit due to the disconnect status > from zookeeper, sometimes the regionserver hangs during the exiting. We can > still see the HRegionServer process even we don't see it in the Master web > interface. > > I also notice that there is an zookeeper.retries and zookeeper.pauses > settings in hbase-default.xml, would that matter ? > > thanks > > You may see the IOException error in the following log when the > Regionserver lost connection to the zookeeper. > > 2009-11-14 15:58:31,993 INFO org.apache.hadoop.hbase.regionserver.HLog: > Roll > /hbase/.logs/superpyxis0008.scur.colo,60021,1258179101617/hlog.dat.1258179106921, > entries=1062964, calcsize=255014936, filesize=139566093. New hlog > /hbase/.logs/superpyxis0008.scur.colo,60021,1258179101617/hlog.dat.1258214311993 > 2009-11-14 15:58:39,295 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 0ms > 2009-11-14 15:58:52,648 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 0ms > 2009-11-14 15:59:06,007 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 0ms > 2009-11-14 15:59:19,365 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 0ms > 2009-11-14 15:59:32,867 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 186ms > 2009-11-14 15:59:46,070 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 0ms > 2009-11-14 15:59:59,423 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 0ms > 2009-11-14 16:00:12,775 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 0ms > 2009-11-14 16:00:28,141 DEBUG org.apache.zookeeper.ClientCnxn: Got ping > response for sessionid:0x424f0ed01ea00a6 after 2010ms > 2009-11-14 16:01:10,378 WARN org.apache.zookeeper.ClientCnxn: Exception > closing session 0x424f0ed01ea00a6 to sun.nio.ch.selectionkeyi...@39fbb2d6 > > java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 > lim=4 cap=4] > at > org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701) > at > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945) > 2009-11-14 16:01:10,478 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, > state: Disconnected, type: None, path: null > 2009-11-14 16:01:11,333 WARN > org.apache.hadoop.hbase.regionserver.HRegionServer: Attempt=1 > > org.apache.hadoop.hbase.Leases$LeaseStillHeldException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) > at java.lang.reflect.Constructor.newInstance(Constructor.java:513) > at > org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94) > at > org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48) > at > org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:571) > at java.lang.Thread.run(Thread.java:619) > 2009-11-14 16:01:11,433 INFO org.apache.zookeeper.ClientCnxn: Attempting > connection to server superpyxis0001/192.168.100.115:2181 > 2009-11-14 16:01:11,433 INFO org.apache.zookeeper.ClientCnxn: Priming > connection to java.nio.channels.SocketChannel[connected local=/ > 192.168.100.122:59575 remote=superpyxis0001/192.168.100.115:2181] > 2009-11-14 16:01:11,433 INFO org.apache.zookeeper.ClientCnxn: Server > connection successful > 2009-11-14 16:01:11,433 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, > state: Expired, type: None, path: null > 2009-11-14 16:01:11,433 WARN org.apache.zookeeper.ClientCnxn: Exception > closing session 0x424f0ed01ea00a6 to sun.nio.ch.selectionkeyi...@6b93d343 > java.io.IOException: Session Expired > at > org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:589) > at > org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:709) > > at > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945) > 2009-11-14 16:01:11,433 ERROR > org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session > expired > 2009-11-14 16:01:13,513 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server > handler 3 on 60021, call put([...@73deb204, > [Lorg.apache.hadoop.hbase.client.Put;@2179600a) from 192.168.100.132:40728: > error: java.io.IOException: Server not running, aborting > java.io.IOException: Server not running, aborting > > > > > > On Sat, Nov 14, 2009 at 1:01 AM, Zhenyu Zhong <[email protected]>wrote: > >> Stack, >> >> Thanks a lot! >> >> I found out that the reason why HBase doesn't take the system file >> descriptor value I set before. I started the HBase using root instead of the >> normal hadoop user, while my system configures hadoop with higher file >> descriptor value but configures root with 1024 default value. >> >> >> Now my system has a clean start. However, it seems that the >> zookeeper.session.timeout value doesn't take into effect. I still found >> around 60 seconds pauses from the disconnected regionserver. I really don't >> know why regionserver only times out after 60 seconds instead of 10 minutes >> which I set for zookeeper.session.timeout. >> >> Is there any other timeout value coming into play before the actual >> session times out? >> >> zhenyu >> >> >> >> On Fri, Nov 13, 2009 at 7:08 PM, stack <[email protected]> wrote: >> >>> Ok. Lack of file descriptors manifests in all kinds of weird ways. >>> Hopefully thats it. If not, lets keep digging. >>> St.Ack >>> >>> On Fri, Nov 13, 2009 at 3:44 PM, Zhenyu Zhong <[email protected] >>> >wrote: >>> >>> > Stack, >>> > >>> > You are right, the master started with ulimit -n 1024. It doesn't take >>> the >>> > system value. >>> > >>> > Regarding to the too many open files, it looks like the same as the one >>> J-D >>> > put up. But I will get the Master start with higher value first and see >>> if >>> > this kind of error goes away. >>> > >>> > thanks a lot! >>> > zhenyu >>> > >>> > On Fri, Nov 13, 2009 at 6:02 PM, stack <[email protected]> wrote: >>> > >>> > > Does it say >>> > > >>> > > ulimit -n 32768 >>> > > >>> > > ...as the second line in your log file on start of the master? >>> > > >>> > > Is the stack trace that complains about too many open files same as >>> the >>> > one >>> > > in the blog post J-D put up? >>> > > >>> > > St.Ack >>> > > >>> > > >>> > > On Fri, Nov 13, 2009 at 1:37 PM, Zhenyu Zhong < >>> [email protected] >>> > > >wrote: >>> > > >>> > > > The ulimit file descriptors was set to fs.file-max =1578334, also >>> in >>> > the >>> > > > limits.conf the value was set to 32768. >>> > > > So these are way higher than the open descriptors for the running >>> > > > processes. >>> > > > >>> > > > thanks >>> > > > zhenyu >>> > > > >>> > > > >>> > > > On Fri, Nov 13, 2009 at 4:33 PM, Stack <[email protected]> >>> wrote: >>> > > > >>> > > > > You upped the ulimit file descriptors as per the getting started >>> doc? >>> > > > > >>> > > > > >>> > > > > >>> > > > > On Nov 13, 2009, at 1:26 PM, Zhenyu Zhong < >>> [email protected]> >>> > > > wrote: >>> > > > > >>> > > > > Thanks a lot. >>> > > > >> >>> > > > >> >>> > > > >> Bad news is my kernel is still 2.6.26. >>> > > > >> But it was not a problem before. >>> > > > >> >>> > > > >> Very strange. >>> > > > >> >>> > > > >> zhenyu >>> > > > >> >>> > > > >> On Fri, Nov 13, 2009 at 4:16 PM, Jean-Daniel Cryans < >>> > > > [email protected] >>> > > > >> >wrote: >>> > > > >> >>> > > > >> Looks like >>> > > > >>> >>> > > > >>> >>> > > > >>> > > >>> > >>> http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linux-kernel-2627-epoll-limits/ >>> > > > >>> >>> > > > >>> J-D >>> > > > >>> >>> > > > >>> On Fri, Nov 13, 2009 at 1:12 PM, Zhenyu Zhong < >>> > > [email protected] >>> > > > > >>> > > > >>> wrote: >>> > > > >>> >>> > > > >>>> Hi, >>> > > > >>>> >>> > > > >>>> >>> > > > >>>> After I re-organize the cluster, the experiment ran into >>> problem >>> > > > faster >>> > > > >>>> >>> > > > >>> than >>> > > > >>> >>> > > > >>>> before. >>> > > > >>>> >>> > > > >>>> Basically the changes are to use machines with less resources >>> as >>> > > > >>>> >>> > > > >>> zookeeper >>> > > > >>> >>> > > > >>>> quorums and machines with more resources as regionserver. >>> > > > >>>> >>> > > > >>>> From the log, I still see the pause around 1 minute. >>> > > > >>>> I enabled the GC logging, please see >>> > > > >>>> >>> > > > >>>> http://pastebin.com/m1d4ce0f1 >>> > > > >>>> >>> > > > >>>> for details. >>> > > > >>>> In general I don't see many pauses in the GC. >>> > > > >>>> >>> > > > >>>> What is more interesting, 1 hour after the 1st regionserver >>> > > > >>>> disconnected, >>> > > > >>>> the master log complained about too many open files. This >>> didn't >>> > > > happen >>> > > > >>>> before. >>> > > > >>>> I checked the system OS setup as well as the limits.conf. I >>> also >>> > > > checked >>> > > > >>>> >>> > > > >>> the >>> > > > >>> >>> > > > >>>> running processes. The total open files don't reach the limit. >>> I >>> > am >>> > > > >>>> >>> > > > >>> confused >>> > > > >>> >>> > > > >>>> a bit. >>> > > > >>>> >>> > > > >>>> >>> > > > >>>> Please see the following master log. >>> > > > >>>> >>> > > > >>>> 2009-11-13 20:06:02,114 INFO >>> > > > org.apache.hadoop.hbase.master.BaseScanner: >>> > > > >>>> RegionManager.metaScanner scan of 4658 row(s) of meta region >>> > > {server: >>> > > > >>>> 192.168.100.128:60021, regionname: .META.,,1, startKey: <>} >>> > > complete >>> > > > >>>> 2009-11-13 20:06:02,114 INFO >>> > > > org.apache.hadoop.hbase.master.BaseScanner: >>> > > > >>>> >>> > > > >>> All >>> > > > >>> >>> > > > >>>> 1 .META. region(s) scanned >>> > > > >>>> 2009-11-13 20:06:07,677 DEBUG org.apache.zookeeper.ClientCnxn: >>> Got >>> > > > ping >>> > > > >>>> response for sessionid:0x424eebf1c10004c after 3ms >>> > > > >>>> 2009-11-13 20:06:08,178 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.io.IOException: Bad connect ack >>> with >>> > > > >>>> firstBadLink 192.168.100.123:50010 >>> > > > >>>> 2009-11-13 20:06:08,178 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_-2808245019291145247_5478039 >>> > > > >>>> 2009-11-13 20:06:09,682 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.io.EOFException >>> > > > >>>> 2009-11-13 20:06:09,682 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_1074853606841896259_5478048 >>> > > > >>>> 2009-11-13 20:06:10,334 DEBUG org.apache.zookeeper.ClientCnxn: >>> Got >>> > > > ping >>> > > > >>>> response for sessionid:0x24eebf1043003c after 1ms >>> > > > >>>> 2009-11-13 20:06:21,018 DEBUG org.apache.zookeeper.ClientCnxn: >>> Got >>> > > > ping >>> > > > >>>> response for sessionid:0x424eebf1c10004c after 0ms >>> > > > >>>> 2009-11-13 20:06:23,674 DEBUG org.apache.zookeeper.ClientCnxn: >>> Got >>> > > > ping >>> > > > >>>> response for sessionid:0x24eebf1043003c after 0ms >>> > > > >>>> 2009-11-13 20:06:24,828 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.io.IOException: Bad connect ack >>> with >>> > > > >>>> firstBadLink 192.168.100.123:50010 >>> > > > >>>> 2009-11-13 20:06:24,828 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_-6642544517082142289_5478063 >>> > > > >>>> 2009-11-13 20:06:24,828 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.net.SocketException: Too many >>> open >>> > > files >>> > > > >>>> 2009-11-13 20:06:24,828 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_2057511041109796090_5478063 >>> > > > >>>> 2009-11-13 20:06:24,928 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.net.SocketException: Too many >>> open >>> > > files >>> > > > >>>> 2009-11-13 20:06:24,928 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_8219260302213892894_5478064 >>> > > > >>>> 2009-11-13 20:06:30,855 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.net.SocketException: Too many >>> open >>> > > files >>> > > > >>>> 2009-11-13 20:06:30,855 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_1669205542853067709_5478235 >>> > > > >>>> 2009-11-13 20:06:30,905 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.net.SocketException: Too many >>> open >>> > > files >>> > > > >>>> 2009-11-13 20:06:30,905 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_9128897691346270351_5478237 >>> > > > >>>> 2009-11-13 20:06:30,955 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.net.SocketException: Too many >>> open >>> > > files >>> > > > >>>> 2009-11-13 20:06:30,955 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_1116845144864123018_5478240 >>> > > > >>>> 2009-11-13 20:06:34,372 DEBUG org.apache.zookeeper.ClientCnxn: >>> Got >>> > > > ping >>> > > > >>>> response for sessionid:0x424eebf1c10004c after 0ms >>> > > > >>>> 2009-11-13 20:06:37,034 DEBUG org.apache.zookeeper.ClientCnxn: >>> Got >>> > > > ping >>> > > > >>>> response for sessionid:0x24eebf1043003c after 0ms >>> > > > >>>> 2009-11-13 20:06:37,235 WARN org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> >>> > > > >>> DataStreamer >>> > > > >>> >>> > > > >>>> Exception: java.io.IOException: Too many open files >>> > > > >>>> at sun.nio.ch.IOUtil.initPipe(Native Method) >>> > > > >>>> at >>> > > sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:49) >>> > > > >>>> at >>> > > > >>>> >>> > > > >>>> >>> > > > >>> > > >>> > >>> sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18) >>> > > > >>> >>> > > > >>>> at >>> > > > >>>> >>> > > > >>>> >>> > > > >>> > > >>> > >>> org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:407) >>> > > > >>> >>> > > > >>>> at >>> > > > >>>> >>> > > > >>>> >>> > > > >>> > > >>> > >>> org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:322) >>> > > > >>> >>> > > > >>>> at >>> > > > >>>> >>> > > > >>>> >>> > > > >>> > > >>> > >>> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157) >>> > > > >>> >>> > > > >>>> at >>> > > > >>>> >>> > > > >>>> >>> > > > >>> > > >>> > >>> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146) >>> > > > >>> >>> > > > >>>> at >>> > > > >>>> >>> > > > >>>> >>> > > > >>> > > >>> > >>> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107) >>> > > > >>> >>> > > > >>>> at >>> > > > >>>> >>> > > > >>> >>> java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) >>> > > > >>> >>> > > > >>>> at >>> java.io.DataOutputStream.write(DataOutputStream.java:90) >>> > > > >>>> at >>> > > > >>>> >>> > > > >>>> >>> > > > >>> > > >>> > >>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2290) >>> > > > >>> >>> > > > >>>> >>> > > > >>>> 2009-11-13 20:06:37,235 WARN org.apache.hadoop.hdfs.DFSClient: >>> > Error >>> > > > >>>> Recovery for block blk_8148813491785406356_5478475 bad >>> datanode[0] >>> > > > >>>> 192.168.100.123:50010 >>> > > > >>>> 2009-11-13 20:06:37,235 WARN org.apache.hadoop.hdfs.DFSClient: >>> > Error >>> > > > >>>> Recovery for block blk_8148813491785406356_5478475 in pipeline >>> > > > >>>> 192.168.100.123:50010, 192.168.100.134:50010, >>> > 192.168.100.122:50010 >>> > > : >>> > > > >>>> bad >>> > > > >>>> datanode 192.168.100.123:50010 >>> > > > >>>> 2009-11-13 20:06:37,436 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > Exception >>> > > > >>>> >>> > > > >>> in >>> > > > >>> >>> > > > >>>> createBlockOutputStream java.net.SocketException: Too many >>> open >>> > > files >>> > > > >>>> 2009-11-13 20:06:37,436 INFO org.apache.hadoop.hdfs.DFSClient: >>> > > > >>>> Abandoning >>> > > > >>>> block blk_2119727700857186236_5478498 >>> > > > >>>> >>> > > > >>>> >>> > > > >>>> >>> > > > >>>> >>> > > > >>>> >>> > > > >>>> On Thu, Nov 12, 2009 at 4:21 PM, Zhenyu Zhong < >>> > > > [email protected] >>> > > > >>>> wrote: >>> > > > >>>> >>> > > > >>>> Will do. >>> > > > >>>>> >>> > > > >>>>> thanks >>> > > > >>>>> zhenyu >>> > > > >>>>> >>> > > > >>>>> >>> > > > >>>>> On Thu, Nov 12, 2009 at 3:33 PM, stack <[email protected]> >>> wrote: >>> > > > >>>>> >>> > > > >>>>> Enable GC logging too on this next run (see hbase-env.sh). >>> Lets >>> > > try >>> > > > >>>>>> >>> > > > >>>>> and >>> > > > >>> >>> > > > >>>> get >>> > > > >>>>>> to the bottom of whats going on. >>> > > > >>>>>> Thanks, >>> > > > >>>>>> St.Ack >>> > > > >>>>>> >>> > > > >>>>>> On Thu, Nov 12, 2009 at 12:29 PM, Zhenyu Zhong < >>> > > > >>>>>> >>> > > > >>>>> [email protected] >>> > > > >>> >>> > > > >>>> wrote: >>> > > > >>>>>>> >>> > > > >>>>>> >>> > > > >>>>>> I can switch the boxes that run zookeeper with the ones >>> that >>> > run >>> > > > >>>>>>> regionservers. >>> > > > >>>>>>> I will see the results later. >>> > > > >>>>>>> >>> > > > >>>>>>> FYI. The node does have the 10 minutes >>> > zookeeper.session.timeout >>> > > > >>>>>>> >>> > > > >>>>>> value >>> > > > >>> >>> > > > >>>> in >>> > > > >>>>>> >>> > > > >>>>>>> place. >>> > > > >>>>>>> >>> > > > >>>>>>> thanks >>> > > > >>>>>>> zhenyu >>> > > > >>>>>>> >>> > > > >>>>>>> On Thu, Nov 12, 2009 at 3:21 PM, stack <[email protected]> >>> > wrote: >>> > > > >>>>>>> >>> > > > >>>>>>> On Thu, Nov 12, 2009 at 11:50 AM, Zhenyu Zhong < >>> > > > >>>>>>>> >>> > > > >>>>>>> [email protected] >>> > > > >>>>>> >>> > > > >>>>>>> wrote: >>> > > > >>>>>>>>> >>> > > > >>>>>>>> >>> > > > >>>>>>>> >>> > > > >>>>>>>>> In my cluster, half of the cluster have 2 disks 400GB >>> each >>> > per >>> > > > >>>>>>>>> >>> > > > >>>>>>>> machine, >>> > > > >>>>>> >>> > > > >>>>>>> and >>> > > > >>>>>>>> >>> > > > >>>>>>>>> half of the cluster have 6 disks per machine. Maybe we >>> > should >>> > > > >>>>>>>>> >>> > > > >>>>>>>> run >>> > > > >>> >>> > > > >>>> zookeeper >>> > > > >>>>>>>>> on the machines with 2 disks and RS on machines with 6 >>> disks? >>> > > > >>>>>>>>> >>> > > > >>>>>>>>> That would make most sense only in the below, it looks >>> like >>> > the >>> > > > >>>>>>>>> >>> > > > >>>>>>>> RS >>> > > > >>> >>> > > > >>>> that >>> > > > >>>>>> >>> > > > >>>>>>> had >>> > > > >>>>>>>> issue had 4 disks? >>> > > > >>>>>>>> >>> > > > >>>>>>>> >>> > > > >>>>>>>> >>> > > > >>>>>>>> BTW, the 10 minutes zookeeper.session.timeout has been >>> set >>> > > during >>> > > > >>>>>>>>> >>> > > > >>>>>>>> the >>> > > > >>>>>> >>> > > > >>>>>>> experiment. >>> > > > >>>>>>>>> >>> > > > >>>>>>>>> >>> > > > >>>>>>>> And for sure this node had it in place? >>> > > > >>>>>>>> St.Ack >>> > > > >>>>>>>> >>> > > > >>>>>>>> >>> > > > >>>>>>>> >>> > > > >>>>>>>> >>> > > > >>>>>>>>> thanks >>> > > > >>>>>>>>> zhenyu >>> > > > >>>>>>>>> >>> > > > >>>>>>>>> On Thu, Nov 12, 2009 at 2:08 PM, stack <[email protected] >>> > >>> > > wrote: >>> > > > >>>>>>>>> >>> > > > >>>>>>>>> On Thu, Nov 12, 2009 at 8:40 AM, Zhenyu Zhong < >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> [email protected] >>> > > > >>>>>>> >>> > > > >>>>>>>> wrote: >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> Though I experienced 2 regionserver disconnection this >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> morning, >>> > > > >>> >>> > > > >>>> it >>> > > > >>>>>> >>> > > > >>>>>>> looks >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> better from the regionserver log. (please see the >>> following >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> log) >>> > > > >>> >>> > > > >>>> >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> http://pastebin.com/m496dbfae >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> I checked diskIO for one of the >>> > regionserver(192.168.100.116) >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> during >>> > > > >>>>>>> >>> > > > >>>>>>>> the >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> time it disconnected. >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> Time: 03:04:58 AM >>> > > > >>>>>>>>>>> Device: tps Blk_read/s Blk_wrtn/s >>> Blk_read >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> Blk_wrtn >>> > > > >>>>>>>> >>> > > > >>>>>>>>> sda 105.31 5458.83 19503.64 >>> 9043873239 >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> 32312473676 >>> > > > >>>>>>>> >>> > > > >>>>>>>>> sda1 2.90 6.64 99.25 >>> 10993934 >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> 164433464 >>> > > > >>>>>>>> >>> > > > >>>>>>>>> sda2 1.72 23.76 30.25 >>> 39365817 >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> 50124008 >>> > > > >>>>>>>> >>> > > > >>>>>>>>> sda3 0.30 0.38 3.58 >>> 624930 >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> 5923000 >>> > > > >>>>>>>> >>> > > > >>>>>>>>> sda4 100.39 5428.06 19370.56 >>> 8992888270 >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> 32091993204 >>> > > > >>>>>>>> >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> Is this high for you? 20k blocks/second would seem to >>> be >>> > > high >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> but >>> > > > >>> >>> > > > >>>> its >>> > > > >>>>>>> >>> > > > >>>>>>>> one >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> disk only and its not being shared by zk anymore so >>> > shouldn't >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> matter? >>> > > > >>>>>> >>> > > > >>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> I also checked the zookeeper quorum server that the >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> regionserver >>> > > > >>> >>> > > > >>>> tried >>> > > > >>>>>>>> >>> > > > >>>>>>>>> to >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> connect according to the log. However, I don't see >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> 192.168.100.116 >>> > > > >>>>>> >>> > > > >>>>>>> in >>> > > > >>>>>>> >>> > > > >>>>>>>> the >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> client list of the zookeeper quorum server that the >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> regionserver >>> > > > >>> >>> > > > >>>> tried >>> > > > >>>>>>>> >>> > > > >>>>>>>>> to >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> connect. >>> > > > >>>>>>>>>>> Would that be a problem? >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> Is that because the ephemeral node for the >>> regionserver >>> > had >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> evaporated? >>> > > > >>>>>>> >>> > > > >>>>>>>> Lost >>> > > > >>>>>>>>>> its lease w/ zk by the time you went to look? >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> Thu Nov 12 15:04:52 UTC 2009 >>> > > > >>>>>>>>>>> Zookeeper version: 3.2.1-808558, built on 08/27/2009 >>> 18:48 >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> GMT >>> > > > >>> >>> > > > >>>> Clients: >>> > > > >>>>>>>>>>> /192.168.100.127:43045[1](queued=0,recved=26,sent=0) >>> > > > >>>>>>>>>>> /192.168.100.131:39091[1](queued=0,recved=964,sent=0) >>> > > > >>>>>>>>>>> /192.168.100.124:35961[1](queued=0,recved=3266,sent=0) >>> > > > >>>>>>>>>>> /192.168.100.123:47935[1](queued=0,recved=233,sent=0) >>> > > > >>>>>>>>>>> /192.168.100.125:46931[1](queued=0,recved=2,sent=0) >>> > > > >>>>>>>>>>> /192.168.100.118:54924[1](queued=0,recved=295,sent=0) >>> > > > >>>>>>>>>>> /192.168.100.118:41390[1](queued=0,recved=2290,sent=0) >>> > > > >>>>>>>>>>> /192.168.100.136:42243[1](queued=0,recved=0,sent=0) >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> Latency min/avg/max: 0/17/6333 >>> > > > >>>>>>>>>>> Received: 47111 >>> > > > >>>>>>>>>>> Sent: 0 >>> > > > >>>>>>>>>>> Outstanding: 0 >>> > > > >>>>>>>>>>> Zxid: 0x77000083f4 >>> > > > >>>>>>>>>>> Mode: leader >>> > > > >>>>>>>>>>> Node count: 23 >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> That 6 second maximum latency is pretty bad but should >>> be >>> > well >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> within >>> > > > >>>>>> >>> > > > >>>>>>> the >>> > > > >>>>>>>> >>> > > > >>>>>>>>> zk >>> > > > >>>>>>>>>> session timeout. >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> So, problem is likely on the zk client side of the >>> session; >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> i.e. >>> > > > >>> >>> > > > >>>> in >>> > > > >>>>>> >>> > > > >>>>>>> the >>> > > > >>>>>>> >>> > > > >>>>>>>> RS. >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> You could enable GC logging as J-D suggested to see if >>> you >>> > > have >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> any >>> > > > >>>>>> >>> > > > >>>>>>> big >>> > > > >>>>>>> >>> > > > >>>>>>>> pauses, pauses > zk session timeout. >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> When the RS went down, it didn't look too heavily >>> loaded: >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> 1. 2009-11-12 15:04:52,830 INFO >>> > > > >>>>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: >>> Dump of >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> metrics: >>> > > > >>>>>>> >>> > > > >>>>>>>> request=1.5166667, regions=322, stores=657, >>> storefiles=631, >>> > > > >>>>>>>>>> storefileIndexSize=61, memstoreSize=1472, >>> usedHeap=2819, >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> maxHeap=4079, >>> > > > >>>>>>>> >>> > > > >>>>>>>>> blockCacheSize=658110960, blockCacheFree=197395984, >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> blockCacheCount=9903, >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> blockCacheHitRatio=99 >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> Its serving a few reads? The number of store files >>> seems >>> > > fine. >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> Not >>> > > > >>>>>> >>> > > > >>>>>>> too >>> > > > >>>>>>>> >>> > > > >>>>>>>>> much memory used. >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> Looking at the logs, I see the Lease Still Held >>> exception. >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> This >>> > > > >>> >>> > > > >>>> happens >>> > > > >>>>>>>> >>> > > > >>>>>>>>> when the RS does its regular report to the master but the >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> master >>> > > > >>> >>> > > > >>>> thinks >>> > > > >>>>>>> >>> > > > >>>>>>>> the >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> RS has since restarted. It'll think this probably >>> because >>> > it >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> noticed >>> > > > >>>>>> >>> > > > >>>>>>> that >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> the RS's znode in zk had gone away and it considered the >>> RS >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> dead. >>> > > > >>> >>> > > > >>>> >>> > > > >>>>>>>>>> Looking too at your logs I see this gap in the zk >>> pinging: >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> 1. 2009-11-12 15:03:39,325 DEBUG >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> org.apache.zookeeper.ClientCnxn: >>> > > > >>>>>> >>> > > > >>>>>>> Got >>> > > > >>>>>>> >>> > > > >>>>>>>> ping response for sessionid:0x224e55436ad0004 after 0ms >>> > > > >>>>>>>>>> 2. 2009-11-12 15:03:43,113 DEBUG >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> org.apache.zookeeper.ClientCnxn: >>> > > > >>>>>> >>> > > > >>>>>>> Got >>> > > > >>>>>>> >>> > > > >>>>>>>> ping response for sessionid:0x24e55436a0007d after 0ms >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> Where in the lines above it, its reporting about every >>> ten >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> seconds, >>> > > > >>>>>> >>> > > > >>>>>>> here >>> > > > >>>>>>>> >>> > > > >>>>>>>>> there is a big gap. >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> Do you have ganglia or something that will let you look >>> more >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> into >>> > > > >>> >>> > > > >>>> what >>> > > > >>>>>>> >>> > > > >>>>>>>> was >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> happening on this machine around this time? Is the >>> machine >>> > > OK? >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> It >>> > > > >>>>>> >>> > > > >>>>>>> looks >>> > > > >>>>>>>> >>> > > > >>>>>>>>> lightly loaded and you have your cluster nicely laid out. >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> Something >>> > > > >>>>>> >>> > > > >>>>>>> odd >>> > > > >>>>>>>> >>> > > > >>>>>>>>> is >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> going on. What about things like the write speed to >>> disk? >>> > In >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> the >>> > > > >>> >>> > > > >>>> past >>> > > > >>>>>>> >>> > > > >>>>>>>> strange issues have been explained by incorrectly set BIOS >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>> which >>> > > > >>> >>> > > > >>>> made >>> > > > >>>>>> >>> > > > >>>>>>> disks >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> run at 1/100th of their proper speed. >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> St.Ack >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>>> Best, >>> > > > >>>>>>>>>>> zhenyu >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> On Wed, Nov 11, 2009 at 3:58 PM, Zhenyu Zhong < >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>> [email protected] >>> > > > >>>>>>>> >>> > > > >>>>>>>>> wrote: >>> > > > >>>>>>>>>>>> >>> > > > >>>>>>>>>>> >>> > > > >>>>>>>>>>> Stack >>> > > > >>>>>>>>>>>> >>> > > > >>>>>>>>>>>> I am very appreciated for your comments. >>> > > > >>>>>>>>>>>> I will use the zookeeper monitoring script on my >>> cluster >>> > > > >>>>>>>>>>>> >>> > > > >>>>>>>>>>> and >>> > > > >>> >>> > > > >>>> let >>> > > > >>>>>> >>> > > > >>>>>>> it >>> > > > >>>>>>> >>> > > > >>>>>>>> run >>> > > > >>>>>>>>> >>> > > > >>>>>>>>>> overnight to see the result. >>> > > > >>>>>>>>>>>> I will follow up the thread when I get anything. >>> > > > >>>>>>>>>>>> >>> > > > >>>>>>>>>>>> thanks >>> > > > >>>>>>>>>>>> zhenyu >>> > > > >>>>>>>>>>>> >>> > > > >>>>>>>>>>>> >>> > > > >>>>>>>>>>>> On Wed, Nov 11, 2009 at 3:52 PM, stack < >>> [email protected]> >>> > > > >>>>>>>>>>>> >>> > > > >>>>>>>>>>> wrote: >>> > > > >>>>>> >>> > > > >>>>>>> >>> > > > >>>>>>>>>>>> I see these in your log too: >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>>> 1. 2009-11-11 04:27:19,018 DEBUG >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>> org.apache.zookeeper.ClientCnxn: >>> > > > >>>>>>>> >>> > > > >>>>>>>>> Got >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>>> ping response for sessionid:0x424dfd908c50009 after >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>> 4544ms >>> > > > >>> >>> > > > >>>> 2. 2009-11-11 04:27:19,018 DEBUG >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>> org.apache.zookeeper.ClientCnxn: >>> > > > >>>>>>>> >>> > > > >>>>>>>>> Got >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>>> ping response for sessionid:0x24dfd90c810002 after >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>> 4548ms >>> > > > >>> >>> > > > >>>> 3. 2009-11-11 04:27:43,960 DEBUG >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>> org.apache.zookeeper.ClientCnxn: >>> > > > >>>>>>>> >>> > > > >>>>>>>>> Got >>> > > > >>>>>>>>>> >>> > > > >>>>>>>>>>> ping response for sessionid:0x424dfd908c50009 after >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>> 9030ms >>> > > > >>> >>> > > > >>>> 4. 2009-11-11 04:27:43,960 DEBUG >>> > > > >>>>>>>>>>>>> >>> > > > >>>>>>>>>>>> org.apache.zookeeper. >>> > > > >>>>>>>> >>> > > > >>>>>>> >>> > > > >>> > > >>> > >>> >> >> >
