Could you paste a bigger log trace (interested to see at least all 300 lines before that) in a pastebin?
Thx, J-D On Wed, Nov 11, 2009 at 10:56 AM, Zhenyu Zhong <[email protected]> wrote: > J-D, > > Thanks for the reply. > > The problem I have is one or more of the regionservers disconnect from the > cluster. > > I checked the resource usage including memory Heap usage for regionserver, > and found nothing suspicious. The regionserver heap usage is way below the > maxHeap defined in jvm. > > So I checked zookeeper log, then I found the following > BufferUnderflowException, eventually it leads to SocketException: Transport > endpoint is not connected. > > > I also checked the regionserver log. It looks like the regionserver tried to > connect to other zookeeper quorum server. It connects successfully, but then > it complained about SocketException: transport endpoint is not connected. > > So both zookeeper log and regionserver log told me that the transport > endpoint is not connected. I don't know why. > > I would be very appreciated if you could point me to the right direction. > > Best, > > > Regionserver log: > 2009-11-11 04:24:53,718 INFO org.apache.zookeeper.ClientCnxn: Attempting > connection to server superpyxis0018/192.168.100.132:2181 > 2009-11-11 04:24:53,718 INFO org.apache.zookeeper.ClientCnxn: Priming > connection to java.nio.channels.SocketChannel[connected local=/ > 192.168.100.131:43065 remote=superpyxis0018/192.168.100.132:2181] > 2009-11-11 04:24:53,718 INFO org.apache.zookeeper.ClientCnxn: Server > connection successful > 2009-11-11 04:24:53,768 INFO > org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, > state: Disconnected, type: None, path: null > 2009-11-11 04:24:54,118 INFO org.apache.zookeeper.ClientCnxn: Attempting > connection to server superpyxis0017/192.168.100.131:2181 > 2009-11-11 04:24:54,118 INFO org.apache.zookeeper.ClientCnxn: Priming > connection to java.nio.channels.SocketChannel[connected local=/ > 192.168.100.131:46903 remote=superpyxis0017/192.168.100.131:2181] > 2009-11-11 04:24:54,118 INFO org.apache.zookeeper.ClientCnxn: Server > connection successful > 2009-11-11 04:24:54,118 WARN org.apache.zookeeper.ClientCnxn: Exception > closing session 0x24dfd90c810002 to sun.nio.ch.selectionkeyi...@2dd78ccf > java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 > lim=4 cap=4] > at > org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701) > at > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945) > 2009-11-11 04:24:54,118 WARN org.apache.zookeeper.ClientCnxn: Ignoring > exception during shutdown input > java.net.SocketException: Transport endpoint is not connected > at sun.nio.ch.SocketChannelImpl.shutdown(Native Method) > at > sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640) > at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) > at > org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999) > at > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970) > > > > > > On Wed, Nov 11, 2009 at 1:40 PM, Jean-Daniel Cryans > <[email protected]>wrote: > >> I'm really not sure exactly what the situation is with the >> java.nio.BufferUnderflowException, but what do you mean by >> "regionserver disconnection"? Could you give more details? >> >> Thx, >> >> J-D >> >> On Wed, Nov 11, 2009 at 10:34 AM, Zhenyu Zhong <[email protected]> >> wrote: >> > I searched around and found HBASE-1645 "HBase could be a little more >> dainty >> > when zk is pulled out from under it" >> > >> > I am wondering whether my problem is related to this HBase-1645. Or >> probably >> > has nothing to do with it? >> > >> > thanks >> > >> > >> > On Wed, Nov 11, 2009 at 12:02 PM, Zhenyu Zhong <[email protected] >> >wrote: >> > >> >> Hi, >> >> >> >> I have been monitoring the system resources such as memory Heap usage >> for >> >> Regionserver, load average, network bandwidth, etc. >> >> >> >> I experienced a disconnection last night. When I checked the memory Heap >> >> for Regionserver, it was 2336 which is way below the max Heap 4096. I >> don't >> >> think it is Regionserver problem. The requests that the regionserver got >> was >> >> about 126 which doesn't seem to be high. The load average is kind of >> high >> >> >100. (but I saw disconnections while the load average is around 30). >> The >> >> bandwidth is not a problem. >> >> >> >> However, I found error messages from the zookeeper. Please see the >> >> following error messages. There are some >> >> BufferUnderflowException error, IOException error,etc. >> >> May I ask what might be the causes that generate these error messages? >> >> >> >> Many thanks >> >> zhenyu >> >> >> >> >> >> 2009-11-11 04:29:00,810 WARN org.apache.zookeeper.server.Request: >> Ignoring >> >> exception during toString >> >> java.nio.BufferUnderflowException >> >> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127) >> >> at java.nio.ByteBuffer.get(ByteBuffer.java:675) >> >> at >> org.apache.zookeeper.server.Request.toString(Request.java:199) >> >> at java.lang.String.valueOf(String.java:2827) >> >> at java.lang.StringBuilder.append(StringBuilder.java:115) >> >> at >> >> >> org.apache.zookeeper.server.quorum.CommitProcessor.processRequest(CommitProcessor.java:167) >> >> at >> >> >> org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:68) >> >> 2009-11-11 04:29:00,810 WARN org.apache.zookeeper.server.NIOServerCnxn: >> >> Exception causing close of session 0x424dfd908c50009 >> >> due to java.io.IOException: Read error >> >> 2009-11-11 04:29:00,810 DEBUG org.apache.zookeeper.server.NIOServerCnxn: >> >> IOException stack trace >> >> java.io.IOException: Read error >> >> at >> >> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:372) >> >> at >> >> >> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:221) >> >> 2009-11-11 04:29:00,810 DEBUG >> >> org.apache.zookeeper.server.quorum.CommitProcessor: Processing request:: >> >> sessionid:0x424dfd908c50009 type:setWatches cxid:0xfffffffffffffff8 >> >> zxid:0xfffffffffffffffe txntype:unknown n/a >> >> 2009-11-11 04:29:00,810 WARN org.apache.zookeeper.server.Request: >> Ignoring >> >> exception during toString >> >> java.nio.BufferUnderflowException >> >> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127) >> >> at java.nio.ByteBuffer.get(ByteBuffer.java:675) >> >> at >> org.apache.zookeeper.server.Request.toString(Request.java:199) >> >> at java.lang.String.valueOf(String.java:2827) >> >> at java.lang.StringBuilder.append(StringBuilder.java:115) >> >> at >> >> >> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:74) >> >> at >> >> >> org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73) >> >> 2009-11-11 04:29:00,810 DEBUG >> >> org.apache.zookeeper.server.FinalRequestProcessor: Processing request:: >> >> sessionid:0x424dfd908c50009 type:setWatches cxid:0xfffffffffffffff8 >> >> zxid:0xfffffffffffffffe txntype:unknown n/a >> >> 2009-11-11 04:29:00,810 WARN org.apache.zookeeper.server.Request: >> Ignoring >> >> exception during toString >> >> java.nio.BufferUnderflowException >> >> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127) >> >> at java.nio.ByteBuffer.get(ByteBuffer.java:675) >> >> at >> org.apache.zookeeper.server.Request.toString(Request.java:199) >> >> at >> >> org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:36) >> >> at >> >> org.apache.log4j.or.RendererMap.findAndRender(RendererMap.java:80) >> >> at >> >> >> org.apache.log4j.spi.LoggingEvent.getRenderedMessage(LoggingEvent.java:362) >> >> at >> >> >> org.apache.log4j.helpers.PatternParser$BasicPatternConverter.convert(PatternParser.java:403) >> >> at >> >> >> org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65) >> >> at org.apache.log4j.PatternLayout.format(PatternLayout.java:502) >> >> at >> >> org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:302) >> >> at >> >> >> org.apache.log4j.DailyRollingFileAppender.subAppend(DailyRollingFileAppender.java:359) >> >> at >> org.apache.log4j.WriterAppender.append(WriterAppender.java:160) >> >> at >> >> org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) >> >> at >> >> >> org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) >> >> at org.apache.log4j.Category.callAppenders(Category.java:206) >> >> at org.apache.log4j.Category.forcedLog(Category.java:391) >> >> at org.apache.log4j.Category.debug(Category.java:260) >> >> at >> >> >> org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:147) >> >> at >> >> >> org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73) >> >> 2009-11-11 04:29:00,810 WARN org.apache.zookeeper.server.Request: >> Ignoring >> >> exception during toString >> >> sessionid:0x424dfd908c50009 type:setWatches cxid:0xfffffffffffffff8 >> >> zxid:0xfffffffffffffffe txntype:unknown n/a >> >> 2009-11-11 04:29:00,810 INFO org.apache.zookeeper.server.NIOServerCnxn: >> >> closing session:0x424dfd908c50009 NIOServerCnxn: >> >> java.nio.channels.SocketChannel[connected local=/192.168.100.131:2181 >> remote=/ >> >> 192.168.100.131:35500] >> >> 2009-11-11 04:29:00,810 DEBUG org.apache.zookeeper.server.NIOServerCnxn: >> >> ignoring exception during input shutdown >> >> java.net.SocketException: Transport endpoint is not connected >> >> at sun.nio.ch.SocketChannelImpl.shutdown(Native Method) >> >> at >> >> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640) >> >> at >> sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) >> >> at >> >> org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:848) >> >> at >> >> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:498) >> >> at >> >> >> org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:221) >> >> >> > >> >
