How much heap did you give the region server? How much total memory does the box have?
I guess you have read http://hbase.apache.org/book.html#jvm

If you're using JDK 1.7.0_60 or newer, you can consider using G1GC.

Cheers

On Tue, Jun 2, 2015 at 3:26 AM, jeevi tesh <[email protected]> wrote:

> First of all, thanks a lot for coming forward with a helping hand.
>
> Here are my answers along with the questions you asked.
>
> How many zookeeper servers do you have ? Or what is the number of clients
> you have running per host
>
> Ans: I have only one Linux box, which is a single-node system.
> Basically, I have installed HBase on a single system.
>
> what is the configured value of maxClientCnxns in the ZooKeeper servers?
>
> Ans: We are using the default configuration. We have not introduced any new
> value in hbase-site.xml.
>
> Is the issue impacting clients only or is it also impacting the
> RegionServers
>
> Ans: In this case the region server, master node, and client are all the
> same machine, because we have installed HBase on a single system.
>
> Have you looked into why the ZooKeeper server is no longer accepting
> connections
>
> Ans: I checked the HBase logs from the moment my application broke. To me
> it looked like the JVM went into garbage collection and never came back,
> which resulted in the exception. Is my interpretation correct? Kindly let
> me know.
>
> Here is the complete log:
>
> 2015-06-01 19:59:53,808 INFO [pool-55-thread-1] master.HMaster: Master has
> completed initialization
>
> 2015-06-01 19:59:53,808 INFO [main-EventThread] zookeeper.ClientCnxn:
> EventThread shut down
>
> 2015-06-01 20:00:46,431 INFO [JvmPauseMonitor] util.JvmPauseMonitor:
> Detected pause in JVM or host machine (eg GC): pause of approximately
> 6885ms
> GC pool 'ParNew' had collection(s): count=1 time=7383ms
>
> 2015-06-01 20:00:46,431 INFO [JvmPauseMonitor] util.JvmPauseMonitor:
> Detected pause in JVM or host machine (eg GC): pause of approximately
> 6886ms
> GC pool 'ParNew' had collection(s): count=1 time=7383ms
>
> 2015-06-01 20:00:47,032 WARN [M:0;hadoop2:35923.oldLogCleaner]
> cleaner.CleanerChore: A file cleanerM:0;hadoop2:35923.oldLogCleaner is
> stopped, won't delete any more files
> in:file:/home/hadoop/hbaseDataDir/oldWALs
>
> 2015-06-01 20:02:05,148 WARN [M:0;hadoop2:35923.oldLogCleaner]
> util.Sleeper: We slept 78116ms instead of 60000ms, this is likely due to a
> long garbage collecting pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>
> 2015-06-01 20:02:05,148 WARN [M:0;hadoop2:35923.archivedHFileCleaner]
> util.Sleeper: We slept 78122ms instead of 60000ms, this is likely due to a
> long garbage collecting pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>
> 2015-06-01 20:02:05,149 WARN
> [hadoop2,35923,1432909409923-ClusterStatusChore] util.Sleeper: We slept
> 78128ms instead of 60000ms, this is likely due to a long garbage collecting
> pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>
> 2015-06-01 20:02:05,149 WARN [RS:0;hadoop2:40129] util.Sleeper: We slept
> 39687ms instead of 3000ms, this is likely due to a long garbage collecting
> pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>
> 2015-06-01 20:02:05,151 WARN [JvmPauseMonitor] util.JvmPauseMonitor:
> Detected pause in JVM or host machine (eg GC): pause of approximately
> 39206ms
> GC pool 'ParNew' had collection(s): count=1 time=39328ms
>
> 2015-06-01 20:02:05,151 WARN [M:0;hadoop2:35923] util.Sleeper: We slept
> 39345ms instead of 100ms, this is likely due to a long garbage collecting
> pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>
> 2015-06-01 20:02:05,151 WARN [JvmPauseMonitor] util.JvmPauseMonitor:
> Detected pause in JVM or host machine (eg GC): pause of approximately
> 39205ms
> GC pool 'ParNew' had collection(s): count=1 time=39328ms
>
> 2015-06-01 20:02:05,151 INFO [SessionTracker] server.ZooKeeperServer:
> Expiring session 0x14da00e69e00001, timeout of 40000ms exceeded
>
> 2015-06-01 20:02:05,151 INFO [RS:0;hadoop2:40129-SendThread(hadoop2:2181)]
> zookeeper.ClientCnxn: Client session timed out, have not heard from server
> in 52055ms for sessionid 0x14da00e69e00001, closing socket connection and
> attempting reconnect
>
> 2015-06-01 20:02:05,151 INFO [RS:0;hadoop2:40129-SendThread(hadoop2:2181)]
> zookeeper.ClientCnxn: Client session timed out, have not heard from server
> in 52053ms for sessionid 0x14da00e69e00004, closing socket connection and
> attempting reconnect
>
> 2015-06-01 20:02:05,151 WARN
> [hadoop2,35923,1432909409923.splitLogManagerTimeoutMonitor] util.Sleeper:
> We slept 39713ms instead of 1000ms, this is likely due to a long garbage
> collecting pause and it's usually bad, see
> http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>
> 2015-06-01 20:02:05,155 WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
> server.NIOServerCnxn: caught end of stream exception
> EndOfStreamException: Unable to read additional data from client sessionid
> 0x14da00e69e00001, likely client has closed socket
>     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
>     at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
>     at java.lang.Thread.run(Thread.java:745)
>
> On Tue, Jun 2, 2015 at 12:45 AM, jeevi tesh <[email protected]>
> wrote:
>
> > Hi,
> > I have run into this issue several times but am still not able to
> > resolve it. Kindly help me in this regard.
> > I have written a crawler which keeps running for several days. After
> > 4 days of continuous interaction between the database and my
> > application, the database fails to respond. I'm not able to figure out
> > where things can suddenly go wrong after 4 days of proper running.
> > My configuration: HBase 0.96.2, single server, JDK 1.7.
> >
> > The issue is the following error:
> > WARN [http-bio-8080-exec-4-SendThread(hadoop2:2181)] zookeeper.ClientCnxn
> > (ClientCnxn.java:run(1089)) - Session 0x14da00e69e001ad for server null,
> > unexpected error, closing socket connection and attempting reconnect
> > java.net.ConnectException: Connection refused
> > If this exception happens, the only solution I have is to restart HBase,
> > which is not viable because it will corrupt my system data.
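
P.S. In case it helps with trying the G1GC suggestion at the top of this reply, here is a rough sketch of what the relevant lines in conf/hbase-env.sh might look like. The pause target and the GC log path are assumed placeholder values, not something taken from this thread, and would need adjusting to the heap and memory actually available on the box.

```shell
# conf/hbase-env.sh (sketch only; the values below are assumptions)

# Use G1 for the HBase JVMs (needs JDK 1.7.0_60 or newer).
# If the file already sets -XX:+UseConcMarkSweepGC, replace that line
# rather than appending, since the JVM rejects conflicting collector flags.
# The 200 ms pause target is only an illustrative starting point.
export HBASE_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=200"

# Write a GC log so long pauses can be compared with the ZooKeeper
# session timeout (40000ms in the log above). The path is a placeholder.
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/hbase-gc.log"
```

HBase has to be restarted for the new options to take effect, and the GC log should then show whether pauses stay below the session timeout.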
