First of all, thanks a lot for offering to help.

Here are my answers to the questions you asked:



How many ZooKeeper servers do you have? Or what is the number of clients
you have running per host?

Ans: I have only one Linux box, so this is a single-node system. HBase is
installed entirely on this one machine.



What is the configured value of maxClientCnxns in the ZooKeeper servers?

Ans: We are using the default configuration; we have not set any value for
it in hbase-site.xml.
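
For reference, when HBase manages its own ZooKeeper, the limit can be
overridden in hbase-site.xml; the property name is from the HBase
documentation, but the value shown is only an illustrative example, not a
recommendation:

```xml
<!-- Illustrative only: raises the per-host ZooKeeper connection limit.
     HBase forwards hbase.zookeeper.property.* settings to the embedded
     ZooKeeper server it starts. -->
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>300</value>
</property>
```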



Is the issue impacting clients only, or is it also impacting the
RegionServers?

Ans: In this case the RegionServer, the master, and the client are all on
the same machine, because we have installed HBase on a single system.


Have you looked into why the ZooKeeper server is no longer accepting
connections?

Ans: I have now checked the HBase logs at the moment my application broke.
It looks to me like the JVM went into a long garbage-collection pause and
never recovered, which resulted in the exception. Is my interpretation
correct? Kindly let me know.
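
Given the ParNew pauses of roughly 7 s and 39 s reported below, one common
mitigation (a sketch only, assuming HBase is started via the stock scripts
and that the machine has memory to spare; the heap size here is a
hypothetical value) is to enlarge the heap and turn on GC logging in
conf/hbase-env.sh so future pauses can be diagnosed:

```sh
# Hypothetical sizing -- tune to the machine's actual free memory.
export HBASE_HEAPSIZE=4G

# Use CMS (a low-pause collector on JDK 7) and log GC activity so long
# pauses show up in the GC log, not only via JvmPauseMonitor warnings.
export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -Xloggc:/tmp/hbase-gc.log"
```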

Here is the complete log:

2015-06-01 19:59:53,808 INFO  [pool-55-thread-1] master.HMaster: Master has
completed initialization

2015-06-01 19:59:53,808 INFO  [main-EventThread] zookeeper.ClientCnxn:
EventThread shut down

2015-06-01 20:00:46,431 INFO  [JvmPauseMonitor] util.JvmPauseMonitor:
Detected pause in JVM or host machine (eg GC): pause of approximately 6885ms

GC pool 'ParNew' had collection(s): count=1 time=7383ms

2015-06-01 20:00:46,431 INFO  [JvmPauseMonitor] util.JvmPauseMonitor:
Detected pause in JVM or host machine (eg GC): pause of approximately 6886ms

GC pool 'ParNew' had collection(s): count=1 time=7383ms

2015-06-01 20:00:47,032 WARN  [M:0;hadoop2:35923.oldLogCleaner]
cleaner.CleanerChore: A file cleanerM:0;hadoop2:35923.oldLogCleaner is
stopped, won't delete any more files
in:file:/home/hadoop/hbaseDataDir/oldWALs

2015-06-01 20:02:05,148 WARN  [M:0;hadoop2:35923.oldLogCleaner]
util.Sleeper: We slept 78116ms instead of 60000ms, this is likely due to a
long garbage collecting pause and it's usually bad, see
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired

2015-06-01 20:02:05,148 WARN  [M:0;hadoop2:35923.archivedHFileCleaner]
util.Sleeper: We slept 78122ms instead of 60000ms, this is likely due to a
long garbage collecting pause and it's usually bad, see
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired

2015-06-01 20:02:05,149 WARN
[hadoop2,35923,1432909409923-ClusterStatusChore] util.Sleeper: We slept
78128ms instead of 60000ms, this is likely due to a long garbage collecting
pause and it's usually bad, see
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired

2015-06-01 20:02:05,149 WARN  [RS:0;hadoop2:40129] util.Sleeper: We slept
39687ms instead of 3000ms, this is likely due to a long garbage collecting
pause and it's usually bad, see
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired

2015-06-01 20:02:05,151 WARN  [JvmPauseMonitor] util.JvmPauseMonitor:
Detected pause in JVM or host machine (eg GC): pause of approximately
39206ms

GC pool 'ParNew' had collection(s): count=1 time=39328ms

2015-06-01 20:02:05,151 WARN  [M:0;hadoop2:35923] util.Sleeper: We slept
39345ms instead of 100ms, this is likely due to a long garbage collecting
pause and it's usually bad, see
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired

2015-06-01 20:02:05,151 WARN  [JvmPauseMonitor] util.JvmPauseMonitor:
Detected pause in JVM or host machine (eg GC): pause of approximately
39205ms

GC pool 'ParNew' had collection(s): count=1 time=39328ms

2015-06-01 20:02:05,151 INFO  [SessionTracker] server.ZooKeeperServer:
Expiring session 0x14da00e69e00001, timeout of 40000ms exceeded

2015-06-01 20:02:05,151 INFO  [RS:0;hadoop2:40129-SendThread(hadoop2:2181)]
zookeeper.ClientCnxn: Client session timed out, have not heard from server
in 52055ms for sessionid 0x14da00e69e00001, closing socket connection and
attempting reconnect

2015-06-01 20:02:05,151 INFO  [RS:0;hadoop2:40129-SendThread(hadoop2:2181)]
zookeeper.ClientCnxn: Client session timed out, have not heard from server
in 52053ms for sessionid 0x14da00e69e00004, closing socket connection and
attempting reconnect

2015-06-01 20:02:05,151 WARN
[hadoop2,35923,1432909409923.splitLogManagerTimeoutMonitor] util.Sleeper:
We slept 39713ms instead of 1000ms, this is likely due to a long garbage
collecting pause and it's usually bad, see
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired

2015-06-01 20:02:05,155 WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181]
server.NIOServerCnxn: caught end of stream exception

EndOfStreamException: Unable to read additional data from client sessionid
0x14da00e69e00001, likely client has closed socket

          at
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)

          at
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)

          at java.lang.Thread.run(Thread.java:745)





On Tue, Jun 2, 2015 at 12:45 AM, jeevi tesh <[email protected]> wrote:

> Hi,
> I've run into this issue several times but am still not able to resolve
> it; kindly help me in this regard.
> I have written a crawler that keeps running for several days. After
> 4 days of continuous interaction between the database and my application,
> the database fails to respond. I'm not able to figure out what can
> suddenly go wrong after 4 days of proper running.
> My configuration: HBase 0.96.2 on a single server,
> JDK 1.7.
>
> The issue is the following error:
> WARN  [http-bio-8080-exec-4-SendThread(hadoop2:2181)] zookeeper.ClientCnxn
> (ClientCnxn.java:run(1089)) - Session 0x14da00e69e001ad for server null,
> unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> When this exception happens, the only solution I have is to restart
> HBase, which is not viable because it will corrupt my system data.
>