[ https://issues.apache.org/jira/browse/CONNECTORS-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135366#comment-14135366 ]

Karl Wright commented on CONNECTORS-1031:
-----------------------------------------

The stack trace from agents when ZooKeeper is in this state is identical to 
the one Erlend reported.

The connections appear to be going down because of latency of one form or 
another:

https://github.com/pongasoft/glu/issues/189

We could possibly make this happen less often with specific changes, e.g. a 
longer tick count or timeout.  However, the logs also contain the following:

{code}
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection from /0:0:0:0:0:0:0:1:59735
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Client attempting to renew session 0x1487e42da2d0008 at /0:0:0:0:0:0:0:1:59735
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Established session 0x1487e42da2d0008 with negotiated timeout 4000 for client /0:0:0:0:0:0:0:1:59735
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection from /127.0.0.1:59736
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Client attempting to renew session 0x1487e42da2d0020 at /127.0.0.1:59736
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Established session 0x1487e42da2d0020 with negotiated timeout 4000 for client /127.0.0.1:59736
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection from /127.0.0.1:59737
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Client attempting to renew session 0x1487e42da2d0006 at /127.0.0.1:59737
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Established session 0x1487e42da2d0006 with negotiated timeout 4000 for client /127.0.0.1:59737
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection from /127.0.0.1:59738
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Client attempting to renew session 0x1487e42da2d004c at /127.0.0.1:59738
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Established session 0x1487e42da2d004c with negotiated timeout 4000 for client /127.0.0.1:59738
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.NIOServerCnxnFactory - Accepted socket connection from /127.0.0.1:59739
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Client attempting to renew session 0x1487e42da2d0005 at /127.0.0.1:59739
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:8349] INFO org.apache.zookeeper.server.ZooKeeperServer - Established session 0x1487e42da2d0005 with negotiated timeout 4000 for client /127.0.0.1:59739
{code}

This tells me that connections are being re-established, but the subsequent 
hangs suggest that re-establishing a connection is not enough to reset things 
completely.
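
For reference, here is a rough sketch of how a "negotiated timeout" such as the 
4000 in the log above can come about.  ZooKeeper's documentation describes the 
server clamping the timeout a client asks for to the range between 
minSessionTimeout and maxSessionTimeout, which default to 2 and 20 times 
tickTime.  The class and method names, the tickTime values, and the requested 
timeout below are assumptions for illustration only; they are not taken from 
the ManifoldCF example configuration.

```java
// Sketch only: models the documented clamping rule, not ZooKeeper's actual code.
final class SessionTimeoutSketch {

    /** Clamp the client's requested timeout into [minMs, maxMs] (all in ms). */
    static int negotiate(int requestedMs, int minMs, int maxMs) {
        return Math.max(minMs, Math.min(requestedMs, maxMs));
    }

    /** Same rule, using the default bounds of 2 x tickTime and 20 x tickTime. */
    static int negotiateWithDefaults(int requestedMs, int tickTimeMs) {
        return negotiate(requestedMs, 2 * tickTimeMs, 20 * tickTimeMs);
    }

    public static void main(String[] args) {
        int requestedMs = 2000; // hypothetical client request

        // Assumed tickTime of 2000 ms: the lower bound of 4000 ms wins.
        System.out.println(negotiateWithDefaults(requestedMs, 2000)); // prints 4000

        // Assumed longer tickTime of 5000 ms: the lower bound rises to 10000 ms.
        System.out.println(negotiateWithDefaults(requestedMs, 5000)); // prints 10000
    }
}
```

Under these assumptions, a longer tick time raises the smallest session timeout 
the server will grant, which is one way latency spikes would become less likely 
to drop a session.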




> Update zookeeper parameters so that example server is stable for the long term
> ------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1031
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1031
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Framework core
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.0
>
>
> The zookeeper parameters we deliver are missing apparently important limits 
> on growth:
> autopurge.snapRetainCount=3 : default value
> autopurge.purgeInterval=1: default value



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
