[ 
https://issues.apache.org/jira/browse/CONNECTORS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908463#comment-13908463
 ] 

Graeme Seaton commented on CONNECTORS-898:
------------------------------------------

Karl,

Quickly tested today (will thrash and bash more over the next few days) and it 
all works perfectly so far.  Excellent.

ZK connections had already been set to 500 as that was one of the first issues 
I hit when deploying on our new cluster.

Do you anticipate any more schema changes as part of this cycle?

> Agents fail to start if ZK ensemble member missing
> --------------------------------------------------
>
>                 Key: CONNECTORS-898
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-898
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Framework agents process
>    Affects Versions: ManifoldCF 1.5
>         Environment: 4 Agents
> 3 member ZK ensemble (2 live, 1 dead)
>            Reporter: Graeme Seaton
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.6
>
>
> If a member of the ZK ensemble is down but a majority of members is still 
> active, so that ZK is 'live', then when the agents start up, any agents 
> that try to connect to the missing member abort with:
> Opening socket connection to server overlorddev03/10.250.0.36:2181. Will not attempt to authenticate using SASL (unknown error)
> 71 [main-SendThread(overlorddev03:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> followed by:
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Initialization failed: KeeperErrorCode = ConnectionLoss for /org.apache.manifoldcf.configuration
>         at org.apache.manifoldcf.core.system.ManifoldCF.initializeEnvironment(ManifoldCF.java:269)
>         at org.apache.manifoldcf.agents.system.ManifoldCF.initializeEnvironment(ManifoldCF.java:43)
>         at org.apache.manifoldcf.agents.BaseAgentsInitializationCommand.execute(BaseAgentsInitializationCommand.java:36)
>         at org.apache.manifoldcf.agents.AgentRun.main(AgentRun.java:93)
> This has a knock-on effect on the other agents, which then eventually fail 
> with 'agents process could not start - shutting down'.  
> Besides exceptions of this type:
> 5401 [main-SendThread(overlorddev03:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server overlorddev03/10.250.0.36:2181. Will not attempt to authenticate using SASL (unknown error)
> 5403 [main-SendThread(overlorddev03:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> 5506 [main-SendThread(overlorddev04:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server overlorddev04/10.250.0.46:2181. Will not attempt to authenticate using SASL (unknown error)
> 5507 [main-SendThread(overlorddev04:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to overlorddev04/10.250.0.46:2181, initiating session
> the only other notable exception is:
> 5509 [main-SendThread(overlorddev04:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server overlorddev04/10.250.0.46:2181, sessionid = 0x4444f2cb0590087, negotiated timeout = 8000
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: KeeperErrorCode = ConnectionLoss for /org.apache.manifoldcf.flags-_AGENTRUN_
>         at org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.checkGlobalFlag(ZooKeeperConnection.java:499)
>         at org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager.checkGlobalFlag(ZooKeeperLockManager.java:787)
>         at org.apache.manifoldcf.agents.system.AgentsDaemon.runAgents(AgentsDaemon.java:110)
>         at org.apache.manifoldcf.agents.AgentRun.doExecute(AgentRun.java:64)
>         at org.apache.manifoldcf.agents.BaseAgentsInitializationCommand.execute(BaseAgentsInitializationCommand.java:37)
>         at org.apache.manifoldcf.agents.AgentRun.main(AgentRun.java:93)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
