Hi Lalit,

ZooKeeper will keep working, but you are dropping connections to your ZooKeeper members for unknown reasons, and your crawl stalls whenever that happens. This suggests some kind of network flakiness.

Karl

On Mon, Sep 15, 2014 at 8:59 AM, lalit jangra <[email protected]> wrote:

> Hi,
>
> I am running cluster of two Apache ManifoldCF nodes on two separate
> machines each of which having 3 zookeeper instances (total 6 instances in
> cluster). When i am running up manifoldCF agents, i see below warning
> during startup.
>
> [http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
> org.apache.zookeeper.ClientCnxn - Unable to read additional data from
> server sessionid 0x0, likely server has closed socket, closing socket
> connection and attempting reconnect
>
> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
> authenticate using SASL (unknown error)
>
> Also i could see below error in logs in while agents are running.
>
> [http-bio-80-exec-2] INFO org.apache.zookeeper.ZooKeeper - Initiating
> client connection,
> connectString=iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183
> sessionTimeout=4000
> watcher=org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection$ZooKeeperWatcher@51d83fd7
>
> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
> authenticate using SASL (unknown error)
>
> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
> org.apache.zookeeper.ClientCnxn - Socket connection established to
> iwdc2preecma04.iwater.ie/10.231.72.25:2182, initiating session
>
> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] WARN
> org.apache.zookeeper.ClientCnxn - Session 0x0 for server
> iwdc2preecma04.iwater.ie/10.231.72.25:2182, unexpected error, closing
> socket connection and attempting reconnect
>
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:193)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>
> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> iwdc2preecma04.iwater.ie/10.231.72.25:2183. Will not attempt to
> authenticate using SASL (unknown error)
>
> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
> org.apache.zookeeper.ClientCnxn - Socket connection established to
> iwdc2preecma04.iwater.ie/10.231.72.25:2183, initiating session
>
> [http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2183)] INFO
> org.apache.zookeeper.ClientCnxn - Session establishment complete on server
> iwdc2preecma04.iwater.ie/10.231.72.25:2183, sessionid =
> 0x6487851bd330078, negotiated timeout = 4000
>
> Below are configurations for 1. zookeeper nodes & 2. MCF nodes for
> zookeeper.
>
> *zoo.cfg : Same for all six zookeeper nodes.*
>
> # The number of milliseconds of each tick
> tickTime=2000
> dataDir=/app/IW/zookeeper/data/data.1
> dataLogDir=/app/IW/zookeeper/logs/log.1
> clientPort=2181
> server.1=iwdc1preecma03:2888:3888
> server.2=iwdc1preecma03:2889:3889
> server.3=iwdc1preecma03:2890:3890
> server.4=iwdc2preecma04:2891:3891
> server.5=iwdc2preecma04:2892:3892
> server.6=iwdc2preecma04:2893:3893
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> # do not use /tmp for storage, /tmp here is just
> # example sakes.
> #dataDir=/tmp/zookeeper
> # the port at which the clients will connect
> #clientPort=2181
> # the maximum number of client connections.
> # increase this if you need to handle more clients
> #maxClientCnxns=60
> #
> # Be sure to read the maintenance section of the
> # administrator guide before turning on autopurge.
> #
> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> #
> # The number of snapshots to retain in dataDir
> autopurge.snapRetainCount=3
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> autopurge.purgeInterval=1
>
> *ManifoldCF configurations : same for both ManifoldCF nodes.*
>
> <property name="org.apache.manifoldcf.lockmanagerclass"
> value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
>
> <property name="org.apache.manifoldcf.zookeeper.connectstring"
> value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
>
> <property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
> value="4000"/>
>
> *I want to know if due to above warnings/errors, will zookeeper stop
> working or will zookeeper will work and these are non-failing messages,
> because ManifoldCF jobs are stuck while i can see these errors.*
>
> Please suggest.
>
> Regards,
> Lalit.
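
One way to look for the dropped connections described in the reply is to poll each member of the connect string with ZooKeeper's `ruok` four-letter command, which a healthy server answers with `imok`. The Python sketch below is only an illustration: the function names are hypothetical, it assumes every connect-string entry carries an explicit port, and newer ZooKeeper releases may require `ruok` to be enabled through `4lw.commands.whitelist`.

```python
import socket


def parse_connect_string(connect_string):
    """Split 'host1:port1,host2:port2' into a list of (host, port) tuples.

    Assumes every entry has an explicit port (hypothetical helper).
    """
    members = []
    for entry in connect_string.split(","):
        host, _, port = entry.strip().rpartition(":")
        members.append((host, int(port)))
    return members


def probe(host, port, timeout=2.0):
    """Send 'ruok' to one member; return the reply text, or None on failure.

    Refused, reset, and timed-out connections all surface as OSError,
    so they are reported uniformly as None.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
            return reply.decode("ascii", errors="replace") or None
    except OSError:
        return None


def check_ensemble(connect_string):
    """Map 'host:port' to True when that member answered 'imok'."""
    return {
        f"{host}:{port}": probe(host, port) == "imok"
        for host, port in parse_connect_string(connect_string)
    }
```

Calling `check_ensemble` with the connect string from the ManifoldCF properties, and repeating the call every few seconds, would show whether a particular member or port intermittently refuses or resets connections.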
