Hello Flavio,

I am using the 'zkServer.sh start' command to start the zookeeper nodes. I can also see logs in the log folders I have specified, but they are in a form that is difficult to understand.

Also, regarding the 6 zookeeper nodes (3+3): is this setup fine for handling failures as per the 50% rule, i.e. should my cluster keep working if 3 nodes are down, or should I move to an odd number such as 5 or 7?

Regards.

On Tue, Sep 16, 2014 at 4:26 AM, Flavio Junqueira <[email protected]> wrote:

> Instead of guessing, I think it is best if we understand what's going
> wrong with the servers, you need to look at the server logs. If you don't
> know how to get it, could you please share the command you're using to
> start servers?
>
> -Flavio
>
> On Monday, September 15, 2014 3:30 PM, lalit jangra <[email protected]> wrote:
>
> >Hello Flavio,
> >
> >Can this issue arise from system not having enough RAM for Java Heap as i
> >could see my system is running on top of its RAM?
> >
> >Also is there any way to assign memory to zookeeper nodes?
> >
> >Regards.
> >
> >On Mon, Sep 15, 2014 at 7:37 PM, lalit jangra <[email protected]>
> >wrote:
> >
> >> Thanks Flavio,
> >>
> >> I am having 3+3 zookeeper nodes on two servers MCF1 & MCF2. Also i could
> >> see same error on both nodes. For logs into servers, i am not able to read
> >> anything from these, how can i read and interpret from zookeeper servers
> >> what is wrong?
> >>
> >> I have put different log & data directories for each of zookeeper, may be
> >> i should elaborate a bit more. I am deciding on names of logs & data
> >> directory as per myid (ranging from 1 to 6).
> >>
> >> ZK1 -> Data.1 -> Logs.1
> >> ZK2 -> Data.2 -> Logs.2
> >> ZK3 -> Data.3 -> Logs.3
> >> ZK4 -> Data.4 -> Logs.4
> >> ZK5 -> Data.5 -> Logs.5
> >> ZK6 -> Data.6 -> Logs.6
> >>
> >> As i have two servers only and i need to make it running on these two only
> >> so i chose this architecture. Also i am trying to make even for scenario
> >> where one node is down, i have only 3 zookeepers down so still second is
> >> working. If i have odd numbers say 5 or 7, if server with more numbers of
> >> zookeeper is down, its gone.
> >>
> >> Regards.
> >>
> >> On Mon, Sep 15, 2014 at 7:29 PM, Flavio Junqueira <[email protected]> wrote:
> >>
> >>> I believe you have shared just the client-side errors, and I was
> >>> wondering what's going on with the servers. One problem I could spot with
> >>> the configuration is with the values of dataDir and dataLogDir. It looks
> >>> like the processes on the same node are writing to the same directory,
> >>> which should be confusing the servers.
> >>>
> >>> A couple of things about your setting. I'm not sure what your motivation
> >>> is to put multiple servers on the same node. It will induce correlated
> >>> crashes for the servers on the same node. Also, we in general recommend to
> >>> use an odd number of servers (5 or 7 for your case).
> >>>
> >>> -Flavio
> >>>
> >>> On Wednesday, September 10, 2014 6:29 AM, lalit jangra <[email protected]> wrote:
> >>>
> >>> >Hi,
> >>> >
> >>> >I am running cluster of two Apache ManifoldCF nodes on two separate
> >>> >machines each of which having 3 zookeeper instances (total 6 instances in
> >>> >cluster). When i am running up manifoldCF agents, i see below warning
> >>> >during startup.
> >>> >
> >>> >[http-bio-80-exec-2-SendThread(iwdc1preecma03.iwater.ie:2181)] INFO
> >>> >org.apache.zookeeper.ClientCnxn - Unable to read additional data from
> >>> >server sessionid 0x0, likely server has closed socket, closing socket
> >>> >connection and attempting reconnect
> >>> >
> >>> >[http-bio-80-exec-2-SendThread(iwdc2preecma04.iwater.ie:2182)] INFO
> >>> >org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> >>> >iwdc2preecma04.iwater.ie/10.231.72.25:2182. Will not attempt to
> >>> >authenticate using SASL (unknown error)
> >>> >
> >>> >Also i could see below error in logs in while agents are running.
> >>> >
> >>> >[localhost-startStop-1-SendThread(iwdc1preecma03.iwater.ie:2183)] WARN
> >>> >org.apache.zookeeper.ClientCnxn - Session 0x6485a8006060079 for server
> >>> >iwdc1preecma03.iwater.ie/10.231.72.24:2183, unexpected error, closing
> >>> >socket connection and attempting reconnect
> >>> >
> >>> >java.io.IOException: Connection reset by peer
> >>> > at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> >>> > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> >>> > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:225)
> >>> > at sun.nio.ch.IOUtil.read(IOUtil.java:193)
> >>> > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
> >>> > at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68)
> >>> > at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> >>> > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
> >>> >
> >>> >Below are configurations for 1. zookeeper nodes & 2. MCF nodes for
> >>> >zookeeper.
> >>> >
> >>> >*zoo.cfg : Same for all six zookeeper nodes.*
> >>> >
> >>> ># The number of milliseconds of each tick
> >>> >tickTime=2000
> >>> >dataDir=/app/IW/zookeeper/data/data.1
> >>> >dataLogDir=/app/IW/zookeeper/logs/log.1
> >>> >clientPort=2181
> >>> >server.1=iwdc1preecma03:2888:3888
> >>> >server.2=iwdc1preecma03:2889:3889
> >>> >server.3=iwdc1preecma03:2890:3890
> >>> >server.4=iwdc2preecma04:2891:3891
> >>> >server.5=iwdc2preecma04:2892:3892
> >>> >server.6=iwdc2preecma04:2893:3893
> >>> ># The number of ticks that the initial
> >>> ># synchronization phase can take
> >>> >initLimit=10
> >>> ># The number of ticks that can pass between
> >>> ># sending a request and getting an acknowledgement
> >>> >syncLimit=5
> >>> ># the directory where the snapshot is stored.
> >>> ># do not use /tmp for storage, /tmp here is just
> >>> ># example sakes.
> >>> >#dataDir=/tmp/zookeeper
> >>> ># the port at which the clients will connect
> >>> >#clientPort=2181
> >>> ># the maximum number of client connections.
> >>> ># increase this if you need to handle more clients
> >>> >#maxClientCnxns=60
> >>> >#
> >>> ># Be sure to read the maintenance section of the
> >>> ># administrator guide before turning on autopurge.
> >>> >#
> >>> ># http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> >>> >#
> >>> ># The number of snapshots to retain in dataDir
> >>> >autopurge.snapRetainCount=3
> >>> ># Purge task interval in hours
> >>> ># Set to "0" to disable auto purge feature
> >>> >autopurge.purgeInterval=1
> >>> >
> >>> >*ManifoldCF configurations : same for both ManifoldCF nodes.*
> >>> >
> >>> ><property name="org.apache.manifoldcf.lockmanagerclass"
> >>> >value="org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager"/>
> >>> >
> >>> ><property name="org.apache.manifoldcf.zookeeper.connectstring"
> >>> >value="iwdc1preecma03:2181,iwdc1preecma03:2182,iwdc1preecma03:2183,iwdc2preecma04:2181,iwdc2preecma04:2182,iwdc2preecma04:2183"/>
> >>> >
> >>> ><property name="org.apache.manifoldcf.zookeeper.sessiontimeout"
> >>> >value="4000"/>
> >>> >
> >>> >*I want to know if due to above warnings/errors, will zookeeper stop
> >>> >working or will zookeeper will work and these are non-failing messages,
> >>> >because ManifoldCF jobs are stuck while i can see these errors.*
> >>> >
> >>> >Please suggest.
> >>> >
> >>> >Regards,
> >>> >Lalit.
> >>
> >> --
> >> Regards,
> >> Lalit.
> >
> >--
> >Regards,
> >Lalit.

--
Regards,
Lalit.
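
For reference, the 3+3 versus 5 or 7 question above comes down to majority arithmetic: a ZooKeeper ensemble keeps serving only while a strict majority of its configured servers is up. Below is a minimal sketch of that arithmetic in Python. The helper names (quorum_size, tolerated_failures, survives_host_loss) and the two-host layouts are made up for illustration and are not part of ZooKeeper; the host names MCF1 and MCF2 are taken from the thread.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble of the given size."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


def survives_host_loss(instances_per_host: dict, failed_host: str) -> bool:
    """True if the ensemble still has a majority after one host goes down.

    instances_per_host maps a host name to the number of ZooKeeper
    instances running on it (hypothetical layout, for illustration only).
    """
    total = sum(instances_per_host.values())
    remaining = total - instances_per_host[failed_host]
    return remaining >= quorum_size(total)


if __name__ == "__main__":
    # Compare ensemble sizes: quorum needed and failures tolerated.
    for n in (3, 5, 6, 7):
        print(n, "servers: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "down")

    # Two-host layouts: which single-host failures leave a majority?
    for layout in ({"MCF1": 3, "MCF2": 3}, {"MCF1": 3, "MCF2": 2}):
        for host in layout:
            print(layout, "lose", host, "->",
                  "quorum kept" if survives_host_loss(layout, host)
                  else "quorum lost")
```

Under these assumptions, 6 servers need 4 for a quorum, so 6 tolerates the same 2 failures as 5 does, and a 3+3 split loses quorum when either host goes down. With only two hosts, any split leaves at least one host whose loss takes the ensemble below a majority; an odd total only changes which host that is.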
