Hi Martin,

how many ZooKeepers do you have? Is your transaction log on a dedicated disk? Approximately how many clients are connecting?

Have a look at http://zookeeper.apache.org/doc/r3.2.2/zookeeperAdmin.html#sc_bestPractices

Tomas

On 27 April 2015 at 10:58, Martin Stiborský <[email protected]> wrote:

> Hello guys,
> we are running a Mesos stack on CoreOS, with three ZooKeeper nodes.
>
> We can start Docker containers with Marathon and all, that's fine, but
> some of the Docker containers generate high network load while
> communicating between nodes/containers, and I think that's the reason why
> ZooKeeper is failing.
> From the logs, I can see this error:
>
> Apr 27 05:06:15 epsp02.dc.vendavo.com systemd[1]: Stopping Zookeper
> server...
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: 2015-04-27
> 05:06:45,705 [myid:1] - WARN [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream
> exception
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: EndOfStreamException:
> Unable to read additional data from client sessionid 0x14cf73508730003,
> likely client has closed socket
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: at
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: at
> java.lang.Thread.run(Thread.java:745)
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: 2015-04-27
> 05:06:45,707 [myid:1] - INFO [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connec
> tion for client /10.60.11.82:58082 which had sessionid 0x14cf73508730003
>
> And then all ZK nodes go down... Mesos fails as well and that's it. The
> cluster eventually does recover, but the running tasks are gone, not finished.
>
> I have to say I don't have proper monitoring in place yet (I'm working on it
> right now), so I can't rely on real data to prove this assumption; it's
> only my guess.
> So if you can confirm that this makes sense, or share your experiences
> with me, that would be pretty valuable for me right now.
>
> Thanks a lot!
>
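
Since the thread mentions that no monitoring is in place yet, here is a minimal sketch of a health probe that could help confirm or rule out the network-load guess. It assumes the ZooKeeper four-letter-word commands `ruok` and `mntr` are reachable on the client port 2181 (`mntr` exists from ZooKeeper 3.4.0 on, and newer releases may require the commands to be whitelisted). The host names in the usage part are hypothetical placeholders, not the hosts from this thread.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            # The server closes the connection once the reply is complete.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(reply):
    """Turn tab-separated 'mntr' output into a dict of metric name -> value string."""
    metrics = {}
    for line in reply.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines that are not "name<TAB>value"
            metrics[key.strip()] = value.strip()
    return metrics


def is_healthy(host, port=2181):
    """Return True if the server answers 'ruok' with 'imok', False otherwise."""
    try:
        return four_letter_word(host, port, "ruok").strip() == "imok"
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


if __name__ == "__main__":
    # Hypothetical host names; replace with the real ensemble members.
    for host in ["zk1.example.internal", "zk2.example.internal", "zk3.example.internal"]:
        if not is_healthy(host):
            print(host, "is not answering ruok")
            continue
        metrics = parse_mntr(four_letter_word(host, 2181, "mntr"))
        print(host,
              metrics.get("zk_server_state"),
              "avg latency:", metrics.get("zk_avg_latency"),
              "outstanding:", metrics.get("zk_outstanding_requests"))
```

Running this periodically and logging the output next to the network load of the Docker hosts would show whether latency and outstanding requests climb before the ensemble goes down. It is only a sketch; a proper monitoring setup would collect the same `mntr` values into whatever metrics system is being rolled out.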

