Hi Martin,

How many ZooKeeper servers do you have? Is your transaction log on a
dedicated disk? Approximately how many clients are connecting?

Have a look at
http://zookeeper.apache.org/doc/r3.2.2/zookeeperAdmin.html#sc_bestPractices
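
Until proper monitoring is in place, a rough way to answer the client-count
question above is to ask each server directly with the four-letter-word
commands ("ruok", "mntr"). The following is only a minimal sketch, assuming
ZooKeeper 3.4 or later (where "mntr" exists) and that the client port 2181
is reachable from where you run it; newer releases may also require the
command to be whitelisted. The helper names are hypothetical, not part of
any ZooKeeper client library.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a ZooKeeper four-letter command (e.g. "ruok", "mntr").

    Returns the server's reply as text. The server closes the
    connection once it has answered, so we read until EOF.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Parse "key<TAB>value" lines from an mntr reply into a dict of strings.

    Lines without a tab separator are ignored.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            stats[key.strip()] = value.strip()
    return stats
```

For example, parse_mntr(four_letter_word("epsp02.dc.vendavo.com", 2181, "mntr"))
should give a dict in which zk_num_alive_connections, zk_avg_latency and
zk_outstanding_requests are the values worth watching while the containers
generate load.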

Tomas

On 27 April 2015 at 10:58, Martin Stiborský <[email protected]>
wrote:

> Hello guys,
> we are running a Mesos stack on CoreOS, with three ZooKeeper nodes.
>
> We can start Docker containers with Marathon and all, that's fine, but
> some of the Docker containers generate high network load while
> communicating between nodes/containers, and I think that's the reason why
> ZooKeeper is failing.
> From the logs, I can see this error:
>
> Apr 27 05:06:15 epsp02.dc.vendavo.com systemd[1]: Stopping Zookeper
> server...
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: 2015-04-27
> 05:06:45,705 [myid:1] - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream
> exception
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: EndOfStreamException:
> Unable to read additional data from client sessionid 0x14cf73508730003,
> likely client has closed socket
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: at
> org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: at
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: at
> java.lang.Thread.run(Thread.java:745)
> Apr 27 05:06:45 epsp02.dc.vendavo.com docker[1155]: 2015-04-27
> 05:06:45,707 [myid:1] - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection
> for client /10.60.11.82:58082 which had sessionid 0x14cf73508730003
>
> And then all ZK nodes go down… Mesos fails as well and that's it. The
> cluster eventually does recover, but the running tasks are gone, not finished.
>
> I have to say I don't have proper monitoring in place yet (I'm working on
> it right now), so I can't rely on real data to prove this assumption, but
> it's my guess.
> So if you can confirm that this makes sense, or share your experiences
> with me, that would be pretty valuable for me right now.
>
> Thanks a lot!
>
