Distributed systems are hard and, most importantly, no two of them are
alike.

>  I feel the zookeeper is almost unstable for a cluster.

This is too general and vague a statement to be either true or false (or
to provide any guidance): it all depends on how you deploy your ensemble,
what hardware it runs on, what virtualization layer you use, and how you
manage failover and recovery.

But, far more importantly, it all depends on *your* requirements: a
configuration that works perfectly fine for a few hundred nodes
distributed across 2-3 DCs in a geographically "contained" region (e.g.,
North America) would be woefully inadequate for a system running across 6
global DCs, covering several thousand nodes, with tight latency
requirements.

Outside of Google (where we would use our "own stuff": Borg, Chubby &
friends) I've never really had any trouble with ZK; then again, maybe the
stuff I worked on was nowhere near as complex as what you're trying to
achieve.

My suggestion would be to try it out in a staging environment, run some
performance and stress tests, and find out whether the performance,
stability and availability of the ZK ensemble (and, consequently, of the
Mesos cluster) meet your requirements.
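
To make "meet your requirements" concrete, here is a rough sketch of the
kind of harness I have in mind. Everything in it is hypothetical: the
names `stress` and `meets_requirements`, the thresholds, and the stand-in
operation are mine, not part of ZK or Mesos. In a real test `op` would
wrap a call made through whatever ZK client library you use (for example
a read of a znode), pointed at your staging ensemble.

```python
import time
from typing import Callable, Dict


def stress(op: Callable[[], None], requests: int) -> Dict[str, float]:
    """Call `op` sequentially `requests` times; summarize latency and failures."""
    latencies_ms = []
    failures = 0
    for _ in range(requests):
        start = time.perf_counter()
        try:
            op()
        except Exception:
            # Any exception counts as a failed request; failed calls are
            # excluded from the latency figures.
            failures += 1
            continue
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    latencies_ms.sort()

    def percentile(p: float) -> float:
        # Nearest-rank style lookup on the sorted successful latencies.
        if not latencies_ms:
            return float("nan")
        index = min(len(latencies_ms) - 1, int(p * len(latencies_ms)))
        return latencies_ms[index]

    return {
        "requests": float(requests),
        "failures": float(failures),
        "error_rate": failures / requests if requests else 0.0,
        "p50_ms": percentile(0.50),
        "p99_ms": percentile(0.99),
        "max_ms": latencies_ms[-1] if latencies_ms else float("nan"),
    }


def meets_requirements(summary: Dict[str, float],
                       max_p99_ms: float,
                       max_error_rate: float) -> bool:
    """Compare a summary against *your* thresholds (assumed values)."""
    return (summary["p99_ms"] <= max_p99_ms
            and summary["error_rate"] <= max_error_rate)


if __name__ == "__main__":
    # Stand-in operation; replace with a real call against the ensemble.
    def fake_zk_read() -> None:
        time.sleep(0.001)

    result = stress(fake_zk_read, requests=200)
    print(result)
    print("OK" if meets_requirements(result, 50.0, 0.01) else "NOT OK")
```

This only issues requests from a single thread; to put real pressure on
the ensemble you would run many of these in parallel, and repeat the run
while killing or partitioning ZK nodes to see how failover behaves.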

Hope this helps.

*Marco Massenzio*
*Distributed Systems Engineer*

On Sun, Aug 2, 2015 at 10:15 AM, tommy xiao <[email protected]> wrote:

> today i reading  ZooKeeper Resilience at Pinterest (
> https://engineering.pinterest.com/blog/zookeeper-resilience-pinterest?route=/post/%3Aid/%3Asummary),
>  I feel the zookeeper is almost unstable for a cluster.
>
> Does anyone have some experience with the zookeeper usage?
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>
