We are looking to use ZK as a discovery service. This ensemble will be read
from/written to by Apache Ignite's event bus. The event bus allows multiple
running processes to share data across nodes. This is all running in a K8s
cluster.

Our plan was to deploy a 5-7 instance ZK ensemble accessible via a fixed
service IP address. Every K8s pod that started up would be provided this IP
and thus would be able to register itself while discovering the other
processes on the event bus.

When we proposed this, there was great concern from the software architects
that network traffic between the Kubernetes pods and the ZK ensemble must
be minimized. As a result, they are requesting/requiring us to run a ZK
ensemble member on every node of our Kubernetes cluster. Given this input,
we changed plans such that for each Kubernetes pod that gets started, a new
Observer instance (running as a side-car container) will dynamically join
the "core" ensemble. This gives the primary container localhost access to
ZK.
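
To make the plan concrete, here is a rough sketch of what the ensemble
membership list would look like as pods come up. It only formats entries in
the ZooKeeper 3.5+ dynamic configuration style
(server.<id>=<host>:<peerPort>:<electionPort>:<role>;<clientPort>). The host
names, the counts, and the helper functions are made up for illustration and
are not our actual deployment.

```python
# Sketch only: host names, counts, and helper names are hypothetical.

def server_line(server_id, host, role,
                peer_port=2888, election_port=3888, client_port=2181):
    """Format one ZooKeeper 3.5+ dynamic-config entry."""
    return f"server.{server_id}={host}:{peer_port}:{election_port}:{role};{client_port}"


def ensemble_config(core_hosts, observer_hosts):
    """List the voting "core" members first, then one observer per pod."""
    lines = []
    next_id = 1
    for host in core_hosts:
        lines.append(server_line(next_id, host, "participant"))
        next_id += 1
    for host in observer_hosts:
        lines.append(server_line(next_id, host, "observer"))
        next_id += 1
    return lines


core = [f"zk-core-{i}" for i in range(5)]   # assumed 5-member core ensemble
observers = [f"pod-{i}" for i in range(3)]  # assumed side-car observers
config = ensemble_config(core, observers)
print("\n".join(config))
```

In the real cluster the observer list would be hundreds of entries long and
would change every time a pod starts or stops.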

This means that we would be running at least one ZK ensemble member on
every node of our K8s cluster, and we intend to have at least several
hundred nodes. Our concern is that ZK does not seem like it was intended
to horizontally scale in this fashion. Beyond that, the frequency with
which ensemble members would be joining/leaving the ensemble is unknown.

My question is:
What is the maximum number of ZK ensemble members that can be run within a
single ensemble, with consideration that most of those members will be
observers? What kinds of problems might this many members cause?

Thanks for your feedback,
Jay
