Hello,

I work for the Stratosphere project [1] (think of it as an extended Hadoop/MapReduce system), and I am currently evaluating whether it makes sense to use ZooKeeper to implement some functionality. We are first considering it for counters (Hadoop-like counters), and later for coordination, e.g. of iterative tasks.
We do not want to require our customers to deploy ZK manually, so my question is whether there is a best practice for integrating ZK into another (Java) application.

I saw that Giraph ships the ZooKeeper jar file and, on startup, creates a config file and then starts a JVM using ProcessBuilder [2]. It also allows you to use an existing ZooKeeper installation. Do you think this is a good approach? Are there others (e.g. running ZK in our existing JVM)?

FYI: Stratosphere is a master-worker system. Initially we would start ZK on the master only; later we could start it on multiple nodes to ensure fault tolerance. As far as I understand, the number of nodes should depend on the read/write behaviour and on the desired level of fault tolerance, but a single node already provides full functionality.

Thank you for any hints,
André Hacker

[1] http://stratosphere.eu/
[2] https://apache.googlesource.com/giraph/+/old-move-to-tlp/src/main/java/org/apache/giraph/zk/ZooKeeperManager.java
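
P.S. To make the "generate a config file, then start a separate JVM" option more concrete, here is a minimal sketch of how I picture it. This is not Giraph's actual code. The class name EmbeddedZkLauncher, the helper methods, the file names zoo.cfg and zookeeper.log, and the chosen tickTime and port values are all hypothetical. The only ZooKeeper-specific pieces are the standard config keys (tickTime, dataDir, clientPort) and the QuorumPeerMain entry point, which to my understanding runs in standalone mode when the config lists no other servers.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: writes a standalone ZooKeeper config and launches
// ZooKeeper in a child JVM. All names here are illustrative assumptions.
class EmbeddedZkLauncher {

    // Renders a minimal single-node configuration.
    static String renderConfig(Path dataDir, int clientPort) {
        // Forward slashes keep the path safe in a properties-style file.
        String dir = dataDir.toAbsolutePath().toString().replace('\\', '/');
        return "tickTime=2000\n"
             + "dataDir=" + dir + "\n"
             + "clientPort=" + clientPort + "\n";
    }

    // Creates <workDir>/data and writes <workDir>/zoo.cfg; returns the config path.
    static Path writeConfig(Path workDir, int clientPort) throws IOException {
        Path dataDir = Files.createDirectories(workDir.resolve("data"));
        Path cfg = workDir.resolve("zoo.cfg");
        Files.write(cfg, renderConfig(dataDir, clientPort).getBytes(StandardCharsets.UTF_8));
        return cfg;
    }

    // Builds the child-JVM command line. zkClasspath is assumed to contain
    // the ZooKeeper jar and its dependencies.
    static List<String> buildCommand(String zkClasspath, Path cfg) {
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        return Arrays.asList(
            javaBin, "-cp", zkClasspath,
            "org.apache.zookeeper.server.quorum.QuorumPeerMain",
            cfg.toString());
    }

    // Writes the config and starts the child process, logging to zookeeper.log.
    static Process start(String zkClasspath, Path workDir, int clientPort) throws IOException {
        Path cfg = writeConfig(workDir, clientPort);
        return new ProcessBuilder(buildCommand(zkClasspath, cfg))
            .directory(workDir.toFile())
            .redirectErrorStream(true)
            .redirectOutput(workDir.resolve("zookeeper.log").toFile())
            .start();
    }
}
```

The master would call something like EmbeddedZkLauncher.start(classpath, workDir, 2181) on startup and destroy the returned Process on shutdown. Supporting an existing ZooKeeper installation would then just mean skipping this step and using the user-supplied host and port.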
