tillrohrmann commented on a change in pull request #14254:
URL: https://github.com/apache/flink/pull/14254#discussion_r532466031
##########
File path: docs/deployment/ha/zookeeper_ha.md
##########
@@ -23,113 +23,104 @@ specific language governing permissions and
 limitations under the License.
 -->
-## ZooKeeper HA Services
+Flink's ZooKeeper HA services use [ZooKeeper](http://zookeeper.apache.org) for high availability services.
-One high availability services implementation uses ZooKeeper.
+* Toc
+{:toc}
-### Configuration
+Flink leverages **[ZooKeeper](http://zookeeper.apache.org)** for *distributed coordination* between all running JobManager instances.
+ZooKeeper is a separate service from Flink, which provides highly reliable distributed coordination via leader election and light-weight consistent state storage.
+Check out [ZooKeeper's Getting Started Guide](http://zookeeper.apache.org/doc/current/zookeeperStarted.html) for more information about ZooKeeper.
+Flink includes scripts to [bootstrap a simple ZooKeeper](#bootstrap-zookeeper) installation.
-To enable JobManager High Availability you have to set the **high-availability mode** to *zookeeper*, configure a **ZooKeeper quorum** and set up a **masters file** with all JobManagers hosts and their web UI ports.
+## Configuration
-Flink leverages **[ZooKeeper](http://zookeeper.apache.org)** for *distributed coordination* between all running JobManager instances. ZooKeeper is a separate service from Flink, which provides highly reliable distributed coordination via leader election and light-weight consistent state storage. Check out [ZooKeeper's Getting Started Guide](http://zookeeper.apache.org/doc/current/zookeeperStarted.html) for more information about ZooKeeper. Flink includes scripts to [bootstrap a simple ZooKeeper](#bootstrap-zookeeper) installation.
+In order to start an HA-cluster you have to configure the following configuration keys:
-#### Masters File (masters)
-
-In order to start an HA-cluster configure the *masters* file in `conf/masters`:
-
-- **masters file**: The *masters file* contains all hosts, on which JobManagers are started, and the ports to which the web user interface binds.
-
-  <pre>
-jobManagerAddress1:webUIPort1
-[...]
-jobManagerAddressX:webUIPortX
-  </pre>
-
-By default, the job manager will pick a *random port* for inter process communication. You can change this via the **`high-availability.jobmanager.port`** key. This key accepts single ports (e.g. `50010`), ranges (`50000-50025`), or a combination of both (`50010,50011,50020-50025,50050-50075`).
-
-#### Config File (flink-conf.yaml)
-
-In order to start an HA-cluster add the following configuration keys to `conf/flink-conf.yaml`:
-
-- **high-availability mode** (required): The *high-availability mode* has to be set in `conf/flink-conf.yaml` to *zookeeper* in order to enable high availability mode.
-Alternatively this option can be set to FQN of factory class Flink should use to create HighAvailabilityServices instance.
+- **high-availability mode** (required):
+The `high-availability` option has to be set to *zookeeper*.
   <pre>high-availability: zookeeper</pre>
-- **ZooKeeper quorum** (required): A *ZooKeeper quorum* is a replicated group of ZooKeeper servers, which provide the distributed coordination service.
+- **ZooKeeper quorum** (required):
+A *ZooKeeper quorum* is a replicated group of ZooKeeper servers, which provide the distributed coordination service.
   <pre>high-availability.zookeeper.quorum: address1:2181[,...],addressX:2181</pre>
   Each *addressX:port* refers to a ZooKeeper server, which is reachable by Flink at the given address and port.
-- **ZooKeeper root** (recommended): The *root ZooKeeper node*, under which all cluster nodes are placed.
+- **ZooKeeper root** (recommended):
+The *root ZooKeeper node*, under which all cluster nodes are placed.
-  <pre>high-availability.zookeeper.path.root: /flink
+  <pre>high-availability.zookeeper.path.root: /flink</pre>
-- **ZooKeeper cluster-id** (recommended): The *cluster-id ZooKeeper node*, under which all required coordination data for a cluster is placed.
+- **ZooKeeper cluster-id** (recommended):
+The *cluster-id ZooKeeper node*, under which all required coordination data for a cluster is placed.
   <pre>high-availability.cluster-id: /default_ns # important: customize per cluster</pre>
-  **Important**: You should not set this value manually when running a YARN
-  cluster, a per-job YARN session, or on another cluster manager. In those
-  cases a cluster-id is automatically being generated based on the application
-  id. Manually setting a cluster-id overrides this behaviour in YARN.
-  Specifying a cluster-id with the -z CLI option, in turn, overrides manual
-  configuration. If you are running multiple Flink HA clusters on bare metal,
-  you have to manually configure separate cluster-ids for each cluster.
+  **Important**:
+  You should not set this value manually when running on YARN, native Kubernetes or on another cluster manager.
+  In those cases a cluster-id is automatically being generated.

Review comment:
"being automatically generated" sounds better to me.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
