sijie closed pull request #682: Issue 681: Deploying bookkeeper in k8s using StatefulSets
URL: https://github.com/apache/bookkeeper/pull/682
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/deploy/kubernetes/gke/bookkeeper.statefulset.yml b/deploy/kubernetes/gke/bookkeeper.statefulset.yml
new file mode 100644
index 000000000..d72cb2cc8
--- /dev/null
+++ b/deploy/kubernetes/gke/bookkeeper.statefulset.yml
@@ -0,0 +1,174 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+## Credits to Franck Cuny : https://github.com/fcuny/distributedlog-on-k8s/blob/master/bookkeeper.statefulset.yaml
+
+kind: StorageClass
+apiVersion: storage.k8s.io/v1
+metadata:
+  name: ssd
+provisioner: kubernetes.io/gce-pd
+parameters:
+  type: pd-ssd
+
+---
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: bookie-config
+data:
+  BK_BOOKIE_EXTRA_OPTS: "\"-Xms1g -Xmx1g -XX:MaxDirectMemorySize=1g -XX:+UseG1GC -XX:MaxGCPauseMillis=10 -XX:+ParallelRefProcEnabled -XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis -XX:ParallelGCThreads=32 -XX:ConcGCThreads=32 -XX:G1NewSizePercent=50 -XX:+DisableExplicitGC -XX:-ResizePLAB\""
+  BK_bookiePort: "3181"
+  BK_journalDirectory: "/bookkeeper/data/journal"
+  BK_ledgerDirectories: "/bookkeeper/data/ledgers"
+  BK_indexDirectories: "/bookkeeper/data/ledgers"
+  BK_zkServers: zookeeper
+  # the default manager is flat, which is not good for supporting large number of ledgers
+  BK_ledgerManagerType: "hierarchical"
+  # TODO: Issue 458: https://github.com/apache/bookkeeper/issues/458
+  #BK_statsProviderClass: org.apache.bookkeeper.stats.prometheus.PrometheusMetricsProvider
+  # use hostname as bookie id for StatefulSets deployment
+  BK_useHostNameAsBookieID: "true"
+---
+
+apiVersion: apps/v1beta1
+kind: StatefulSet
+metadata:
+  name: bookie
+  labels:
+    app: bookkeeper
+    component: bookie
+spec:
+  serviceName: "bookkeeper"
+  replicas: 3
+  template:
+    metadata:
+      labels:
+        app: bookkeeper
+        component: bookie
+      annotations:
+        pod.alpha.kubernetes.io/initialized: "true"
+        prometheus.io/scrape: "true"
+        prometheus.io/port: "8000"
+    spec:
+      terminationGracePeriodSeconds: 0
+      containers:
+      - name: bookie
+        image: apache/bookkeeper:latest
+        resources:
+          requests:
+            memory: "3Gi"
+            cpu: "1000m"
+          limits:
+            memory: "5Gi"
+            cpu: "2000m"
+        command: [ "/bin/bash", "/opt/bookkeeper/entrypoint.sh" ]
+        args: ["/opt/bookkeeper/bin/bookkeeper", "bookie"]
+        ports:
+        - name: bookie
+          containerPort: 3181
+        envFrom:
+          - configMapRef:
+              name: bookie-config
+        volumeMounts:
+        - name: journaldisk
+          mountPath: /bookkeeper/data/journal
+        - name: ledgersdisk
+          mountPath: /bookkeeper/data/ledgers
+
+  volumeClaimTemplates:
+  - metadata:
+      name: journaldisk
+      labels:
+        component: bookkeeper
+    spec:
+      accessModes: [ "ReadWriteOnce" ]
+      storageClassName: ssd
+      resources:
+        requests:
+          storage: 5Gi
+  - metadata:
+      name: ledgersdisk
+      annotations:
+        volume.alpha.kubernetes.io/storage-class: default
+      labels:
+        component: bookkeeper
+    spec:
+      accessModes: [ "ReadWriteOnce" ]
+      resources:
+        requests:
+          storage: 10Gi
+---
+# A headless service to create DNS records
+apiVersion: v1
+kind: Service
+metadata:
+  annotations:
+    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
+  name: bookkeeper
+  labels:
+    app: bookkeeper
+    component: bookie
+spec:
+  ports:
+  ports:
+  - name: bookie
+    port: 3181
+    protocol: TCP
+  clusterIP: None
+  selector:
+    app: bookkeeper
+    component: bookie
+
+---
+##
+## Run BookKeeper auto-recovery from a different set of containers
+## Auto-Recovery makes sure to restore the replication factor when any bookie
+## crashes and it's not recovering on its own.
+##
+apiVersion: apps/v1beta1
+kind: Deployment
+metadata:
+  name: bookie-autorecovery
+spec:
+  replicas: 2
+  template:
+    metadata:
+      labels:
+        app: bookkeeper
+        component: bookkeeper-replication
+    spec:
+      affinity:
+        podAntiAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+          - labelSelector:
+              matchExpressions:
+              - key: "app"
+                operator: In
+                values:
+                - bookkeeper
+            topologyKey: "kubernetes.io/hostname"
+      containers:
+      - name: replication-worker
+        image: apache/bookkeeper:latest
+        command: [ "/bin/bash", "/opt/bookkeeper/entrypoint.sh" ]
+        args: ["/opt/bookkeeper/bin/bookkeeper", "autorecovery"]
+        envFrom:
+          - configMapRef:
+              name: bookie-config
diff --git a/site/docs/latest/deployment/kubernetes.md b/site/docs/latest/deployment/kubernetes.md
index d75919ce7..0f113169e 100644
--- a/site/docs/latest/deployment/kubernetes.md
+++ b/site/docs/latest/deployment/kubernetes.md
@@ -97,10 +97,28 @@ $ zk-shell localhost 2181
 
 ### Deploy Bookies
 
-Once ZooKeeper cluster is Running, you can then deploy the bookies.
+Once ZooKeeper cluster is Running, you can then deploy the bookies. You can deploy the bookies either using a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) or a [StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/).
+
+> NOTE: _DaemonSet_ vs _StatefulSet_
+>
+> A _DaemonSet_ ensures that all (or some) nodes run a pod of bookie instance. As nodes are added to the cluster, bookie pods are added automatically to them. As nodes are removed from the
+> cluster, those bookie pods are garbage collected. The bookies deployed in a DaemonSet stores data on the local disks on those nodes. So it doesn't require any external storage for Persistent
+> Volumes.
+>
+> A _StatefulSet_ maintains a sticky identity for the pods that it runs and manages. It provides stable and unique network identifiers, and stable and persistent storage for each pod. The pods
+> are not interchangeable, the idenifiers for each pod are maintained across any rescheduling.
+>
+> Which one to use? A _DaemonSet_ is the easiest way to deploy a bookkeeper cluster, because it doesn't require additional persistent volume provisioner and use local disks. BookKeeper manages
+> the data replication. It maintains the best latency property. However, it uses `hostIP` and `hostPort` for communications between pods. In some k8s platform (such as DC/OS), `hostIP` and
+> `hostPort` are not well supported. A _StatefulSet_ is only practical when deploying in a cloud environment or any K8S installation that has persistent volumes available. Also be aware, latency
+> can be potentially higher when using persistent volumes, because there is usually built-in replication in the persistent volumes.
 
 ```bash
+# deploy bookies in a daemon set
 $ kubectl apply -f bookkeeper.yaml
+
+# deploy bookies in a stateful set
+$ kubectl apply -f bookkeeper.stateful.yaml
 ```
 
 You can check on the status of the Bookie pods for these components either in the Kubernetes Dashboard or using `kubectl`:
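
For reference, the StatefulSet in the diff above (name `bookie`, `serviceName: "bookkeeper"`, 3 replicas) together with the headless service should give each bookie pod a stable DNS name of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`, which is what makes `BK_useHostNameAsBookieID` workable across rescheduling. The sketch below only prints those expected names. The `default` namespace and the `cluster.local` cluster domain are assumptions not stated in the PR, and the variable and function names are illustrative.

```shell
#!/bin/sh
# Print the DNS names expected for the StatefulSet pods in the manifest above.
STATEFULSET=bookie             # metadata.name of the StatefulSet
SERVICE=bookkeeper             # headless Service referenced by spec.serviceName
REPLICAS=3                     # spec.replicas
NAMESPACE=default              # assumption: deployed into the default namespace
CLUSTER_DOMAIN=cluster.local   # assumption: default Kubernetes cluster domain

# Hypothetical helper: emits one hostname per replica ordinal (0..REPLICAS-1).
bookie_hostnames() {
  i=0
  while [ "$i" -lt "$REPLICAS" ]; do
    echo "${STATEFULSET}-${i}.${SERVICE}.${NAMESPACE}.svc.${CLUSTER_DOMAIN}"
    i=$((i + 1))
  done
}

bookie_hostnames
```

Pod names follow the `<statefulset-name>-<ordinal>` pattern, so scaling the set up or down only adds or removes names at the high end of the range.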
