This is an automated email from the ASF dual-hosted git repository.

dgrove pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-openwhisk-deploy-kube.git


The following commit(s) were added to refs/heads/master by this push:
     new a4a008c  Configurable probes settings for zookeeper and controller (#478)
a4a008c is described below

commit a4a008cced82627ca8c5229e6568f2ccdfaf92f5
Author: Neeraj Mangal <[email protected]>
AuthorDate: Thu Jun 13 19:37:06 2019 +0530

    Configurable probes settings for zookeeper and controller (#478)
    
    * Configurable probes settings for zookeeper and controller
---
 docs/configurationChoices.md                 | 15 ++++++++++++++
 docs/troubleshooting.md                      |  2 ++
 helm/openwhisk/templates/controller-pod.yaml | 14 ++++++++++---
 helm/openwhisk/templates/zookeeper-pod.yaml  | 10 +++++----
 helm/openwhisk/values-metadata.yaml          | 28 +++++++++++++++++++++++++
 helm/openwhisk/values.yaml                   | 31 ++++++++++++++++++++++++++++
 6 files changed, 93 insertions(+), 7 deletions(-)

diff --git a/docs/configurationChoices.md b/docs/configurationChoices.md
index b5dbcbd..bbba727 100644
--- a/docs/configurationChoices.md
+++ b/docs/configurationChoices.md
@@ -214,3 +214,18 @@ If you want to override this default when using the DockerContainerFactory,
 you can set `invoker.containerFactory.networkConfig.dns.inheritInvokerConfig` to `false`
 and explicitly configure the child values of `invoker.containerFactory.networkConfig.dns.overrides`
 instead.
+
+### Customizing probe settings
+
+Many OpenWhisk components have liveness and readiness probes configured. Sometimes a component does not come up or reach the ready state before its probes start executing, which causes its pod to restart or fail. You can configure probe timing settings such as `initialDelaySeconds`, `periodSeconds`, and `timeoutSeconds` in `mycluster.yaml`:
+
+```yaml
+probes:
+  zookeeper:
+    livenessProbe:
+      initialDelaySeconds: <number of seconds>
+      periodSeconds: <number of seconds>
+      timeoutSeconds: <number of seconds>
+```
+
+**Note:** Currently, probe settings are available for `zookeeper` and `controller` only.
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 254fbd5..787675b 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -76,6 +76,8 @@ Here's what it looks like when the network is misconfigured and kafka is not rea
 [2018-10-18T17:30:53.433Z] [INFO] [#tid_sid_unknown] [Controller] Shutting down Kamon with coordinated shutdown
 ```
 
+If you have `hairpin` mode configured but still see the error above, probe failures may be the cause. The default `initialDelaySeconds` of the controller's liveness probe is 5 seconds; if you see a similar error in the controller logs, try customizing the probe settings to increase `initialDelaySeconds` for the controller's liveness probe. See the customizing probes section in the [configuration choices documentation](./configurationChoices.md) for more details.
+
 ### wsk `cannot validate certificates` error
 
 If you installed self-signed certificates, which is the default
diff --git a/helm/openwhisk/templates/controller-pod.yaml b/helm/openwhisk/templates/controller-pod.yaml
index f495151..3ce9da1 100644
--- a/helm/openwhisk/templates/controller-pod.yaml
+++ b/helm/openwhisk/templates/controller-pod.yaml
@@ -66,9 +66,17 @@ spec:
             path: "/ping"
             port: {{ .Values.controller.port }}
             scheme: "HTTP"
-          initialDelaySeconds: 5
-          periodSeconds: 10
-          timeoutSeconds: 1
+          initialDelaySeconds: {{ .Values.probes.controller.livenessProbe.initialDelaySeconds }}
+          periodSeconds: {{ .Values.probes.controller.livenessProbe.periodSeconds }}
+          timeoutSeconds: {{ .Values.probes.controller.livenessProbe.timeoutSeconds }}
+        readinessProbe:
+          httpGet:
+            path: "/ping"
+            port: {{ .Values.controller.port }}
+            scheme: "HTTP"
+          initialDelaySeconds: {{ .Values.probes.controller.readinessProbe.initialDelaySeconds }}
+          periodSeconds: {{ .Values.probes.controller.readinessProbe.periodSeconds }}
+          timeoutSeconds: {{ .Values.probes.controller.readinessProbe.timeoutSeconds }}
         env:
         - name: "PORT"
           value: {{ .Values.controller.port | quote }}
diff --git a/helm/openwhisk/templates/zookeeper-pod.yaml b/helm/openwhisk/templates/zookeeper-pod.yaml
index 4c84558..158b988 100644
--- a/helm/openwhisk/templates/zookeeper-pod.yaml
+++ b/helm/openwhisk/templates/zookeeper-pod.yaml
@@ -60,8 +60,9 @@ spec:
         livenessProbe:
           tcpSocket:
             port: {{ .Values.zookeeper.port }}
-          initialDelaySeconds: 5
-          periodSeconds: 10
+          initialDelaySeconds: {{ .Values.probes.zookeeper.livenessProbe.initialDelaySeconds }}
+          periodSeconds: {{ .Values.probes.zookeeper.livenessProbe.periodSeconds }}
+          timeoutSeconds: {{ .Values.probes.zookeeper.livenessProbe.timeoutSeconds }}
         # Disabled: See issue https://github.com/apache/incubator-openwhisk-deploy-kube/issues/469
         # readinessProbe:
         #   exec:
@@ -69,8 +70,9 @@ spec:
         #     - /bin/bash
         #     - -c
         #     - "echo ruok | nc -w 1 localhost:{{ .Values.zookeeper.port }} | grep imok"
-        #   initialDelaySeconds: 5
-        #   periodSeconds: 10
+        #   initialDelaySeconds: {{ .Values.probes.zookeeper.readinessProbe.initialDelaySeconds }}
+        #   periodSeconds: {{ .Values.probes.zookeeper.readinessProbe.periodSeconds }}
+        #   timeoutSeconds: {{ .Values.probes.zookeeper.readinessProbe.timeoutSeconds }}
         volumeMounts:
         - mountPath: /conf
           name: zk-config
diff --git a/helm/openwhisk/values-metadata.yaml b/helm/openwhisk/values-metadata.yaml
index f2dfa44..9703fa8 100644
--- a/helm/openwhisk/values-metadata.yaml
+++ b/helm/openwhisk/values-metadata.yaml
@@ -1483,3 +1483,31 @@ affinity:
       description: "Label used for worker nodes that should execute OpenWhisk event provider pods"
       type: "string"
       required: true
+
+probes:
+  _metadata:
+  label: "Pod's liveness and readiness time related settings"
+  description: "Used for configurable liveness and readiness probes timing settings"
+  component:
+    _metadata:
+      label: "Specify component name i.e. zookeeper, controller etc."
+      description: "Specify component name for which probes settings will apply"
+    initialDelaySeconds:
+      _metadata:
+      label: "Initial wait time to start probes after container has started"
+      description: "Time in seconds to wait before starting the probes on pod in started state."
+      type: "number"
+      required: false
+    periodSeconds:
+      _metadata:
+      label: "Frequency to perform the probe in seconds"
+      description: "Frequency to perform the probe, default is 10, minimum is 1"
+      type: "number"
+      required: false
+    timeoutSeconds:
+      _metadata:
+      label: "Probe timeout in seconds"
+      description: "Probe times out after the defined number of seconds, defaults to 1 second, minimum value is 1"
+      type: "number"
+      required: false
+
diff --git a/helm/openwhisk/values.yaml b/helm/openwhisk/values.yaml
index c5ad561..d12a3e4 100644
--- a/helm/openwhisk/values.yaml
+++ b/helm/openwhisk/values.yaml
@@ -346,3 +346,34 @@ affinity:
   edgeNodeLabel: edge
   invokerNodeLabel: invoker
   providerNodeLabel: provider
+
+# Used to define the probes timing settings so that you can more precisely control the
+# liveness and readiness checks.
+# initialDelaySeconds - Initial wait time to start probes after container has started
+# periodSeconds - Frequency to perform the probe, defaults to 10, minimum value is 1
+# timeoutSeconds - Probe times out after the defined number of seconds, defaults to 1 second,
+#                  minimum value is 1
+# for more information please refer - https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#configure-probes
+# Note - for now added probes settings for zookeeper and controller only.
+#        in future all components probes timing settings should be
+#        configured here if any.
+
+probes:
+  zookeeper:
+    livenessProbe:
+      initialDelaySeconds: 5
+      periodSeconds: 10
+      timeoutSeconds: 1
+    readinessProbe:
+      initialDelaySeconds: 5
+      periodSeconds: 10
+      timeoutSeconds: 1
+  controller:
+    livenessProbe:
+      initialDelaySeconds: 5
+      periodSeconds: 10
+      timeoutSeconds: 1
+    readinessProbe:
+      initialDelaySeconds: 5
+      periodSeconds: 10
+      timeoutSeconds: 1
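
As an illustration of how the `probes` values added in this commit might be overridden, a `mycluster.yaml` fragment could look like the sketch below. The specific numbers (30, 15, 5) are assumptions chosen for illustration, not recommended values; only the key names come from the `values.yaml` defaults above.

```yaml
# mycluster.yaml (illustrative override; the numeric values are assumptions)
probes:
  controller:
    livenessProbe:
      # Give the controller more time to start before the first liveness check.
      initialDelaySeconds: 30
      periodSeconds: 15
      timeoutSeconds: 5
```

Keys omitted from the override (here, everything under `zookeeper` and `controller.readinessProbe`) should keep their chart defaults, since Helm merges user-supplied values over the chart's `values.yaml`.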
