(apisix) branch master updated: docs: add information about health check status and counter (#10547)

monkeydluffy Sun, 26 Nov 2023 21:39:34 -0800

This is an automated email from the ASF dual-hosted git repository.

monkeydluffy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/apisix.git



The following commit(s) were added to refs/heads/master by this push:
     new a97640cef docs: add information about health check status and counter (#10547)
a97640cef is described below

commit a97640ceff79be34af8eacf20f08b9df277f6c1b
Author: Zhenyu Luo <[email protected]>
AuthorDate: Mon Nov 27 13:39:10 2023 +0800

    docs: add information about health check status and counter (#10547)
---
 .../images/health_check_node_state_diagram.png     | Bin 0 -> 55197 bytes
 docs/en/latest/tutorials/health-check.md           |  69 +++++++++++++++++++++
 docs/zh/latest/tutorials/health-check.md           |  69 +++++++++++++++++++++
 3 files changed, 138 insertions(+)

diff --git a/docs/assets/images/health_check_node_state_diagram.png b/docs/assets/images/health_check_node_state_diagram.png
new file mode 100644
index 000000000..777c5032c
Binary files /dev/null and b/docs/assets/images/health_check_node_state_diagram.png differ
diff --git a/docs/en/latest/tutorials/health-check.md b/docs/en/latest/tutorials/health-check.md
index 3de3304c6..a75fe353b 100644
--- a/docs/en/latest/tutorials/health-check.md
+++ b/docs/en/latest/tutorials/health-check.md
@@ -161,3 +161,72 @@ The health check status can be fetched via `GET /v1/healthcheck` in [Control API
 curl http://127.0.0.1:9090/v1/healthcheck/upstreams/healthycheck -s | jq .
 
 ```
+
+## Health Check Status
+
+APISIX provides comprehensive health check information, with particular emphasis on the `status` and `counter` parameters for effective health monitoring. In the APISIX context, nodes exhibit four states: `healthy`, `unhealthy`, `mostly_unhealthy`, and `mostly_healthy`. The `mostly_healthy` status indicates that the current node is considered healthy, but during health checks, the node's health status is not consistently successful. The `mostly_unhealthy` status indicates that the curren [...]
+
+To retrieve health check information, you can use the following curl command:
+
+```shell
+ curl -i http://127.0.0.1:9090/v1/healthcheck
+```
+
+Response Example:
+
+```json
+[
+  {
+    "nodes": {},
+    "name": "/apisix/routes/1",
+    "type": "http"
+  },
+  {
+    "nodes": [
+      {
+        "port": 1970,
+        "hostname": "127.0.0.1",
+        "status": "healthy",
+        "ip": "127.0.0.1",
+        "counter": {
+          "tcp_failure": 0,
+          "http_failure": 0,
+          "success": 0,
+          "timeout_failure": 0
+        }
+      },
+      {
+        "port": 1980,
+        "hostname": "127.0.0.1",
+        "status": "healthy",
+        "ip": "127.0.0.1",
+        "counter": {
+          "tcp_failure": 0,
+          "http_failure": 0,
+          "success": 0,
+          "timeout_failure": 0
+        }
+      }
+    ],
+    "name": "/apisix/routes/example-hc-route",
+    "type": "http"
+  }
+]
+```
+
+### State Transition Diagram
+
+![image](../../../assets/images/health_check_node_state_diagram.png)
+
+Note that all nodes start with the `healthy` status without any initial probes, and the counter only resets and updates with a state change. Hence, when nodes are `healthy` and all subsequent checks are successful, the `success` counter is not updated and remains zero.
+
+### Counter Information
+
+In the event of a health check failure, the `success` count in the counter will be reset to zero. Upon a successful health check, the `tcp_failure`, `http_failure`, and `timeout_failure` data will be reset to zero.
+
+| Name            | Description                            | Purpose                                                                                                                   |
+|-----------------|----------------------------------------|---------------------------------------------------------------------------------------------------------------------------|
+| success         | Number of successful health checks     | When `success` exceeds the configured `healthy.successes` value, the node transitions to a `healthy` state.               |
+| tcp_failure     | Number of TCP health check failures    | When `tcp_failure` exceeds the configured `unhealthy.tcp_failures` value, the node transitions to an `unhealthy` state.   |
+| http_failure    | Number of HTTP health check failures   | When `http_failure` exceeds the configured `unhealthy.http_failures` value, the node transitions to an `unhealthy` state. |
+| timeout_failure | Number of health check timeouts        | When `timeout_failure` exceeds the configured `unhealthy.timeouts` value, the node transitions to an `unhealthy` state.   |
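
The counter rules described in the English hunk above can be sketched as a small state model. This is a simplified illustration in Python, not the APISIX implementation: the `Node` class, the `report` function, and the `thresholds` dictionary (standing in for `healthy.successes`, `unhealthy.tcp_failures`, `unhealthy.http_failures`, and `unhealthy.timeouts`) are hypothetical names. The sketch assumes that a threshold counts as met once the counter reaches it (the table says "exceeds"; the exact boundary is not confirmed here), that an already `unhealthy` node ignores further failures in the same way a `healthy` node ignores further successes, and that a node below its threshold moves to the `mostly_*` state matching its current side. The state diagram image is not reproduced in this email, so those transitions are a guess.

```python
from dataclasses import dataclass, field

FAILURE_KINDS = ("tcp_failure", "http_failure", "timeout_failure")


def _zero_counter():
    return {"success": 0, "tcp_failure": 0, "http_failure": 0, "timeout_failure": 0}


@dataclass
class Node:
    # Nodes start as `healthy` with all counters at zero (stated in the docs).
    status: str = "healthy"
    counter: dict = field(default_factory=_zero_counter)


def report(node, event, thresholds):
    """Apply one probe result ("success" or a failure kind) to a node."""
    was_up = node.status in ("healthy", "mostly_healthy")
    if event == "success":
        if node.status == "healthy":
            return node.status  # already healthy: the counter stays untouched
        for kind in FAILURE_KINDS:
            node.counter[kind] = 0  # a success clears the failure counters
        target = "healthy"
    elif event in FAILURE_KINDS:
        if node.status == "unhealthy":
            return node.status  # assumption: mirror of the healthy case
        node.counter["success"] = 0  # a failure clears the success counter
        target = "unhealthy"
    else:
        raise ValueError(f"unknown event: {event}")
    node.counter[event] += 1
    # Assumption: the threshold is met once the counter reaches it.
    if node.counter[event] >= thresholds[event]:
        node.status = target
    else:
        node.status = "mostly_healthy" if was_up else "mostly_unhealthy"
    return node.status


if __name__ == "__main__":
    # Hypothetical threshold values, chosen only for the walk-through.
    thresholds = {"success": 2, "tcp_failure": 2, "http_failure": 3, "timeout_failure": 3}
    node = Node()
    events = ["success", "http_failure", "http_failure", "http_failure", "success", "success"]
    for event in events:
        print(event, "->", report(node, event, thresholds), node.counter)
```

The first `success` in the walk-through leaves the counter at zero, which matches the note that a `healthy` node with only successful checks never updates its `success` counter.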
diff --git a/docs/zh/latest/tutorials/health-check.md b/docs/zh/latest/tutorials/health-check.md
index dc68beed6..af7eaf287 100644
--- a/docs/zh/latest/tutorials/health-check.md
+++ b/docs/zh/latest/tutorials/health-check.md
@@ -160,3 +160,72 @@ unhealthy TCP increment (2/2) for '(127.0.0.1:1980'
 curl http://127.0.0.1:9090/v1/healthcheck/upstreams/healthycheck -s | jq .
 
 ```
+
+## Health Check Information
+
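+APISIX provides rich health check information, in which the returned `status` and `counter` values are essential for health checking. In APISIX, a node has four states: `healthy`, `unhealthy`, `mostly_unhealthy`, and `mostly_healthy`. The `mostly_healthy` status means the node is currently healthy, but its health checks have not succeeded consistently. The `mostly_unhealthy` status means the node is currently unhealthy, but its health checks have not failed consistently. A node's state transition depends on whether the current health check succeeds or fails, together with the four values recorded in `counter`: `tcp_failure`, `http_failure`, `success`, and `timeout_failure`.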
+
+To retrieve health check information, use the following curl command:
+
+```shell
+curl -i http://127.0.0.1:9090/v1/healthcheck
+```
+
+Response example:
+
+```json
+[
+  {
+    "nodes": {},
+    "name": "/apisix/routes/1",
+    "type": "http"
+  },
+  {
+    "nodes": [
+      {
+        "port": 1970,
+        "hostname": "127.0.0.1",
+        "status": "healthy",
+        "ip": "127.0.0.1",
+        "counter": {
+          "tcp_failure": 0,
+          "http_failure": 0,
+          "success": 0,
+          "timeout_failure": 0
+        }
+      },
+      {
+        "port": 1980,
+        "hostname": "127.0.0.1",
+        "status": "healthy",
+        "ip": "127.0.0.1",
+        "counter": {
+          "tcp_failure": 0,
+          "http_failure": 0,
+          "success": 0,
+          "timeout_failure": 0
+        }
+      }
+    ],
+    "name": "/apisix/routes/example-hc-route",
+    "type": "http"
+  }
+]
+```
+
+### State Transition Diagram
+
+![image](../../../assets/images/health_check_node_state_diagram.png)
+
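+Note that all nodes start in the `healthy` state without any initial probe, and the counter is reset and updated only on a state change. Therefore, when a node is `healthy` and all subsequent checks succeed, the `success` counter is not updated and stays at zero.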
+
+### Counter Information
+
+If a health check fails, the `success` count in `counter` is reset to zero. If a health check succeeds, the `tcp_failure`, `http_failure`, and `timeout_failure` values are reset to zero.
+
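+| Name            | Description                            | Purpose                                                                                                       |
+|-----------------|----------------------------------------|---------------------------------------------------------------------------------------------------------------|
+| success         | Number of successful health checks     | When success is greater than the configured healthy.successes value, the node becomes healthy                 |
+| tcp_failure     | Number of TCP health check failures    | When tcp_failure is greater than the configured unhealthy.tcp_failures value, the node becomes unhealthy      |
+| http_failure    | Number of HTTP health check failures   | When http_failure is greater than the configured unhealthy.http_failures value, the node becomes unhealthy    |
+| timeout_failure | Number of node health check timeouts   | When timeout_failure is greater than the configured unhealthy.timeouts value, the node becomes unhealthy      |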
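
The response example added by this patch can be post-processed to list each node with its status and counters. The following is a minimal sketch in Python, assuming the payload has the shape shown in the patch; `SAMPLE`, `node_rows`, and `needs_attention` are hypothetical names, and `SAMPLE` is a condensed copy of the response example rather than live output. The example shows `"nodes": {}` for a checker without nodes, so the sketch skips any `nodes` value that is not a list; how a non-empty object would look is not known here.

```python
import json

# Condensed copy of the response example from the patch.
SAMPLE = """
[
  {"nodes": {}, "name": "/apisix/routes/1", "type": "http"},
  {"nodes": [
     {"port": 1970, "hostname": "127.0.0.1", "status": "healthy", "ip": "127.0.0.1",
      "counter": {"tcp_failure": 0, "http_failure": 0, "success": 0, "timeout_failure": 0}},
     {"port": 1980, "hostname": "127.0.0.1", "status": "healthy", "ip": "127.0.0.1",
      "counter": {"tcp_failure": 0, "http_failure": 0, "success": 0, "timeout_failure": 0}}
   ],
   "name": "/apisix/routes/example-hc-route", "type": "http"}
]
"""


def node_rows(payload):
    """Flatten the payload into (checker name, "ip:port", status, counter) rows."""
    rows = []
    for checker in json.loads(payload):
        nodes = checker.get("nodes")
        if not isinstance(nodes, list):
            continue  # assumption: `{}` (or anything else) means no nodes to report
        for node in nodes:
            address = f'{node["ip"]}:{node["port"]}'
            rows.append((checker["name"], address, node["status"], node["counter"]))
    return rows


def needs_attention(rows):
    """Keep only the rows whose status is anything other than `healthy`."""
    return [row for row in rows if row[2] != "healthy"]


if __name__ == "__main__":
    rows = node_rows(SAMPLE)
    for name, address, status, counter in rows:
        print(name, address, status, counter)
    print("nodes needing attention:", needs_attention(rows))
```

To run this against a live instance, the body returned by the `curl` command in the patch (without the `-i` flag, so that no headers are included) could be passed to `node_rows` in place of `SAMPLE`.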
