[DISCUSS] Proposal: APISIX: upstream health check status report

Jinhua Luo Wed, 15 Mar 2023 20:36:41 -0700

Background:

The original control API
(https://apisix.apache.org/zh/docs/apisix/control-api/#get-v1healthcheck)
has the following issues:


* The control API may be handled by any worker process. If the
handling process has not started the health check yet, no data is
returned. We need a global view of the whole APISIX instance, no
matter which worker process responds to the control API request.
https://github.com/apache/apisix/issues/5953

* No HTML response format
https://github.com/apache/apisix/issues/8441

* No Prometheus metrics support

Solution:

When a health checker updates a status, it posts the status to a
global event source. Each worker process registers with that event
source, so every worker has a global and complete view of the
health status of all resources.
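
To illustrate the idea, here is a minimal sketch in Python (not the
actual APISIX code, which is Lua). EventSource, Worker and
report_check_result are hypothetical names, and the in-process
callback list only stands in for whatever cross-worker event
mechanism is used in practice:

```python
class EventSource:
    """Hypothetical stand-in for the global event source shared by all workers."""

    def __init__(self):
        self._subscribers = []

    def register(self, callback):
        # Each worker registers a callback once, at startup.
        self._subscribers.append(callback)

    def post(self, event):
        # Deliver the event to every registered worker, including the poster.
        for callback in self._subscribers:
            callback(event)


class Worker:
    """Hypothetical worker process that keeps its own copy of the global view."""

    def __init__(self, worker_id, source):
        self.worker_id = worker_id
        self.source = source
        # (upstream name, ip, port) -> latest known status
        self.status_view = {}
        source.register(self._on_status)

    def _on_status(self, event):
        key = (event["name"], event["ip"], event["port"])
        self.status_view[key] = event["status"]

    def report_check_result(self, name, ip, port, status):
        # Called by the health checker running in this worker.
        self.source.post(
            {"name": name, "ip": ip, "port": port, "status": status}
        )


source = EventSource()
workers = [Worker(i, source) for i in range(3)]

# Health checkers run in different workers...
workers[0].report_check_result(
    "upstream#/apisix/routes/1", "127.0.0.1", 9090, "healthy")
workers[2].report_check_result(
    "upstream#/apisix/routes/1", "127.0.0.2", 54, "unhealthy")

# ...but any worker can answer the control API with the full picture.
for worker in workers:
    print(worker.worker_id, worker.status_view)
```

Because every worker applies the same events, worker 1 can report
both nodes even though it never ran a health check itself.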

Note that an upstream's status is shown in the result list only when
the upstream satisfies the conditions below:

* The upstream is configured with a health checker
* The upstream has served requests in any worker process

Response format:

The response format is determined by the Accept header. If the header
does not explicitly request HTML, JSON is used by default.
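
The selection rule above could look roughly like the sketch below.
choose_format is a hypothetical helper, and the sketch assumes that
"specify HTML explicitly" means the literal media type text/html
appears in the header; q-values and wildcards such as */* are
deliberately ignored:

```python
def choose_format(accept_header):
    """Return "html" only if text/html is listed explicitly, else "json"."""
    if accept_header:
        for item in accept_header.split(","):
            # Drop parameters such as ";q=0.9" and normalize case.
            media_type = item.split(";", 1)[0].strip().lower()
            if media_type == "text/html":
                return "html"
    # Missing header, wildcards and any other type fall back to JSON.
    return "json"


print(choose_format("text/html,application/xhtml+xml"))
print(choose_format("*/*"))
print(choose_format(None))
```

With this rule a browser, which lists text/html in its Accept header,
gets the HTML page, while curl (which sends */* by default) gets
JSON.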

Example:

[
  {
    "name":"upstream#/apisix/routes/1",
    "nodes":[
      {
        "host":"foo",
        "ip":"127.0.0.1",
        "port":9090,
        "status":"healthy"
      },
      {
        "host":"bar",
        "ip":"127.0.0.4",
        "port":222,
        "status":"mostly_healthy"
      },
      {
        "host":"127.0.0.2",
        "ip":"127.0.0.2",
        "port":54,
        "status":"unhealthy"
      }
    ]
  }
]

Prometheus metrics example:

upstream_status{name="upstream#/apisix/routes/1",host="foo",ip="127.0.0.1",port="89"} 1
upstream_status{name="upstream#/apisix/routes/demo",host="foo.bar",ip="192.168.23.2",port="334"} 0
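
As a rough sketch of how the JSON structure shown earlier might be
turned into such metric lines: to_prometheus, escape_label and
STATUS_VALUE are hypothetical names. The mapping of "healthy" to 1
and "unhealthy" to 0 is an assumption read off the example values;
the proposal does not say how other statuses (for example
"mostly_healthy") are reported, so this sketch simply skips them.

```python
# Assumption: 1 means healthy, 0 means unhealthy. Other statuses are
# not covered by the proposal and are skipped here.
STATUS_VALUE = {"healthy": 1, "unhealthy": 0}


def escape_label(value):
    """Escape a label value for the Prometheus text exposition format."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def to_prometheus(upstreams):
    """Render one upstream_status sample per node with a known status."""
    lines = []
    for upstream in upstreams:
        for node in upstream["nodes"]:
            value = STATUS_VALUE.get(node["status"])
            if value is None:
                continue  # status with no defined numeric value
            pairs = [
                ("name", upstream["name"]),
                ("host", node["host"]),
                ("ip", node["ip"]),
                ("port", node["port"]),
            ]
            labels = ",".join(f'{k}="{escape_label(v)}"' for k, v in pairs)
            lines.append(f"upstream_status{{{labels}}} {value}")
    return "\n".join(lines)


sample = [
    {
        "name": "upstream#/apisix/routes/1",
        "nodes": [
            {"host": "foo", "ip": "127.0.0.1", "port": 9090,
             "status": "healthy"},
            {"host": "127.0.0.2", "ip": "127.0.0.2", "port": 54,
             "status": "unhealthy"},
        ],
    }
]
print(to_prometheus(sample))
```

Each sample is one line, with the value separated from the label set
by a single space, as the text exposition format requires.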
