This is an automated email from the ASF dual-hosted git repository.

wwei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-yunikorn-site.git


The following commit(s) were added to refs/heads/master by this push:
     new 35e7393  [YUNIKORN-745] Add doc for healthcheck endpoint (#66)
35e7393 is described below

commit 35e7393ff001a3ed31796d969d933ef5bce734fe
Author: 0yukali0 <45888688+0yuka...@users.noreply.github.com>
AuthorDate: Sun Jul 18 12:00:22 2021 +0800

    [YUNIKORN-745] Add doc for healthcheck endpoint (#66)
---
 docs/api/scheduler.md | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/docs/api/scheduler.md b/docs/api/scheduler.md
index 524fef2..8bc5d22 100644
--- a/docs/api/scheduler.md
+++ b/docs/api/scheduler.md
@@ -1151,3 +1151,81 @@ Endpoint to retrieve historical data about the number of total containers by tim
 }
 ```
 
+
+## Endpoint healthcheck
+
+Endpoint to retrieve historical data about critical logs, negative resource on node/cluster/app, ...
+
+**URL** : `/ws/v1/scheduler/healthcheck`
+
+**Method** : `GET`
+
+**Auth required** : NO
+
+### Success response
+
+**Code** : `200 OK`
+
+**Content examples**
+
+```json
+{
+    "Healthy": true,
+    "HealthChecks": [
+        {
+            "Name": "Scheduling errors",
+            "Succeeded": true,
+            "Description": "Check for scheduling error entries in metrics",
+            "DiagnosisMessage": "There were 0 scheduling errors logged in the metrics"
+        },
+        {
+            "Name": "Failed nodes",
+            "Succeeded": true,
+            "Description": "Check for failed nodes entries in metrics",
+            "DiagnosisMessage": "There were 0 failed nodes logged in the metrics"
+        },
+        {
+            "Name": "Negative resources",
+            "Succeeded": true,
+            "Description": "Check for negative resources in the partitions",
+            "DiagnosisMessage": "Partitions with negative resources: []"
+        },
+        {
+            "Name": "Negative resources",
+            "Succeeded": true,
+            "Description": "Check for negative resources in the nodes",
+            "DiagnosisMessage": "Nodes with negative resources: []"
+        },
+        {
+            "Name": "Consistency of data",
+            "Succeeded": true,
+            "Description": "Check if a node's allocated resource <= total resource of the node",
+            "DiagnosisMessage": "Nodes with inconsistent data: []"
+        },
+        {
+            "Name": "Consistency of data",
+            "Succeeded": true,
+            "Description": "Check if total partition resource == sum of the node resources from the partition",
+            "DiagnosisMessage": "Partitions with inconsistent data: []"
+        },
+        {
+            "Name": "Consistency of data",
+            "Succeeded": true,
+            "Description": "Check if node total resource = allocated resource + occupied resource + available resource",
+            "DiagnosisMessage": "Nodes with inconsistent data: []"
+        },
+        {
+            "Name": "Consistency of data",
+            "Succeeded": true,
+            "Description": "Check if node capacity >= allocated resources on the node",
+            "DiagnosisMessage": "Nodes with inconsistent data: []"
+        },
+        {
+            "Name": "Reservation check",
+            "Succeeded": true,
+            "Description": "Check the reservation nr compared to the number of nodes",
+            "DiagnosisMessage": "Reservation/node nr ratio: [0.000000]"
+        }
+    ]
+}
+```
\ No newline at end of file
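
For readers who want to try the documented endpoint, here is a minimal Python sketch of a client that reads the response shown in the diff above and lists the checks whose `Succeeded` field is false. It is not part of the commit. The base URL `http://localhost:9080` is an assumption about where the scheduler's REST service is exposed, and the helper names `fetch_healthcheck`, `failed_checks`, and `summarize` are hypothetical. Only the path and the JSON field names come from the documentation being added.

```python
import json
from urllib.request import urlopen

# Path taken from the documentation added in this commit.
HEALTHCHECK_PATH = "/ws/v1/scheduler/healthcheck"


def fetch_healthcheck(base_url, timeout=5.0):
    """GET the healthcheck report and decode it as JSON.

    base_url is an assumption about the deployment,
    for example "http://localhost:9080".
    """
    url = base_url.rstrip("/") + HEALTHCHECK_PATH
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def failed_checks(report):
    """Return the entries of "HealthChecks" whose "Succeeded" is not true."""
    return [
        check
        for check in report.get("HealthChecks", [])
        if not check.get("Succeeded", False)
    ]


def summarize(report):
    """Return "healthy", or one line per failed check."""
    failed = failed_checks(report)
    if report.get("Healthy", False) and not failed:
        return "healthy"
    return "\n".join(
        f'{check.get("Name", "?")}: {check.get("DiagnosisMessage", "")}'
        for check in failed
    )


# Example usage (requires a reachable scheduler, so it is left commented out):
# print(summarize(fetch_healthcheck("http://localhost:9080")))
```

Several checks in the sample response share the same `Name` (for example "Consistency of data"), so the sketch reports `DiagnosisMessage` alongside the name rather than keying results by name.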
