[doris] branch master updated: [typo](docs) optimization Monitoring and alarming doc (#18767)

yiguolei Mon, 17 Apr 2023 23:14:43 -0700

This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git



The following commit(s) were added to refs/heads/master by this push:
     new 1b4be46ce5 [typo](docs) optimization Monitoring and alarming doc (#18767)
1b4be46ce5 is described below

commit 1b4be46ce57b97f4271a0c11ba6ca76b570e3521
Author: yongkang.zhong <[email protected]>
AuthorDate: Tue Apr 18 14:14:29 2023 +0800

    [typo](docs) optimization Monitoring and alarming doc (#18767)
    
    * [typo](docs) optimization Monitoring and alarming doc
    
    * fix
---
 .../admin-manual/maint-monitor/monitor-alert.md    | 101 ++++++++++----------
 .../admin-manual/maint-monitor/monitor-alert.md    | 103 +++++++++++----------
 2 files changed, 105 insertions(+), 99 deletions(-)

diff --git a/docs/en/docs/admin-manual/maint-monitor/monitor-alert.md b/docs/en/docs/admin-manual/maint-monitor/monitor-alert.md
index 28b881382e..4d6993b5f3 100644
--- a/docs/en/docs/admin-manual/maint-monitor/monitor-alert.md
+++ b/docs/en/docs/admin-manual/maint-monitor/monitor-alert.md
@@ -28,9 +28,11 @@ under the License.
 
 This document mainly introduces Doris's monitoring items and how to collect and display them. And how to configure alarm (TODO)
 
-[Dashboard template click download](https://grafana.com/api/dashboards/9734/revisions/5/download)
+Dashboard template click download
 
-> Note: Before 0.9.0 (excluding), please use revision 1. For version 0.9.x, use revision 2. For version 0.10.x, use revision 3. For version 1.1.x, use revision 4. For version 1.2.x, use revision 5.
+| Doris Version | Dashboard Version                                                          |
+|---------------|----------------------------------------------------------------------------|
+| 1.2.x         | [revision 5](https://grafana.com/api/dashboards/9734/revisions/5/download) |
 
 Dashboard templates are updated from time to time. The way to update the template is shown in the last section.
 
@@ -62,59 +64,60 @@ Doris's monitoring data is exposed through the HTTP interface of Frontend and Ba
 
 Users will see the following monitoring item results (for example, FE partial monitoring items):
 
-    ```
-    # HELP  jvm_heap_size_bytes jvm heap stat
-    # TYPE  jvm_heap_size_bytes gauge
-    jvm_heap_size_bytes{type="max"} 41661235200
-    jvm_heap_size_bytes{type="committed"} 19785285632
-    jvm_heap_size_bytes{type="used"} 10113221064
-    # HELP  jvm_non_heap_size_bytes jvm non heap stat
-    # TYPE  jvm_non_heap_size_bytes gauge
-    jvm_non_heap_size_bytes{type="committed"} 105295872
-    jvm_non_heap_size_bytes{type="used"} 103184784
-    # HELP  jvm_young_size_bytes jvm young mem pool stat
-    # TYPE  jvm_young_size_bytes gauge
-    jvm_young_size_bytes{type="used"} 6505306808
-    jvm_young_size_bytes{type="peak_used"} 10308026368
-    jvm_young_size_bytes{type="max"} 10308026368
-    # HELP  jvm_old_size_bytes jvm old mem pool stat
-    # TYPE  jvm_old_size_bytes gauge
-    jvm_old_size_bytes{type="used"} 3522435544
-    jvm_old_size_bytes{type="peak_used"} 6561017832
-    jvm_old_size_bytes{type="max"} 30064771072
-    # HELP  jvm_direct_buffer_pool_size_bytes jvm direct buffer pool stat
-    # TYPE  jvm_direct_buffer_pool_size_bytes gauge
-    jvm_direct_buffer_pool_size_bytes{type="count"} 91
-    jvm_direct_buffer_pool_size_bytes{type="used"} 226135222
-    jvm_direct_buffer_pool_size_bytes{type="capacity"} 226135221
-    # HELP  jvm_young_gc jvm young gc stat
-    # TYPE  jvm_young_gc gauge
-    jvm_young_gc{type="count"} 2186
-    jvm_young_gc{type="time"} 93650
-    # HELP  jvm_old_gc jvm old gc stat
-    # TYPE  jvm_old_gc gauge
-    jvm_old_gc{type="count"} 21
-    jvm_old_gc{type="time"} 58268
-    # HELP  jvm_thread jvm thread stat
-    # TYPE  jvm_thread gauge
-    jvm_thread{type="count"} 767
-    jvm_thread{type="peak_count"} 831
-    ...
-    ```
+```
+# HELP  jvm_heap_size_bytes jvm heap stat
+# TYPE  jvm_heap_size_bytes gauge
+jvm_heap_size_bytes{type="max"} 8476557312
+jvm_heap_size_bytes{type="committed"} 1007550464
+jvm_heap_size_bytes{type="used"} 156375280
+# HELP  jvm_non_heap_size_bytes jvm non heap stat
+# TYPE  jvm_non_heap_size_bytes gauge
+jvm_non_heap_size_bytes{type="committed"} 194379776
+jvm_non_heap_size_bytes{type="used"} 188201864
+# HELP  jvm_young_size_bytes jvm young mem pool stat
+# TYPE  jvm_young_size_bytes gauge
+jvm_young_size_bytes{type="used"} 40652376
+jvm_young_size_bytes{type="peak_used"} 277938176
+jvm_young_size_bytes{type="max"} 907345920
+# HELP  jvm_old_size_bytes jvm old mem pool stat
+# TYPE  jvm_old_size_bytes gauge
+jvm_old_size_bytes{type="used"} 114633448
+jvm_old_size_bytes{type="peak_used"} 114633448
+jvm_old_size_bytes{type="max"} 7455834112
+# HELP  jvm_young_gc jvm young gc stat
+# TYPE  jvm_young_gc gauge
+jvm_young_gc{type="count"} 247
+jvm_young_gc{type="time"} 860
+# HELP  jvm_old_gc jvm old gc stat
+# TYPE  jvm_old_gc gauge
+jvm_old_gc{type="count"} 3
+jvm_old_gc{type="time"} 211
+# HELP  jvm_thread jvm thread stat
+# TYPE  jvm_thread gauge
+jvm_thread{type="count"} 162
+jvm_thread{type="peak_count"} 205
+jvm_thread{type="new_count"} 0
+jvm_thread{type="runnable_count"} 48
+jvm_thread{type="blocked_count"} 1
+jvm_thread{type="waiting_count"} 41
+jvm_thread{type="timed_waiting_count"} 72
+jvm_thread{type="terminated_count"} 0
+...
+```
     
 This is a monitoring data presented in [Prometheus Format](https://prometheus.io/docs/practices/naming/). We take one of these monitoring items as an example to illustrate:
 
 ```
 # HELP  jvm_heap_size_bytes jvm heap stat
 # TYPE  jvm_heap_size_bytes gauge
-jvm_heap_size_bytes{type="max"} 41661235200
-jvm_heap_size_bytes{type="committed"} 19785285632
-jvm_heap_size_bytes{type="used"} 10113221064
+jvm_heap_size_bytes{type="max"} 8476557312
+jvm_heap_size_bytes{type="committed"} 1007550464
+jvm_heap_size_bytes{type="used"} 156375280
 ```
 
 1. Behavior commentary line at the beginning of "#". HELP is the description of the monitored item; TYPE represents the data type of the monitored item, and Gauge is the scalar data in the example. There are also Counter, Histogram and other data types. Specifically, you can see [Prometheus Official Document](https://prometheus.io/docs/practices/instrumentation/#counter-vs.-gauge,-summary-vs.-histogram).
 2. `jvm_heap_size_bytes` is the name of the monitored item (Key); `type= "max"` is a label named `type`, with a value of `max`. A monitoring item can have multiple Labels.
-3. The final number, such as `41661235200`, is the monitored value.
+3. The final number, such as `8476557312`, is the monitored value.
 
 ## Monitoring Architecture
 
@@ -133,7 +136,7 @@ Please start building the monitoring system after you have completed the deploym
 
 Prometheus
 
-1. Download the latest version of Prometheus on the [Prometheus Website](https://prometheus.io/download/). Here we take version 2.3.2-linux-amd64 as an example.
+1. Download the latest version of Prometheus on the [Prometheus Website](https://prometheus.io/download/) or [click to download](https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/monitor/prometheus-2.43.0.linux-amd64.tar.gz). Here we take version 2.43.0-linux-amd64 as an example.
 2. Unzip the downloaded tar file on the machine that is ready to run the monitoring service.
 3. Open the configuration file prometheus.yml. Here we provide an example configuration and explain it (the configuration file is in YML format, pay attention to uniform indentation and spaces):
 
@@ -156,7 +159,7 @@ Prometheus
     # Here it's Prometheus itself.
     scrape_configs:
       # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
-      - job_name: 'PALO_CLUSTER' # Each Doris cluster, we call it a job. Job can be given a name here as the name of Doris cluster in the monitoring system.
+      - job_name: 'DORIS_CLUSTER' # Each Doris cluster, we call it a job. Job can be given a name here as the name of Doris cluster in the monitoring system.
         metrics_path: '/metrics' # Here you specify the restful API to get the monitors. With host: port in the following targets, Prometheus will eventually collect monitoring items through host: port/metrics_path.
         static_configs: # Here we begin to configure the target addresses of FE and BE, respectively. All FE and BE are written into their respective groups.
           - targets: ['fe_host1:8030', 'fe_host2:8030', 'fe_host3:8030']
@@ -167,7 +170,7 @@ Prometheus
             labels:
               group: be # Here configure the group of be, which contains three Backends
     
-      - job_name: 'PALO_CLUSTER_2' # We can monitor multiple Doris clusters in a Prometheus, where we begin the configuration of another Doris cluster. Configuration is the same as above, the following is outlined.
+      - job_name: 'DORIS_CLUSTER_2' # We can monitor multiple Doris clusters in a Prometheus, where we begin the configuration of another Doris cluster. Configuration is the same as above, the following is outlined.
         metrics_path: '/metrics'
         static_configs: 
           - targets: ['fe_host1:8030', 'fe_host2:8030', 'fe_host3:8030']
@@ -200,7 +203,7 @@ Prometheus
 
 ### Grafana
 
-1. Download the latest version of Grafana on [Grafana's official website](https://grafana.com/grafana/download). Here we take version 5.2.1.linux-amd64 as an example.
+1. Download the latest version of Grafana on [Grafana's official website](https://grafana.com/grafana/download) or [click to download](https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/monitor/grafana-enterprise-8.5.22.linux-amd64.tar.gz). Here we take version 8.5.22.linux-amd64 as an example.
 
 2. Unzip the downloaded tar file on the machine that is ready to run the monitoring service.
 
diff --git a/docs/zh-CN/docs/admin-manual/maint-monitor/monitor-alert.md b/docs/zh-CN/docs/admin-manual/maint-monitor/monitor-alert.md
index b242a733d5..f64fca067d 100644
--- a/docs/zh-CN/docs/admin-manual/maint-monitor/monitor-alert.md
+++ b/docs/zh-CN/docs/admin-manual/maint-monitor/monitor-alert.md
@@ -28,9 +28,11 @@ under the License.
 
 本文档主要介绍 Doris 的监控项及如何采集、展示监控项。以及如何配置报警（TODO）
 
-[Dashboard 模板点击下载](https://grafana.com/api/dashboards/9734/revisions/5/download)
+Dashboard 模板点击下载
 
-> 注：0.9.0（不含）之前的版本请使用 revision 1。0.9.x 版本请使用 revision 2。0.10.x 版本请使用 revision 3。1.1.x 版本请使用 revision 4 。1.2.x 版本请使用 revision 5
+| Doris 版本    | Dashboard 版本                                                              |
+|--------------|----------------------------------------------------------------------------|
+| 1.2.x        | [revision 5](https://grafana.com/api/dashboards/9734/revisions/5/download) |
 
 Dashboard 模板会不定期更新。更新模板的方式见最后一小节。
 
@@ -62,59 +64,60 @@ Doris 的监控数据通过 Frontend 和 Backend 的 http 接口向外暴露。
 
 用户将看到如下监控项结果（示例为 FE 部分监控项）：
 
-    ```
-    # HELP  jvm_heap_size_bytes jvm heap stat
-    # TYPE  jvm_heap_size_bytes gauge
-    jvm_heap_size_bytes{type="max"} 41661235200
-    jvm_heap_size_bytes{type="committed"} 19785285632
-    jvm_heap_size_bytes{type="used"} 10113221064
-    # HELP  jvm_non_heap_size_bytes jvm non heap stat
-    # TYPE  jvm_non_heap_size_bytes gauge
-    jvm_non_heap_size_bytes{type="committed"} 105295872
-    jvm_non_heap_size_bytes{type="used"} 103184784
-    # HELP  jvm_young_size_bytes jvm young mem pool stat
-    # TYPE  jvm_young_size_bytes gauge
-    jvm_young_size_bytes{type="used"} 6505306808
-    jvm_young_size_bytes{type="peak_used"} 10308026368
-    jvm_young_size_bytes{type="max"} 10308026368
-    # HELP  jvm_old_size_bytes jvm old mem pool stat
-    # TYPE  jvm_old_size_bytes gauge
-    jvm_old_size_bytes{type="used"} 3522435544
-    jvm_old_size_bytes{type="peak_used"} 6561017832
-    jvm_old_size_bytes{type="max"} 30064771072
-    # HELP  jvm_direct_buffer_pool_size_bytes jvm direct buffer pool stat
-    # TYPE  jvm_direct_buffer_pool_size_bytes gauge
-    jvm_direct_buffer_pool_size_bytes{type="count"} 91
-    jvm_direct_buffer_pool_size_bytes{type="used"} 226135222
-    jvm_direct_buffer_pool_size_bytes{type="capacity"} 226135221
-    # HELP  jvm_young_gc jvm young gc stat
-    # TYPE  jvm_young_gc gauge
-    jvm_young_gc{type="count"} 2186
-    jvm_young_gc{type="time"} 93650
-    # HELP  jvm_old_gc jvm old gc stat
-    # TYPE  jvm_old_gc gauge
-    jvm_old_gc{type="count"} 21
-    jvm_old_gc{type="time"} 58268
-    # HELP  jvm_thread jvm thread stat
-    # TYPE  jvm_thread gauge
-    jvm_thread{type="count"} 767
-    jvm_thread{type="peak_count"} 831
-    ...
-    ```
+```
+# HELP  jvm_heap_size_bytes jvm heap stat
+# TYPE  jvm_heap_size_bytes gauge
+jvm_heap_size_bytes{type="max"} 8476557312
+jvm_heap_size_bytes{type="committed"} 1007550464
+jvm_heap_size_bytes{type="used"} 156375280
+# HELP  jvm_non_heap_size_bytes jvm non heap stat
+# TYPE  jvm_non_heap_size_bytes gauge
+jvm_non_heap_size_bytes{type="committed"} 194379776
+jvm_non_heap_size_bytes{type="used"} 188201864
+# HELP  jvm_young_size_bytes jvm young mem pool stat
+# TYPE  jvm_young_size_bytes gauge
+jvm_young_size_bytes{type="used"} 40652376
+jvm_young_size_bytes{type="peak_used"} 277938176
+jvm_young_size_bytes{type="max"} 907345920
+# HELP  jvm_old_size_bytes jvm old mem pool stat
+# TYPE  jvm_old_size_bytes gauge
+jvm_old_size_bytes{type="used"} 114633448
+jvm_old_size_bytes{type="peak_used"} 114633448
+jvm_old_size_bytes{type="max"} 7455834112
+# HELP  jvm_young_gc jvm young gc stat
+# TYPE  jvm_young_gc gauge
+jvm_young_gc{type="count"} 247
+jvm_young_gc{type="time"} 860
+# HELP  jvm_old_gc jvm old gc stat
+# TYPE  jvm_old_gc gauge
+jvm_old_gc{type="count"} 3
+jvm_old_gc{type="time"} 211
+# HELP  jvm_thread jvm thread stat
+# TYPE  jvm_thread gauge
+jvm_thread{type="count"} 162
+jvm_thread{type="peak_count"} 205
+jvm_thread{type="new_count"} 0
+jvm_thread{type="runnable_count"} 48
+jvm_thread{type="blocked_count"} 1
+jvm_thread{type="waiting_count"} 41
+jvm_thread{type="timed_waiting_count"} 72
+jvm_thread{type="terminated_count"} 0
+...
+```
 
 这是一个以 [Prometheus 格式](https://prometheus.io/docs/practices/naming/) 呈现的监控数据。我们以其中一个监控项为例进行说明：
 
 ```
 # HELP  jvm_heap_size_bytes jvm heap stat
 # TYPE  jvm_heap_size_bytes gauge
-jvm_heap_size_bytes{type="max"} 41661235200
-jvm_heap_size_bytes{type="committed"} 19785285632
-jvm_heap_size_bytes{type="used"} 10113221064
+jvm_heap_size_bytes{type="max"} 8476557312
+jvm_heap_size_bytes{type="committed"} 1007550464
+jvm_heap_size_bytes{type="used"} 156375280
 ```
 
 1. "#" 开头的行为注释行。其中 HELP 为该监控项的描述说明；TYPE 表示该监控项的数据类型，示例中为 Gauge，即标量数据。还有 Counter、Histogram 等数据类型。具体可见 [Prometheus 官方文档](https://prometheus.io/docs/practices/instrumentation/#counter-vs.-gauge,-summary-vs.-histogram) 。
 2. `jvm_heap_size_bytes` 即监控项的名称（Key）；`type="max"` 即为一个名为 `type` 的 Label，值为 `max`。一个监控项可以有多个 Label。
-3. 最后的数字，如 `41661235200`，即为监控数值。
+3. 最后的数字，如 `8476557312`，即为监控数值。
 
 ## 监控架构
 
@@ -133,10 +136,10 @@ jvm_heap_size_bytes{type="used"} 10113221064
 
 ### Prometheus
 
-1. 在 [Prometheus 官网](https://prometheus.io/download/) 下载最新版本的 Prometheus。这里我们以 2.3.2-linux-amd64 版本为例。
+1. 在 [Prometheus 官网](https://prometheus.io/download/) 下载最新版本的 Prometheus 或者直接[点击下载](https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/monitor/prometheus-2.43.0.linux-amd64.tar.gz)。这里我们以 2.43.0-linux-amd64 版本为例。
 2. 在准备运行监控服务的机器上，解压下载后的 tar 文件。
 3. 打开配置文件 prometheus.yml。这里我们提供一个示例配置并加以说明（配置文件为 yml 格式，一定注意统一的缩进和空格）：
- 
+
     这里我们使用最简单的静态文件的方式进行监控配置。Prometheus 支持多种 [服务发现](https://prometheus.io/docs/prometheus/latest/configuration/configuration/) 方式，可以动态的感知节点的加入和删除。
  
     ```
@@ -156,7 +159,7 @@ jvm_heap_size_bytes{type="used"} 10113221064
     # Here it's Prometheus itself.
     scrape_configs:
       # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
-      - job_name: 'PALO_CLUSTER' # 每一个 Doris 集群，我们称为一个 job。这里可以给 job 取一个名字，作为 Doris 集群在监控系统中的名字。
+      - job_name: 'DORIS_CLUSTER' # 每一个 Doris 集群，我们称为一个 job。这里可以给 job 取一个名字，作为 Doris 集群在监控系统中的名字。
         metrics_path: '/metrics' # 这里指定获取监控项的 restful api。配合下面的 targets 中的 host:port，Prometheus 最终会通过 host:port/metrics_path 来采集监控项。
         static_configs: # 这里开始分别配置 FE 和 BE 的目标地址。所有的 FE 和 BE 都分别写入各自的 group 中。
           - targets: ['fe_host1:8030', 'fe_host2:8030', 'fe_host3:8030']
@@ -167,7 +170,7 @@ jvm_heap_size_bytes{type="used"} 10113221064
             labels:
               group: be # 这里配置了 be 的 group，该 group 中包含了 3 个 Backends
     
-      - job_name: 'PALO_CLUSTER_2' # 我们可以在一个 Prometheus 中监控多个 Doris 集群，这里开始另一个 Doris 集群的配置。配置同上，以下略。
+      - job_name: 'DORIS_CLUSTER_2' # 我们可以在一个 Prometheus 中监控多个 Doris 集群，这里开始另一个 Doris 集群的配置。配置同上，以下略。
         metrics_path: '/metrics'
         static_configs: 
           - targets: ['fe_host1:8030', 'fe_host2:8030', 'fe_host3:8030']
@@ -200,7 +203,7 @@ jvm_heap_size_bytes{type="used"} 10113221064
 
 ### Grafana
 
-1. 在 [Grafana 官网](https://grafana.com/grafana/download) 下载最新版本的 Grafana。这里我们以 5.2.1.linux-amd64 版本为例。
+1. 在 [Grafana 官网](https://grafana.com/grafana/download) 下载最新版本的 Grafana 或者直接[点击下载](https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/monitor/grafana-enterprise-8.5.22.linux-amd64.tar.gz)。这里我们以 8.5.22.linux-amd64 版本为例。
 
 2. 在准备运行监控服务的机器上，解压下载后的 tar 文件。
 
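
Note on the metric format described in the patched doc: each sample line is a metric name, an optional `{label="value"}` set, and a numeric value, while lines starting with `#` are HELP/TYPE comments. The following is only a minimal sketch of reading such lines, not Doris or Prometheus code. The names `parse_samples`, `SAMPLE_RE`, `LABEL_RE` and `EXAMPLE` are hypothetical, and the sketch deliberately skips anything it does not recognise (for instance lines with a trailing timestamp or escaped quotes inside label values).

```python
import re

# Hypothetical helper patterns (not from the Doris docs):
# one sample line = metric name, optional {labels}, then a single value.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)\s*=\s*"([^"]*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples from metrics text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            # Anything unrecognised (e.g. the "..." elision) is ignored.
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Sample lines copied from the updated documentation in this commit.
EXAMPLE = """\
# HELP  jvm_heap_size_bytes jvm heap stat
# TYPE  jvm_heap_size_bytes gauge
jvm_heap_size_bytes{type="max"} 8476557312
jvm_heap_size_bytes{type="committed"} 1007550464
jvm_heap_size_bytes{type="used"} 156375280
"""

if __name__ == "__main__":
    for name, labels, value in parse_samples(EXAMPLE):
        print(name, labels, value)
```

In practice Prometheus does this parsing itself when it scrapes `host:port/metrics`; the sketch is only meant to make the name / label / value structure concrete.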

