This is an automated email from the ASF dual-hosted git repository.
HTHou pushed a commit to branch codex/prometheus
in repository https://gitbox.apache.org/repos/asf/iotdb.git
The following commit(s) were added to refs/heads/codex/prometheus by this push:
new 94ee4021da4 Document IoTDB metric self-scrape example
94ee4021da4 is described below
commit 94ee4021da4e40548a8e0bb93601c69e60be592b
Author: HTHou <[email protected]>
AuthorDate: Mon Jun 15 18:19:14 2026 +0800
Document IoTDB metric self-scrape example
---
external-service-impl/metric-scrape/README.md | 42 ++++++++++++++++--------
external-service-impl/metric-scrape/README_ZH.md | 42 ++++++++++++++++--------
2 files changed, 58 insertions(+), 26 deletions(-)
diff --git a/external-service-impl/metric-scrape/README.md b/external-service-impl/metric-scrape/README.md
index ea0f2f7ce9e..83bf1a4b4a7 100644
--- a/external-service-impl/metric-scrape/README.md
+++ b/external-service-impl/metric-scrape/README.md
@@ -27,7 +27,15 @@ The Metric Scrape Service actively scrapes Prometheus text exposition endpoints
Configure the service in `iotdb-system.properties` and restart IoTDB for the changes to take effect.
+For example, to let IoTDB scrape its own ConfigNode and DataNode metrics, first enable the Prometheus metric reporters, and then configure the scrape targets:
+
```properties
+cn_metric_reporter_list=PROMETHEUS
+cn_metric_prometheus_reporter_port=9091
+
+dn_metric_reporter_list=PROMETHEUS
+dn_metric_prometheus_reporter_port=9092
+
enable_metric_scrape_service=true
metric_scrape_targets=http://127.0.0.1:9091/metrics,http://127.0.0.1:9092/metrics
metric_scrape_database=confignode,datanode
@@ -43,7 +51,7 @@ metric_scrape_http_timeout_ms=10000
| `metric_scrape_interval_seconds` | `15` | Scrape interval for each target, in seconds. |
| `metric_scrape_http_timeout_ms` | `10000` | HTTP connect and read timeout, in milliseconds. |
-When multiple targets are configured, `metric_scrape_database` works like Prometheus job separation. For example, the first target above is written to database `confignode`, and the second target is written to database `datanode`.
+When multiple targets are configured, `metric_scrape_database` works like Prometheus job separation. In the IoTDB self-scrape example above, the ConfigNode metrics from `9091` are written to database `confignode`, and the DataNode metrics from `9092` are written to database `datanode`.
## Data Model
@@ -60,38 +68,46 @@ The service uses IoTDB table model writes and lets IoTDB create schemas automati
For histogram, summary, and other suffixed samples, the service chooses the longest matching `# HELP` metric family name as the table name. For example, `request_duration_seconds_sum` and `request_duration_seconds_count` are written to table `request_duration_seconds` when that family appears in a `# HELP` line.
-## Example
+## IoTDB Self-Scrape Example
-Prometheus text:
+After the Prometheus metric reporters are enabled, the ConfigNode and DataNode `/metrics` endpoints expose IoTDB metrics in Prometheus text format. A metric family may look like this:
```text
-# HELP request_duration_seconds Request duration in seconds.
-# TYPE request_duration_seconds summary
-request_duration_seconds_sum{method="query",status="ok"} 10.5
-request_duration_seconds_count{method="query",status="ok"} 3
+# HELP jvm_memory_used_bytes
+# TYPE jvm_memory_used_bytes gauge
+jvm_memory_used_bytes{area="heap",id="G1 Eden Space"} 1.2345678E7
+jvm_memory_used_bytes{area="nonheap",id="Metaspace"} 2.3456789E7
```
With this configuration:
```properties
+cn_metric_reporter_list=PROMETHEUS
+cn_metric_prometheus_reporter_port=9091
+
+dn_metric_reporter_list=PROMETHEUS
+dn_metric_prometheus_reporter_port=9092
+
enable_metric_scrape_service=true
-metric_scrape_targets=http://127.0.0.1:9091/metrics
-metric_scrape_database=metrics
+metric_scrape_targets=http://127.0.0.1:9091/metrics,http://127.0.0.1:9092/metrics
+metric_scrape_database=confignode,datanode
```
-IoTDB writes rows into database `metrics`, table `request_duration_seconds`. The labels `method` and `status` are tag columns, and `request_duration_seconds_sum` and `request_duration_seconds_count` are field columns.
+IoTDB writes ConfigNode metrics into database `confignode` and DataNode metrics into database `datanode`. For the sample above, the table is `jvm_memory_used_bytes`, the labels `area` and `id` are tag columns, and `jvm_memory_used_bytes` is the field column.
Query the scraped data:
```sql
-USE metrics;
+USE datanode;
-SELECT time, method, status, request_duration_seconds_sum, request_duration_seconds_count
-FROM request_duration_seconds
+SELECT time, area, id, jvm_memory_used_bytes
+FROM jvm_memory_used_bytes
ORDER BY time DESC
LIMIT 10;
```
+The concrete table names, tag columns, and field columns depend on the metric families and labels exposed by the current IoTDB `/metrics` endpoint.
+
## Notes
- The target response must use the Prometheus text exposition sample format.
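
Reviewer note: the table, tag, and field mapping described in the README hunks above can be illustrated with a short Python sketch. This is not the service's implementation. The function names (`help_families`, `table_for`, `to_rows`) are hypothetical, the rule that a family matches when the sample name equals it or starts with it followed by `_` is an assumption about what "longest matching" means, and falling back to the sample name when no `# HELP` family matches is also an assumption.

```python
import re

# Sample line: name, optional {labels}, value, optional integer timestamp.
SAMPLE_RE = re.compile(
    r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# One label pair inside the braces, e.g. area="heap".
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def help_families(text):
    """Collect the metric family names declared by `# HELP` lines."""
    families = set()
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) >= 3 and parts[0] == "#" and parts[1] == "HELP":
            families.add(parts[2])
    return families


def table_for(sample_name, families):
    """Pick the longest family name that matches the sample name.

    Assumption: a family matches if it equals the sample name or is
    followed by an underscore-separated suffix such as `_sum`.
    Assumption: with no match, the sample name itself is the table.
    """
    matches = [
        f for f in families
        if sample_name == f or sample_name.startswith(f + "_")
    ]
    return max(matches, key=len) if matches else sample_name


def to_rows(text):
    """Map each sample to a table name, tag columns, and one field."""
    families = help_families(text)
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # this sketch silently ignores malformed lines
        name, labels, value = match.groups()
        rows.append({
            "table": table_for(name, families),
            "tags": dict(LABEL_RE.findall(labels or "")),
            "field": name,
            "value": float(value),
        })
    return rows
```

Under these assumptions, the `jvm_memory_used_bytes` sample from the new README text maps to table `jvm_memory_used_bytes` with tags `area` and `id`, and the removed `request_duration_seconds_sum` sample maps to table `request_duration_seconds` with field `request_duration_seconds_sum`.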
diff --git a/external-service-impl/metric-scrape/README_ZH.md b/external-service-impl/metric-scrape/README_ZH.md
index 8daf385f34c..ac8643696fb 100644
--- a/external-service-impl/metric-scrape/README_ZH.md
+++ b/external-service-impl/metric-scrape/README_ZH.md
@@ -27,7 +27,15 @@ Metric Scrape Service 会主动拉取 Prometheus text exposition 端点,并通
在 `iotdb-system.properties` 中配置该服务,修改后需要重启 IoTDB 才能生效。
+例如,需要让 IoTDB 采集自身 ConfigNode 和 DataNode 的监控数据时,可以先启用 Prometheus metric reporter,再配置 scrape 目标:
+
```properties
+cn_metric_reporter_list=PROMETHEUS
+cn_metric_prometheus_reporter_port=9091
+
+dn_metric_reporter_list=PROMETHEUS
+dn_metric_prometheus_reporter_port=9092
+
enable_metric_scrape_service=true
metric_scrape_targets=http://127.0.0.1:9091/metrics,http://127.0.0.1:9092/metrics
metric_scrape_database=confignode,datanode
@@ -43,7 +51,7 @@ metric_scrape_http_timeout_ms=10000
| `metric_scrape_interval_seconds` | `15` | 每个目标的拉取间隔,单位为秒。 |
| `metric_scrape_http_timeout_ms` | `10000` | HTTP 连接和读取超时时间,单位为毫秒。 |
-配置多个目标时,`metric_scrape_database` 的作用类似 Prometheus 中按 job 区分数据。例如上面的第一个目标会写入数据库 `confignode`,第二个目标会写入数据库 `datanode`。
+配置多个目标时,`metric_scrape_database` 的作用类似 Prometheus 中按 job 区分数据。在上面的 IoTDB 自采集示例中,来自 `9091` 的 ConfigNode 监控会写入数据库 `confignode`,来自 `9092` 的 DataNode 监控会写入数据库 `datanode`。
## 数据建模
@@ -60,38 +68,46 @@ metric_scrape_http_timeout_ms=10000
对于 histogram、summary 等带后缀的 sample,服务会选择最长匹配的 `# HELP` metric family 名称作为表名。例如存在 `# HELP request_duration_seconds ...` 时,`request_duration_seconds_sum` 和 `request_duration_seconds_count` 都会写入表 `request_duration_seconds`。
-## 示例
+## 采集 IoTDB 自身监控示例
-Prometheus text:
+启用 Prometheus metric reporter 后,ConfigNode 和 DataNode 的 `/metrics` 端点会暴露 Prometheus text 格式的 IoTDB 监控数据。某个 metric family 可能如下:
```text
-# HELP request_duration_seconds Request duration in seconds.
-# TYPE request_duration_seconds summary
-request_duration_seconds_sum{method="query",status="ok"} 10.5
-request_duration_seconds_count{method="query",status="ok"} 3
+# HELP jvm_memory_used_bytes
+# TYPE jvm_memory_used_bytes gauge
+jvm_memory_used_bytes{area="heap",id="G1 Eden Space"} 1.2345678E7
+jvm_memory_used_bytes{area="nonheap",id="Metaspace"} 2.3456789E7
```
使用如下配置:
```properties
+cn_metric_reporter_list=PROMETHEUS
+cn_metric_prometheus_reporter_port=9091
+
+dn_metric_reporter_list=PROMETHEUS
+dn_metric_prometheus_reporter_port=9092
+
enable_metric_scrape_service=true
-metric_scrape_targets=http://127.0.0.1:9091/metrics
-metric_scrape_database=metrics
+metric_scrape_targets=http://127.0.0.1:9091/metrics,http://127.0.0.1:9092/metrics
+metric_scrape_database=confignode,datanode
```
-IoTDB 会将数据写入数据库 `metrics`、表 `request_duration_seconds`。其中 labels `method` 和 `status` 是 tag 列,`request_duration_seconds_sum` 和 `request_duration_seconds_count` 是 field 列。
+IoTDB 会将 ConfigNode 监控写入数据库 `confignode`,将 DataNode 监控写入数据库 `datanode`。对于上面的样例,表名为 `jvm_memory_used_bytes`,labels `area` 和 `id` 是 tag 列,`jvm_memory_used_bytes` 是 field 列。
查询拉取到的数据:
```sql
-USE metrics;
+USE datanode;
-SELECT time, method, status, request_duration_seconds_sum, request_duration_seconds_count
-FROM request_duration_seconds
+SELECT time, area, id, jvm_memory_used_bytes
+FROM jvm_memory_used_bytes
ORDER BY time DESC
LIMIT 10;
```
+实际生成的表名、tag 列和 field 列以当前 IoTDB `/metrics` 端点暴露的 metric family 和 labels 为准。
+
## 注意事项
- 目标端点返回内容需要符合 Prometheus text exposition sample 格式。
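
Reviewer note: both READMEs state that the first entry of `metric_scrape_targets` is written to the first entry of `metric_scrape_database`, the second to the second, and so on. A minimal Python sketch of that positional pairing follows. It is only an illustration: `parse_properties` and `scrape_plan` are hypothetical helpers, treating a missing `enable_metric_scrape_service` as `false` is an assumption, and rejecting lists of different lengths is a simplification, since the diff does not say how the service handles that case.

```python
def parse_properties(text):
    """Read simple `key=value` lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def split_list(value):
    """Split a comma-separated property value into trimmed entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def scrape_plan(props):
    """Pair each scrape target with the database at the same position."""
    # Assumption: the service is disabled unless explicitly enabled.
    if props.get("enable_metric_scrape_service", "false").lower() != "true":
        return []
    targets = split_list(props.get("metric_scrape_targets", ""))
    databases = split_list(props.get("metric_scrape_database", ""))
    # Simplification for this sketch: require one database per target.
    if len(targets) != len(databases):
        raise ValueError("expected one database per scrape target")
    return list(zip(targets, databases))


EXAMPLE = """
cn_metric_reporter_list=PROMETHEUS
cn_metric_prometheus_reporter_port=9091

dn_metric_reporter_list=PROMETHEUS
dn_metric_prometheus_reporter_port=9092

enable_metric_scrape_service=true
metric_scrape_targets=http://127.0.0.1:9091/metrics,http://127.0.0.1:9092/metrics
metric_scrape_database=confignode,datanode
"""

if __name__ == "__main__":
    for target, database in scrape_plan(parse_properties(EXAMPLE)):
        print(f"{target} -> {database}")
```

Run against the self-scrape configuration from the diff, the sketch pairs the `9091` endpoint with `confignode` and the `9092` endpoint with `datanode`, which matches the behavior both READMEs describe.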