wu-sheng commented on code in PR #11966:
URL: https://github.com/apache/skywalking/pull/11966#discussion_r1510095285

##########
docs/en/setup/backend/backend-clickhouse-monitoring.md:
##########
@@ -0,0 +1,134 @@
+# ClickHouse monitoring
+
+## ClickHouse server performance from built-in metrics data
+
+SkyWalking leverages ClickHouse built-in metrics data since v20.1.2.4. It leverages OpenTelemetry Collector to transfer
+the metrics to
+[OpenTelemetry receiver](opentelemetry-receiver.md) and into the [Meter System](./../../concepts-and-designs/meter.md).
+
+### Data flow
+
+1. Configure ClickHouse to expose metrics for Prometheus Receiver.
+2. OpenTelemetry Collector fetches metrics from Prometheus Receiver and pushes metrics to SkyWalking OAP Server via
+   OpenTelemetry gRPC exporter.
+3. The SkyWalking OAP Server parses the expression with [MAL](../../concepts-and-designs/mal.md) to
+   filter/calculate/aggregate and store the results.
+
+### Set up
+
+1. Set
+   up [built-in prometheus endpoint](https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#server_configuration_parameters-prometheus)
+   .
+2. Set up [OpenTelemetry Collector ](https://opentelemetry.io/docs/collector/getting-started/#docker). For details on
+   Prometheus Receiver in OpenTelemetry Collector, refer
+   to [here](../../../../test/e2e-v2/cases/mysql/prometheus-mysql-exporter/otel-collector-config.yaml).
+3. Config SkyWalking [OpenTelemetry receiver](opentelemetry-receiver.md).
+
+### ClickHouse Monitoring
+
+ClickHouse monitoring provides monitoring of the metrics 、events and asynchronous_metrics of the ClickHouse server.
+ClickHouse cluster is cataloged as a `Layer: CLICKHOUSE` `Service` in OAP. Each ClickHouse server is cataloged as
+an `Instance` in OAP.
+
+#### ClickHouse Instance Supported Metrics
+
+| Monitoring Panel | Unit | Metric Name | Description | Data Source |
+| ---------------- | ---------- | ------------------------------------------ | ---------------------------------------------------------------------------------------------------------------- | ----------- |
+| CpuUsage | count | meter_clickhouse_instance_cpu_usage | CPU time spent seen by OS per second(according to ClickHouse.system.dashboard.CPU Usage (cores)). | ClickHouse |
+| MemoryUsage | percentage | meter_clickhouse_instance_memory_usage | Total amount of memory (bytes) allocated by the server/ total amount of OS memory. | ClickHouse |
+| MemoryAvailable | percentage | meter_clickhouse_instance_memory_available | Total amount of memory (bytes) available for program / total amount of OS memory. | ClickHouse |
+| Uptime | sec | meter_clickhouse_instance_uptime | The server uptime in seconds. It includes the time spent for server initialization before accepting connections. | ClickHouse |
+| Version | string | meter_clickhouse_instance_version | Version of the server in a single integer number in base-1000. | ClickHouse |
+| FileOpen | count | meter_clickhouse_instance_file_open | Number of files opened. | ClickHouse |

Review Comment:
   If there is no way to add the numbers across instances, this should be a labeled value (use MQE to merge them, or show it as a labeled-value metric), or it should be an instance-level metric.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
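
To illustrate the distinction the review comment draws, here is a minimal Python sketch. It is not SkyWalking code and does not use MAL or MQE. The instance names, metric keys, and numbers are hypothetical. It assumes that a count can be summed into one service-level number, while a value such as a percentage cannot, so it is kept as one labeled value per instance.

```python
# Hypothetical per-instance samples; names and numbers are illustrative only.
samples = {
    "clickhouse-1": {"file_open": 120, "memory_usage_pct": 40.0},
    "clickhouse-2": {"file_open": 80, "memory_usage_pct": 90.0},
}


def service_total(samples, metric):
    """Sum an additive metric (for example, a count) across all instances."""
    return sum(values[metric] for values in samples.values())


def labeled_values(samples, metric):
    """Keep one value per instance, keyed by the instance name as a label."""
    return {instance: values[metric] for instance, values in samples.items()}


# A count adds up to a meaningful service-level number.
print(service_total(samples, "file_open"))

# A percentage does not sum meaningfully, so it stays labeled per instance.
print(labeled_values(samples, "memory_usage_pct"))
```

Summing `memory_usage_pct` here would give 130.0, which is not a meaningful percentage; that is the kind of case where a labeled value or an instance-level metric fits better.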
