This is an automated email from the ASF dual-hosted git repository.
kassiez pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new a2832975450 add docs for observability trace (#2509)
a2832975450 is described below
commit a283297545091e19f5f167bd840100247627fac6
Author: Kang <[email protected]>
AuthorDate: Thu Jun 26 19:06:18 2025 +0800
add docs for observability trace (#2509)
## Versions
- [x] dev
- [x] 3.0
- [x] 2.1
- [ ] 2.0
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [x] Checked by AI
- [ ] Test Cases Built
---
docs/observability/trace.md | 232 ++++++++++++++++++++
.../current/observability/log.md | 21 +-
.../current/observability/trace.md | 234 +++++++++++++++++++++
.../version-2.1/observability/trace.md | 234 +++++++++++++++++++++
.../version-3.0/observability/trace.md | 234 +++++++++++++++++++++
sidebars.json | 3 +-
static/images/observability/trace-detail.png | Bin 0 -> 301568 bytes
static/images/observability/trace-list.png | Bin 0 -> 483362 bytes
versioned_docs/version-2.1/observability/trace.md | 233 ++++++++++++++++++++
versioned_docs/version-3.0/observability/trace.md | 233 ++++++++++++++++++++
versioned_sidebars/version-2.1-sidebars.json | 3 +-
versioned_sidebars/version-3.0-sidebars.json | 3 +-
12 files changed, 1426 insertions(+), 4 deletions(-)
diff --git a/docs/observability/trace.md b/docs/observability/trace.md
new file mode 100644
index 00000000000..91129a5dfaf
--- /dev/null
+++ b/docs/observability/trace.md
@@ -0,0 +1,232 @@
+---
+{
+ "title": "Trace",
+ "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Trace
+
+This article introduces storage and analysis practices for Trace, one of the core types of observability data. For an overview of the complete observability solution, see [Overview](./overview.mdx). For resource evaluation, cluster deployment, and optimization, see [Log](./log.md).
+
+## 1. Table Creation
+
+Trace data has distinct characteristics in terms of writing and querying
patterns. Targeted configurations during table creation can significantly
improve performance. Create your table based on the key guidelines below:
+
+**Partitioning and Sorting**
+- Use RANGE partitioning on the time field, and enable dynamic partitioning to manage partitions automatically by day.
+- Use `service_name` and a time field of type DATETIME as the key; this can speed up queries for a specific service's traces over a given period by several times.
+
+**Bucketing**
+- The number of buckets should be approximately three times the total number
of disks in the cluster.
+- Use the RANDOM bucketing strategy. Combined with single-tablet ingestion
during writes, it improves batch write efficiency.
+
+**Compaction**
+- Use the time_series compaction strategy to reduce write amplification, which
is crucial for optimizing resources under high-throughput ingestion.
+
+**VARIANT Data Type**
+- Use the semi-structured VARIANT data type for extended Trace fields like
`span_attributes` and `resource_attributes`. This automatically splits JSON
data into sub-columns for storage, improving compression rates and reducing
storage space while also enhancing filtering and sub-column analysis
performance.
+
+**Indexing**
+- Build indexes on frequently queried fields.
+- For fields requiring full-text search, specify the parser parameter. Unicode
tokenization generally meets most needs. Enable the `support_phrase` option to
support phrase queries. If not needed, set it to false to reduce storage usage.
+
+**Storage**
+- For hot data, configure 1 replica if using cloud disks or at least 2
replicas if using physical disks.
+- Use hot-cold tiered storage configuration with `log_s3` object storage and
`log_policy_3day` policy to move data older than 3 days to S3.
+
+```sql
+CREATE DATABASE log_db;
+USE log_db;
+
+-- Not required for compute-storage separation mode
+CREATE RESOURCE "log_s3"
+PROPERTIES
+(
+ "type" = "s3",
+ "s3.endpoint" = "your_endpoint_url",
+ "s3.region" = "your_region",
+ "s3.bucket" = "your_bucket",
+ "s3.root.path" = "your_path",
+ "s3.access_key" = "your_ak",
+ "s3.secret_key" = "your_sk"
+);
+
+-- Not required for compute-storage separation mode
+CREATE STORAGE POLICY log_policy_3day
+PROPERTIES(
+ "storage_resource" = "log_s3",
+ "cooldown_ttl" = "259200"
+);
+
+CREATE TABLE trace_table
+(
+ service_name VARCHAR(200),
+ timestamp DATETIME(6),
+ service_instance_id VARCHAR(200),
+ trace_id VARCHAR(200),
+ span_id STRING,
+ trace_state STRING,
+ parent_span_id STRING,
+ span_name STRING,
+ span_kind STRING,
+ end_time DATETIME(6),
+ duration BIGINT,
+ span_attributes VARIANT,
+ events ARRAY<STRUCT<timestamp:DATETIME(6), name:STRING,
attributes:MAP<STRING, STRING>>>,
+ links ARRAY<STRUCT<trace_id:STRING, span_id:STRING,
trace_state:STRING, attributes:MAP<STRING, STRING>>>,
+ status_message STRING,
+ status_code STRING,
+ resource_attributes VARIANT,
+ scope_name STRING,
+ scope_version STRING,
+ INDEX idx_timestamp(timestamp) USING INVERTED,
+ INDEX idx_service_instance_id(service_instance_id) USING INVERTED,
+ INDEX idx_trace_id(trace_id) USING INVERTED,
+ INDEX idx_span_id(span_id) USING INVERTED,
+ INDEX idx_trace_state(trace_state) USING INVERTED,
+ INDEX idx_parent_span_id(parent_span_id) USING INVERTED,
+ INDEX idx_span_name(span_name) USING INVERTED,
+ INDEX idx_span_kind(span_kind) USING INVERTED,
+ INDEX idx_end_time(end_time) USING INVERTED,
+ INDEX idx_duration(duration) USING INVERTED,
+ INDEX idx_span_attributes(span_attributes) USING INVERTED,
+ INDEX idx_status_message(status_message) USING INVERTED,
+ INDEX idx_status_code(status_code) USING INVERTED,
+ INDEX idx_resource_attributes(resource_attributes) USING INVERTED,
+ INDEX idx_scope_name(scope_name) USING INVERTED,
+ INDEX idx_scope_version(scope_version) USING INVERTED
+)
+ENGINE = OLAP
+DUPLICATE KEY(service_name, timestamp)
+PARTITION BY RANGE(timestamp) ()
+DISTRIBUTED BY RANDOM BUCKETS 250
+PROPERTIES (
+"compression" = "zstd",
+"compaction_policy" = "time_series",
+"inverted_index_storage_format" = "V2",
+"dynamic_partition.enable" = "true",
+"dynamic_partition.create_history_partition" = "true",
+"dynamic_partition.time_unit" = "DAY",
+"dynamic_partition.start" = "-30",
+"dynamic_partition.end" = "1",
+"dynamic_partition.prefix" = "p",
+"dynamic_partition.buckets" = "250",
+"dynamic_partition.replication_num" = "2", -- Not required for compute-storage separation
+"replication_num" = "2", -- Not required for compute-storage separation
+"storage_policy" = "log_policy_3day" -- Not required for compute-storage separation
+);
+```
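+
+After the table is created, you can verify that dynamic partitioning took effect; partition names use the `p` prefix configured above:
+
+```sql
+SHOW PARTITIONS FROM trace_table;
+```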
+
+## 2. Trace Collection
+
+Doris provides open and general-purpose Stream HTTP APIs that can integrate
with Trace collection systems like OpenTelemetry.
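+
+For illustration, a span could in principle be written directly through Stream Load, with no collector at all. A minimal sketch follows (hypothetical host, credentials, and field values; the OpenTelemetry integration below is the recommended path):
+
+```bash
+# One span as a single JSON line, matching the trace_table schema above
+cat > span.json <<'EOF'
+{"service_name": "myproject", "timestamp": "2025-06-26 19:00:00.000000", "trace_id": "abc123", "span_id": "def456", "span_name": "GET /", "duration": 1500}
+EOF
+
+curl --location-trusted -u doris_username:doris_password \
+    -H "format: json" -H "read_json_by_line: true" \
+    -H "load_to_single_tablet: true" \
+    -T span.json \
+    http://localhost:8030/api/log_db/trace_table/_stream_load
+```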
+
+### OpenTelemetry Integration
+
+1. **Application-side Integration with OpenTelemetry SDK**
+
+Here we use a Spring Boot example application integrated with the
OpenTelemetry Java SDK. The example application comes from the official
[demo](https://docs.spring.io/spring-boot/tutorial/first-application/index.html),
which returns a simple "Hello World!" string for requests to the path "/".
+Download the [OpenTelemetry Java
Agent](https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases).
The advantage of using the Java Agent is that no modifications are needed to
the existing application. For other languages and integration methods, see the
OpenTelemetry official website [Language APIs &
SDKs](https://opentelemetry.io/docs/languages/) or [Zero-code
Instrumentation](https://opentelemetry.io/docs/zero-code/).
+
+2. **Deploy and Configure OpenTelemetry Collector**
+
+Download and extract [OpenTelemetry
Collector](https://github.com/open-telemetry/opentelemetry-collector-releases/releases).
You need to download the package starting with "otelcol-contrib", which
includes the Doris Exporter.
+
+Create the `otel_demo.yaml` configuration file as follows. For more details,
refer to the Doris Exporter
[documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/dorisexporter).
+
+```yaml
+receivers:
+  otlp: # OTLP protocol, receiving data sent by the OpenTelemetry Java Agent
+    protocols:
+      grpc:
+        endpoint: 0.0.0.0:4317
+      http:
+        endpoint: 0.0.0.0:4318
+
+processors:
+  batch:
+    send_batch_size: 100000 # Number of records per batch; a batch size of roughly 100 MB to 1 GB is recommended
+    timeout: 10s
+
+exporters:
+  doris:
+    endpoint: http://localhost:8030 # FE HTTP address
+    database: doris_db_name
+    username: doris_username
+    password: doris_password
+    table:
+      traces: doris_table_name
+    create_schema: true # Whether to auto-create the schema; if set to false, the table must be created manually
+    mysql_endpoint: localhost:9030 # FE MySQL address
+    history_days: 10
+    create_history_days: 10
+    timezone: Asia/Shanghai
+    timeout: 60s # Timeout for the HTTP Stream Load client
+    log_response: true
+    sending_queue:
+      enabled: true
+      num_consumers: 20
+      queue_size: 1000
+    retry_on_failure:
+      enabled: true
+      initial_interval: 5s
+      max_interval: 30s
+    headers:
+      load_to_single_tablet: "true"
+
+service: # Wire the receiver, processor, and exporter into a traces pipeline
+  pipelines:
+    traces:
+      receivers: [otlp]
+      processors: [batch]
+      exporters: [doris]
+```
+
+3. **Run OpenTelemetry Collector**
+
+```bash
+./otelcol-contrib --config otel_demo.yaml
+```
+
+4. **Start the Spring Boot Example Application**
+
+Before starting the application, simply add a few environment variables
without modifying any code.
+
+```bash
+export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS} -javaagent:/your/path/to/opentelemetry-javaagent.jar" # Path to the OpenTelemetry Java Agent
+export OTEL_JAVAAGENT_LOGGING="none" # Disable OTel logging to prevent interference with application logs
+export OTEL_SERVICE_NAME="myproject"
+export OTEL_TRACES_EXPORTER="otlp" # Send trace data using the OTLP protocol
+export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" # Address of the OpenTelemetry Collector
+
+java -jar myproject-0.0.1-SNAPSHOT.jar
+```
+
+5. **Access the Spring Boot Example Service to Generate Trace Data**
+
+Running `curl localhost:8080` will trigger a call to the `hello` service. The
OpenTelemetry Java Agent will automatically generate Trace data and send it to
the OpenTelemetry Collector, which then writes the Trace data to the Doris
table (default is `otel.otel_traces`) via the configured Doris Exporter.
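+
+Once data is flowing, a quick sanity check from any MySQL client confirms that spans are arriving (assuming the Collector's default `otel.otel_traces` table; adjust names to your configuration):
+
+```sql
+-- Most recent spans reported by the example service
+SELECT timestamp, trace_id, span_name, duration
+FROM otel.otel_traces
+WHERE service_name = 'myproject'
+ORDER BY timestamp DESC
+LIMIT 10;
+```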
+
+## 3. Trace Querying
+
+Trace querying typically uses visual query interfaces such as Grafana.
+
+- Filter by time range and service name to display Trace summaries, including
latency distribution charts and detailed individual Traces.
+
+ 
+
+- Click a Trace link to view its details.
+
+ 
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/observability/log.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/observability/log.md
index 85e325833d1..ee801ae0c82 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/observability/log.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/observability/log.md
@@ -5,7 +5,26 @@
}
---
-本文介绍可观测性核心数之一 Log 的存储和分析实践,可观测性整体方案介绍请参考[概述](overview)。
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+本文介绍可观测性核心数据之一 Log 的存储和分析实践,可观测性整体方案介绍请参考[概述](overview)。
## 第 1 步:评估资源
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/observability/trace.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/observability/trace.md
new file mode 100644
index 00000000000..7394af936bc
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/observability/trace.md
@@ -0,0 +1,234 @@
+---
+{
+ "title": "Trace",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Trace
+
+本文介绍可观测性核心数据之一 Trace
的存储分析实践,可观测性整体方案介绍请参考[概述](./overview.mdx),资源评估、集群部署和优化可以参考 [Log](./log.md)。
+
+
+## 1. 建表
+
+Trace 数据的写入和查询模式有明显的特征,在建表时进行针对性的配置会有更好的性能表现。参考下面的关键说明创建表:
+
+**分区和排序**
+- 分区使用时间字段上的 RANGE 分区,开启动态 Partition 按天自动管理分区
+- 使用 service_name 和 DATETIME 类型的时间字段作为 Key,在查询指定 service 一段时间的 Trace 时有数倍加速
+
+**分桶**
+- 分桶个数大致是集群磁盘总数的 3 倍
+- 分桶策略使用 RANDOM,配合写入时的 single tablet 导入可以提升写入 batch 效果
+
+**compaction**
+- 使用 time_series compaction 策略减少写放大,对于高吞吐 Trace 写入的资源优化很重要
+
+**VARIANT 数据类型**
+- 对于 Trace 扩展字段比如 span_attributes 和 resource_attributes 使用半结构化数据类型 VARIANT,自动将
JSON 数据拆分成子列存储,提升压缩率降低存储空间,提升过滤和分析子列的性能
+
+**索引**
+- 对经常查询的字段建索引
+- 需要全文检索的字段指定分词器 parser 参数,unicode 分词一般能满足绝大多数需求,开启 support_phrase
选项以支持短语查询,如果不需要可以设置为 false 降低存储空间
+
+**存储**
+- 热存数据,如果使用云盘可以配置 1 副本,如果使用物理盘至少配置 2 副本
+- 使用冷热分离配置 log_s3 对象存储和 log_policy_3day 超过 3 天转存 s3 策略
+
+```sql
+CREATE DATABASE log_db;
+USE log_db;
+
+-- 存算分离模式不需要
+CREATE RESOURCE "log_s3"
+PROPERTIES
+(
+ "type" = "s3",
+ "s3.endpoint" = "your_endpoint_url",
+ "s3.region" = "your_region",
+ "s3.bucket" = "your_bucket",
+ "s3.root.path" = "your_path",
+ "s3.access_key" = "your_ak",
+ "s3.secret_key" = "your_sk"
+);
+
+-- 存算分离模式不需要
+CREATE STORAGE POLICY log_policy_3day
+PROPERTIES(
+ "storage_resource" = "log_s3",
+ "cooldown_ttl" = "259200"
+);
+
+CREATE TABLE trace_table
+(
+ service_name VARCHAR(200),
+ timestamp DATETIME(6),
+ service_instance_id VARCHAR(200),
+ trace_id VARCHAR(200),
+ span_id STRING,
+ trace_state STRING,
+ parent_span_id STRING,
+ span_name STRING,
+ span_kind STRING,
+ end_time DATETIME(6),
+ duration BIGINT,
+ span_attributes VARIANT,
+ events ARRAY<STRUCT<timestamp:DATETIME(6), name:STRING,
attributes:MAP<STRING, STRING>>>,
+ links ARRAY<STRUCT<trace_id:STRING, span_id:STRING,
trace_state:STRING, attributes:MAP<STRING, STRING>>>,
+ status_message STRING,
+ status_code STRING,
+ resource_attributes VARIANT,
+ scope_name STRING,
+ scope_version STRING,
+ INDEX idx_timestamp(timestamp) USING INVERTED,
+ INDEX idx_service_instance_id(service_instance_id) USING INVERTED,
+ INDEX idx_trace_id(trace_id) USING INVERTED,
+ INDEX idx_span_id(span_id) USING INVERTED,
+ INDEX idx_trace_state(trace_state) USING INVERTED,
+ INDEX idx_parent_span_id(parent_span_id) USING INVERTED,
+ INDEX idx_span_name(span_name) USING INVERTED,
+ INDEX idx_span_kind(span_kind) USING INVERTED,
+ INDEX idx_end_time(end_time) USING INVERTED,
+ INDEX idx_duration(duration) USING INVERTED,
+ INDEX idx_span_attributes(span_attributes) USING INVERTED,
+ INDEX idx_status_message(status_message) USING INVERTED,
+ INDEX idx_status_code(status_code) USING INVERTED,
+ INDEX idx_resource_attributes(resource_attributes) USING INVERTED,
+ INDEX idx_scope_name(scope_name) USING INVERTED,
+ INDEX idx_scope_version(scope_version) USING INVERTED
+)
+ENGINE = OLAP
+DUPLICATE KEY(service_name, timestamp)
+PARTITION BY RANGE(timestamp) ()
+DISTRIBUTED BY RANDOM BUCKETS 250
+PROPERTIES (
+"compression" = "zstd",
+"compaction_policy" = "time_series",
+"inverted_index_storage_format" = "V2",
+"dynamic_partition.enable" = "true",
+"dynamic_partition.create_history_partition" = "true",
+"dynamic_partition.time_unit" = "DAY",
+"dynamic_partition.start" = "-30",
+"dynamic_partition.end" = "1",
+"dynamic_partition.prefix" = "p",
+"dynamic_partition.buckets" = "250",
+"dynamic_partition.replication_num" = "2", -- 存算分离不需要
+"replication_num" = "2", -- 存算分离不需要
+"storage_policy" = "log_policy_3day" -- 存算分离不需要
+);
+```
+
+## 2. Trace 采集
+
+Doris 提供开放通用的 Stream HTTP API,可以与 OpenTelemetry 等 Trace 采集系统打通。
+
+### OpenTelemetry 对接
+
+1. 应用侧接入 OpenTelemetry SDK
+
+这里我们使用一个 Spring Boot 示例应用接入 OpenTelemetry Java SDK,示例应用来自官方
[demo](https://docs.spring.io/spring-boot/tutorial/first-application/index.html),对路径
"/" 返回简单的 "Hello World!" 字符串。
+下载 [OpenTelemetry Java
Agent](https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases),使用
Java Agent 的优势在于无需对现有的应用做任何的修改。其他语言及其他接入方式详见 OpenTelemetry 官网:[Language APIs &
SDKs](https://opentelemetry.io/docs/languages/) 或 [Zero-code
Instrumentation](https://opentelemetry.io/docs/zero-code/)。
+
+2. 部署配置 OpenTelemetry Collector
+
+下载 [OpenTelemetry
Collector](https://github.com/open-telemetry/opentelemetry-collector-releases/releases)
并解压。需要下载以 "otelcol-contrib" 为前缀的包,其中的 Doris Exporter 组件能够把 trace 数据导入到 Doris 中。
+
+创建 `otel_demo.yaml` 配置文件如下,更多配置详见 Doris Exporter
[文档](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/dorisexporter)。
+
+```yaml
+receivers:
+  otlp: # otlp 协议,接收 OpenTelemetry Java Agent 发送的数据
+    protocols:
+      grpc:
+        endpoint: 0.0.0.0:4317
+      http:
+        endpoint: 0.0.0.0:4318
+
+processors:
+  batch:
+    send_batch_size: 100000 # 每个批次的数据条数,建议 batch 的数据量在 100M-1G 之间
+    timeout: 10s
+
+exporters:
+  doris:
+    endpoint: http://localhost:8030 # FE HTTP 地址
+    database: doris_db_name
+    username: doris_username
+    password: doris_password
+    table:
+      traces: doris_table_name
+    create_schema: true # 是否自动创建 schema,如果设置为 false,则需要手动建表
+    mysql_endpoint: localhost:9030 # FE MySQL 地址
+    history_days: 10
+    create_history_days: 10
+    timezone: Asia/Shanghai
+    timeout: 60s # http stream load 客户端超时时间
+    log_response: true
+    sending_queue:
+      enabled: true
+      num_consumers: 20
+      queue_size: 1000
+    retry_on_failure:
+      enabled: true
+      initial_interval: 5s
+      max_interval: 30s
+    headers:
+      load_to_single_tablet: "true"
+
+service: # 将 receiver、processor、exporter 组装成 traces 流水线
+  pipelines:
+    traces:
+      receivers: [otlp]
+      processors: [batch]
+      exporters: [doris]
+```
+
+3. 运行 OpenTelemetry Collector
+
+ ```Bash
+ ./otelcol-contrib --config otel_demo.yaml
+ ```
+
+4. 启动 Spring Boot 示例应用
+
+在启动应用之前只需要添加几个环境变量,无需修改任何代码。
+
+```Bash
+export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS} -javaagent:/your/path/to/opentelemetry-javaagent.jar" # OpenTelemetry Java Agent 的路径
+export OTEL_JAVAAGENT_LOGGING="none" # 禁用 otel log,防止干扰服务本身的日志
+export OTEL_SERVICE_NAME="myproject"
+export OTEL_TRACES_EXPORTER="otlp" # 使用 otlp 协议发送 trace 数据
+export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" # OpenTelemetry Collector 的地址
+
+java -jar myproject-0.0.1-SNAPSHOT.jar
+```
+
+5. 访问 Spring Boot 示例应用产生 Trace 数据
+
+`curl localhost:8080` 会触发 `hello` 服务调用,OpenTelemetry Java Agent 会自动生成 Trace
数据,然后发送给 OpenTelemetry Collector,Collector 再通过配置的 Doris Exporter 将 Trace 数据写入
Doris 的表中(默认是 `otel.otel_traces`)。
+
+## 3. Trace 查询
+
+Trace 查询通常使用可视化的查询界面,比如 Grafana。
+
+- 通过时间段和服务名筛选,展示 Trace 概览,包括延迟分布图和一些 Trace 明细
+
+ 
+
+- 点击链接可以查看 Trace detail
+
+ 
+
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/observability/trace.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/observability/trace.md
new file mode 100644
index 00000000000..7394af936bc
--- /dev/null
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/observability/trace.md
@@ -0,0 +1,234 @@
+---
+{
+ "title": "Trace",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Trace
+
+本文介绍可观测性核心数据之一 Trace
的存储分析实践,可观测性整体方案介绍请参考[概述](./overview.mdx),资源评估、集群部署和优化可以参考 [Log](./log.md)。
+
+
+## 1. 建表
+
+Trace 数据的写入和查询模式有明显的特征,在建表时进行针对性的配置会有更好的性能表现。参考下面的关键说明创建表:
+
+**分区和排序**
+- 分区使用时间字段上的 RANGE 分区,开启动态 Partition 按天自动管理分区
+- 使用 service_name 和 DATETIME 类型的时间字段作为 Key,在查询指定 service 一段时间的 Trace 时有数倍加速
+
+**分桶**
+- 分桶个数大致是集群磁盘总数的 3 倍
+- 分桶策略使用 RANDOM,配合写入时的 single tablet 导入可以提升写入 batch 效果
+
+**compaction**
+- 使用 time_series compaction 策略减少写放大,对于高吞吐 Trace 写入的资源优化很重要
+
+**VARIANT 数据类型**
+- 对于 Trace 扩展字段比如 span_attributes 和 resource_attributes 使用半结构化数据类型 VARIANT,自动将
JSON 数据拆分成子列存储,提升压缩率降低存储空间,提升过滤和分析子列的性能
+
+**索引**
+- 对经常查询的字段建索引
+- 需要全文检索的字段指定分词器 parser 参数,unicode 分词一般能满足绝大多数需求,开启 support_phrase
选项以支持短语查询,如果不需要可以设置为 false 降低存储空间
+
+**存储**
+- 热存数据,如果使用云盘可以配置 1 副本,如果使用物理盘至少配置 2 副本
+- 使用冷热分离配置 log_s3 对象存储和 log_policy_3day 超过 3 天转存 s3 策略
+
+```sql
+CREATE DATABASE log_db;
+USE log_db;
+
+-- 存算分离模式不需要
+CREATE RESOURCE "log_s3"
+PROPERTIES
+(
+ "type" = "s3",
+ "s3.endpoint" = "your_endpoint_url",
+ "s3.region" = "your_region",
+ "s3.bucket" = "your_bucket",
+ "s3.root.path" = "your_path",
+ "s3.access_key" = "your_ak",
+ "s3.secret_key" = "your_sk"
+);
+
+-- 存算分离模式不需要
+CREATE STORAGE POLICY log_policy_3day
+PROPERTIES(
+ "storage_resource" = "log_s3",
+ "cooldown_ttl" = "259200"
+);
+
+CREATE TABLE trace_table
+(
+ service_name VARCHAR(200),
+ timestamp DATETIME(6),
+ service_instance_id VARCHAR(200),
+ trace_id VARCHAR(200),
+ span_id STRING,
+ trace_state STRING,
+ parent_span_id STRING,
+ span_name STRING,
+ span_kind STRING,
+ end_time DATETIME(6),
+ duration BIGINT,
+ span_attributes VARIANT,
+ events ARRAY<STRUCT<timestamp:DATETIME(6), name:STRING,
attributes:MAP<STRING, STRING>>>,
+ links ARRAY<STRUCT<trace_id:STRING, span_id:STRING,
trace_state:STRING, attributes:MAP<STRING, STRING>>>,
+ status_message STRING,
+ status_code STRING,
+ resource_attributes VARIANT,
+ scope_name STRING,
+ scope_version STRING,
+ INDEX idx_timestamp(timestamp) USING INVERTED,
+ INDEX idx_service_instance_id(service_instance_id) USING INVERTED,
+ INDEX idx_trace_id(trace_id) USING INVERTED,
+ INDEX idx_span_id(span_id) USING INVERTED,
+ INDEX idx_trace_state(trace_state) USING INVERTED,
+ INDEX idx_parent_span_id(parent_span_id) USING INVERTED,
+ INDEX idx_span_name(span_name) USING INVERTED,
+ INDEX idx_span_kind(span_kind) USING INVERTED,
+ INDEX idx_end_time(end_time) USING INVERTED,
+ INDEX idx_duration(duration) USING INVERTED,
+ INDEX idx_span_attributes(span_attributes) USING INVERTED,
+ INDEX idx_status_message(status_message) USING INVERTED,
+ INDEX idx_status_code(status_code) USING INVERTED,
+ INDEX idx_resource_attributes(resource_attributes) USING INVERTED,
+ INDEX idx_scope_name(scope_name) USING INVERTED,
+ INDEX idx_scope_version(scope_version) USING INVERTED
+)
+ENGINE = OLAP
+DUPLICATE KEY(service_name, timestamp)
+PARTITION BY RANGE(timestamp) ()
+DISTRIBUTED BY RANDOM BUCKETS 250
+PROPERTIES (
+"compression" = "zstd",
+"compaction_policy" = "time_series",
+"inverted_index_storage_format" = "V2",
+"dynamic_partition.enable" = "true",
+"dynamic_partition.create_history_partition" = "true",
+"dynamic_partition.time_unit" = "DAY",
+"dynamic_partition.start" = "-30",
+"dynamic_partition.end" = "1",
+"dynamic_partition.prefix" = "p",
+"dynamic_partition.buckets" = "250",
+"dynamic_partition.replication_num" = "2", -- 存算分离不需要
+"replication_num" = "2", -- 存算分离不需要
+"storage_policy" = "log_policy_3day" -- 存算分离不需要
+);
+```
+
+## 2. Trace 采集
+
+Doris 提供开放通用的 Stream HTTP API,可以与 OpenTelemetry 等 Trace 采集系统打通。
+
+### OpenTelemetry 对接
+
+1. 应用侧接入 OpenTelemetry SDK
+
+这里我们使用一个 Spring Boot 示例应用接入 OpenTelemetry Java SDK,示例应用来自官方
[demo](https://docs.spring.io/spring-boot/tutorial/first-application/index.html),对路径
"/" 返回简单的 "Hello World!" 字符串。
+下载 [OpenTelemetry Java
Agent](https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases),使用
Java Agent 的优势在于无需对现有的应用做任何的修改。其他语言及其他接入方式详见 OpenTelemetry 官网:[Language APIs &
SDKs](https://opentelemetry.io/docs/languages/) 或 [Zero-code
Instrumentation](https://opentelemetry.io/docs/zero-code/)。
+
+2. 部署配置 OpenTelemetry Collector
+
+下载 [OpenTelemetry
Collector](https://github.com/open-telemetry/opentelemetry-collector-releases/releases)
并解压。需要下载以 "otelcol-contrib" 为前缀的包,其中的 Doris Exporter 组件能够把 trace 数据导入到 Doris 中。
+
+创建 `otel_demo.yaml` 配置文件如下,更多配置详见 Doris Exporter
[文档](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/dorisexporter)。
+
+```yaml
+receivers:
+  otlp: # otlp 协议,接收 OpenTelemetry Java Agent 发送的数据
+    protocols:
+      grpc:
+        endpoint: 0.0.0.0:4317
+      http:
+        endpoint: 0.0.0.0:4318
+
+processors:
+  batch:
+    send_batch_size: 100000 # 每个批次的数据条数,建议 batch 的数据量在 100M-1G 之间
+    timeout: 10s
+
+exporters:
+  doris:
+    endpoint: http://localhost:8030 # FE HTTP 地址
+    database: doris_db_name
+    username: doris_username
+    password: doris_password
+    table:
+      traces: doris_table_name
+    create_schema: true # 是否自动创建 schema,如果设置为 false,则需要手动建表
+    mysql_endpoint: localhost:9030 # FE MySQL 地址
+    history_days: 10
+    create_history_days: 10
+    timezone: Asia/Shanghai
+    timeout: 60s # http stream load 客户端超时时间
+    log_response: true
+    sending_queue:
+      enabled: true
+      num_consumers: 20
+      queue_size: 1000
+    retry_on_failure:
+      enabled: true
+      initial_interval: 5s
+      max_interval: 30s
+    headers:
+      load_to_single_tablet: "true"
+
+service: # 将 receiver、processor、exporter 组装成 traces 流水线
+  pipelines:
+    traces:
+      receivers: [otlp]
+      processors: [batch]
+      exporters: [doris]
+```
+
+3. 运行 OpenTelemetry Collector
+
+ ```Bash
+ ./otelcol-contrib --config otel_demo.yaml
+ ```
+
+4. 启动 Spring Boot 示例应用
+
+在启动应用之前只需要添加几个环境变量,无需修改任何代码。
+
+```Bash
+export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS} -javaagent:/your/path/to/opentelemetry-javaagent.jar" # OpenTelemetry Java Agent 的路径
+export OTEL_JAVAAGENT_LOGGING="none" # 禁用 otel log,防止干扰服务本身的日志
+export OTEL_SERVICE_NAME="myproject"
+export OTEL_TRACES_EXPORTER="otlp" # 使用 otlp 协议发送 trace 数据
+export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" # OpenTelemetry Collector 的地址
+
+java -jar myproject-0.0.1-SNAPSHOT.jar
+```
+
+5. 访问 Spring Boot 示例应用产生 Trace 数据
+
+`curl localhost:8080` 会触发 `hello` 服务调用,OpenTelemetry Java Agent 会自动生成 Trace
数据,然后发送给 OpenTelemetry Collector,Collector 再通过配置的 Doris Exporter 将 Trace 数据写入
Doris 的表中(默认是 `otel.otel_traces`)。
+
+## 3. Trace 查询
+
+Trace 查询通常使用可视化的查询界面,比如 Grafana。
+
+- 通过时间段和服务名筛选,展示 Trace 概览,包括延迟分布图和一些 Trace 明细
+
+ 
+
+- 点击链接可以查看 Trace detail
+
+ 
+
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/observability/trace.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/observability/trace.md
new file mode 100644
index 00000000000..7394af936bc
--- /dev/null
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/observability/trace.md
@@ -0,0 +1,234 @@
+---
+{
+ "title": "Trace",
+ "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Trace
+
+本文介绍可观测性核心数据之一 Trace
的存储分析实践,可观测性整体方案介绍请参考[概述](./overview.mdx),资源评估、集群部署和优化可以参考 [Log](./log.md)。
+
+
+## 1. 建表
+
+Trace 数据的写入和查询模式有明显的特征,在建表时进行针对性的配置会有更好的性能表现。参考下面的关键说明创建表:
+
+**分区和排序**
+- 分区使用时间字段上的 RANGE 分区,开启动态 Partition 按天自动管理分区
+- 使用 service_name 和 DATETIME 类型的时间字段作为 Key,在查询指定 service 一段时间的 Trace 时有数倍加速
+
+**分桶**
+- 分桶个数大致是集群磁盘总数的 3 倍
+- 分桶策略使用 RANDOM,配合写入时的 single tablet 导入可以提升写入 batch 效果
+
+**compaction**
+- 使用 time_series compaction 策略减少写放大,对于高吞吐 Trace 写入的资源优化很重要
+
+**VARIANT 数据类型**
+- 对于 Trace 扩展字段比如 span_attributes 和 resource_attributes 使用半结构化数据类型 VARIANT,自动将
JSON 数据拆分成子列存储,提升压缩率降低存储空间,提升过滤和分析子列的性能
+
+**索引**
+- 对经常查询的字段建索引
+- 需要全文检索的字段指定分词器 parser 参数,unicode 分词一般能满足绝大多数需求,开启 support_phrase
选项以支持短语查询,如果不需要可以设置为 false 降低存储空间
+
+**存储**
+- 热存数据,如果使用云盘可以配置 1 副本,如果使用物理盘至少配置 2 副本
+- 使用冷热分离配置 log_s3 对象存储和 log_policy_3day 超过 3 天转存 s3 策略
+
+```sql
+CREATE DATABASE log_db;
+USE log_db;
+
+-- 存算分离模式不需要
+CREATE RESOURCE "log_s3"
+PROPERTIES
+(
+ "type" = "s3",
+ "s3.endpoint" = "your_endpoint_url",
+ "s3.region" = "your_region",
+ "s3.bucket" = "your_bucket",
+ "s3.root.path" = "your_path",
+ "s3.access_key" = "your_ak",
+ "s3.secret_key" = "your_sk"
+);
+
+-- 存算分离模式不需要
+CREATE STORAGE POLICY log_policy_3day
+PROPERTIES(
+ "storage_resource" = "log_s3",
+ "cooldown_ttl" = "259200"
+);
+
+CREATE TABLE trace_table
+(
+ service_name VARCHAR(200),
+ timestamp DATETIME(6),
+ service_instance_id VARCHAR(200),
+ trace_id VARCHAR(200),
+ span_id STRING,
+ trace_state STRING,
+ parent_span_id STRING,
+ span_name STRING,
+ span_kind STRING,
+ end_time DATETIME(6),
+ duration BIGINT,
+ span_attributes VARIANT,
+ events ARRAY<STRUCT<timestamp:DATETIME(6), name:STRING,
attributes:MAP<STRING, STRING>>>,
+ links ARRAY<STRUCT<trace_id:STRING, span_id:STRING,
trace_state:STRING, attributes:MAP<STRING, STRING>>>,
+ status_message STRING,
+ status_code STRING,
+ resource_attributes VARIANT,
+ scope_name STRING,
+ scope_version STRING,
+ INDEX idx_timestamp(timestamp) USING INVERTED,
+ INDEX idx_service_instance_id(service_instance_id) USING INVERTED,
+ INDEX idx_trace_id(trace_id) USING INVERTED,
+ INDEX idx_span_id(span_id) USING INVERTED,
+ INDEX idx_trace_state(trace_state) USING INVERTED,
+ INDEX idx_parent_span_id(parent_span_id) USING INVERTED,
+ INDEX idx_span_name(span_name) USING INVERTED,
+ INDEX idx_span_kind(span_kind) USING INVERTED,
+ INDEX idx_end_time(end_time) USING INVERTED,
+ INDEX idx_duration(duration) USING INVERTED,
+ INDEX idx_span_attributes(span_attributes) USING INVERTED,
+ INDEX idx_status_message(status_message) USING INVERTED,
+ INDEX idx_status_code(status_code) USING INVERTED,
+ INDEX idx_resource_attributes(resource_attributes) USING INVERTED,
+ INDEX idx_scope_name(scope_name) USING INVERTED,
+ INDEX idx_scope_version(scope_version) USING INVERTED
+)
+ENGINE = OLAP
+DUPLICATE KEY(service_name, timestamp)
+PARTITION BY RANGE(timestamp) ()
+DISTRIBUTED BY RANDOM BUCKETS 250
+PROPERTIES (
+"compression" = "zstd",
+"compaction_policy" = "time_series",
+"inverted_index_storage_format" = "V2",
+"dynamic_partition.enable" = "true",
+"dynamic_partition.create_history_partition" = "true",
+"dynamic_partition.time_unit" = "DAY",
+"dynamic_partition.start" = "-30",
+"dynamic_partition.end" = "1",
+"dynamic_partition.prefix" = "p",
+"dynamic_partition.buckets" = "250",
+"dynamic_partition.replication_num" = "2", -- Not required for compute-storage separation
+"replication_num" = "2", -- Not required for compute-storage separation
+"storage_policy" = "log_policy_3day" -- Not required for compute-storage separation
+);
+```
+
+## 2. Trace Collection
+
+Doris provides open and general-purpose Stream HTTP APIs that can integrate with Trace collection systems such as OpenTelemetry.
+
+### OpenTelemetry Integration
+
+1. Integrate the OpenTelemetry SDK on the application side
+
+Here we use a Spring Boot example application integrated with the OpenTelemetry Java SDK. The example application comes from the official [demo](https://docs.spring.io/spring-boot/tutorial/first-application/index.html) and returns a simple "Hello World!" string for requests to the path "/".
+Download the [OpenTelemetry Java Agent](https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases). The advantage of using the Java Agent is that no modifications are needed to the existing application. For other languages and integration methods, see the OpenTelemetry official website: [Language APIs & SDKs](https://opentelemetry.io/docs/languages/) or [Zero-code Instrumentation](https://opentelemetry.io/docs/zero-code/).
+
+2. Deploy and configure the OpenTelemetry Collector
+
+Download and extract the [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector-releases/releases). You need the package prefixed with "otelcol-contrib", which includes the Doris Exporter component that loads trace data into Doris.
+
+Create the `otel_demo.yaml` configuration file as follows. For more options, refer to the Doris Exporter [documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/dorisexporter).
+
+```yaml
+receivers:
+ otlp: # OTLP protocol, receiving data sent by the OpenTelemetry Java Agent
+ protocols:
+ grpc:
+ endpoint: 0.0.0.0:4317
+ http:
+ endpoint: 0.0.0.0:4318
+
+processors:
+ batch:
+ send_batch_size: 100000 # Number of records per batch; recommended batch data volume between 100 MB and 1 GB
+ timeout: 10s
+
+exporters:
+ doris:
+ endpoint: http://localhost:8030 # FE HTTP address
+ database: doris_db_name
+ username: doris_username
+ password: doris_password
+ table:
+ traces: doris_table_name
+ create_schema: true # Whether to auto-create the schema; if set to false, the table must be created manually
+ mysql_endpoint: localhost:9030 # FE MySQL address
+ history_days: 10
+ create_history_days: 10
+ timezone: Asia/Shanghai
+ timeout: 60s # HTTP Stream Load client timeout
+ log_response: true
+ sending_queue:
+ enabled: true
+ num_consumers: 20
+ queue_size: 1000
+ retry_on_failure:
+ enabled: true
+ initial_interval: 5s
+ max_interval: 30s
+ headers:
+ load_to_single_tablet: "true"
+```
+
+3. Run the OpenTelemetry Collector
+
+ ```Bash
+ ./otelcol-contrib --config otel_demo.yaml
+ ```
+
+4. Start the Spring Boot example application
+
+Before starting the application, simply add a few environment variables without modifying any code.
+
+```Bash
+export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS} -javaagent:/your/path/to/opentelemetry-javaagent.jar" # Path to the OpenTelemetry Java Agent
+export OTEL_JAVAAGENT_LOGGING="none" # Disable the OTel agent's own logging to avoid interfering with the application's logs
+export OTEL_SERVICE_NAME="myproject"
+export OTEL_TRACES_EXPORTER="otlp" # Send trace data using the OTLP protocol
+export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" # Address of the OpenTelemetry Collector
+
+java -jar myproject-0.0.1-SNAPSHOT.jar
+```
+
+5. Access the Spring Boot example application to generate Trace data
+
+Running `curl localhost:8080` triggers a call to the `hello` service. The OpenTelemetry Java Agent automatically generates Trace data and sends it to the OpenTelemetry Collector, which then writes the Trace data into the Doris table (default `otel.otel_traces`) via the configured Doris Exporter.
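+
+To confirm the end-to-end pipeline, a quick sanity check can be run from a MySQL client. This is a hypothetical example assuming the default `otel.otel_traces` table and the `myproject` service name used above; adjust names to your configuration.
+
+```sql
+-- Count spans ingested for the example service in the last 10 minutes
+SELECT count(*) AS span_count
+FROM otel.otel_traces
+WHERE service_name = 'myproject'
+  AND timestamp >= NOW() - INTERVAL 10 MINUTE;
+```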
+
+## 3. Trace Querying
+
+Trace queries are typically run through a visual interface such as Grafana.
+
+- Filter by time range and service name to display a Trace overview, including a latency distribution chart and detailed individual Traces
+
+ 
+
+- Click a link to view the Trace detail
+
+ 
+
diff --git a/sidebars.json b/sidebars.json
index 97fd42624e4..880476e5c4b 100644
--- a/sidebars.json
+++ b/sidebars.json
@@ -482,7 +482,8 @@
"label": "Observability",
"items": [
"observability/overview",
- "observability/log"
+ "observability/log",
+ "observability/trace"
]
},
{
diff --git a/static/images/observability/trace-detail.png
b/static/images/observability/trace-detail.png
new file mode 100644
index 00000000000..674b22e2f3d
Binary files /dev/null and b/static/images/observability/trace-detail.png differ
diff --git a/static/images/observability/trace-list.png
b/static/images/observability/trace-list.png
new file mode 100644
index 00000000000..026de66d8a8
Binary files /dev/null and b/static/images/observability/trace-list.png differ
diff --git a/versioned_docs/version-2.1/observability/trace.md
b/versioned_docs/version-2.1/observability/trace.md
new file mode 100644
index 00000000000..62f243ec9a9
--- /dev/null
+++ b/versioned_docs/version-2.1/observability/trace.md
@@ -0,0 +1,233 @@
+---
+{
+ "title": "Trace",
+ "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Trace
+
+This article introduces storage and analysis practices for Trace data, one of the core types of observability data. For an overview of the complete observability solution, please refer to [Overview](./overview.mdx). For resource evaluation, cluster deployment, and optimization, please refer to [Log](./log.md).
+
+## 1. Table Creation
+
+Trace data has distinct characteristics in terms of writing and querying
patterns. Targeted configurations during table creation can significantly
improve performance. Create your table based on the key guidelines below:
+
+**Partitioning and Sorting**
+- Use RANGE partitioning on the time field, and enable dynamic partitioning to manage partitions automatically by day.
+- Use `service_name` and a DATETIME time field as keys; this can speed up queries for a specific service over a given time period by several times.
+
+**Bucketing**
+- The number of buckets should be approximately three times the total number
of disks in the cluster.
+- Use the RANDOM bucketing strategy. Combined with single-tablet ingestion
during writes, it improves batch write efficiency.
+
+**Compaction**
+- Use the time_series compaction strategy to reduce write amplification, which
is crucial for optimizing resources under high-throughput ingestion.
+
+**VARIANT Data Type**
+- Use the semi-structured VARIANT data type for extended Trace fields like
`span_attributes` and `resource_attributes`. This automatically splits JSON
data into sub-columns for storage, improving compression rates and reducing
storage space while also enhancing filtering and sub-column analysis
performance.
+
+**Indexing**
+- Build indexes on frequently queried fields.
+- For fields requiring full-text search, specify the parser parameter. Unicode
tokenization generally meets most needs. Enable the `support_phrase` option to
support phrase queries. If not needed, set it to false to reduce storage usage.
+
+**Storage**
+- For hot data, configure 1 replica if using cloud disks or at least 2
replicas if using physical disks.
+- Use hot-cold tiered storage configuration with `log_s3` object storage and
`log_policy_3day` policy to move data older than 3 days to S3.
+
+```sql
+CREATE DATABASE log_db;
+USE log_db;
+
+-- Not required for compute-storage separation mode
+CREATE RESOURCE "log_s3"
+PROPERTIES
+(
+ "type" = "s3",
+ "s3.endpoint" = "your_endpoint_url",
+ "s3.region" = "your_region",
+ "s3.bucket" = "your_bucket",
+ "s3.root.path" = "your_path",
+ "s3.access_key" = "your_ak",
+ "s3.secret_key" = "your_sk"
+);
+
+-- Not required for compute-storage separation mode
+CREATE STORAGE POLICY log_policy_3day
+PROPERTIES(
+ "storage_resource" = "log_s3",
+ "cooldown_ttl" = "259200"
+);
+
+CREATE TABLE trace_table
+(
+ service_name VARCHAR(200),
+ timestamp DATETIME(6),
+ service_instance_id VARCHAR(200),
+ trace_id VARCHAR(200),
+ span_id STRING,
+ trace_state STRING,
+ parent_span_id STRING,
+ span_name STRING,
+ span_kind STRING,
+ end_time DATETIME(6),
+ duration BIGINT,
+ span_attributes VARIANT,
+ events ARRAY<STRUCT<timestamp:DATETIME(6), name:STRING,
attributes:MAP<STRING, STRING>>>,
+ links ARRAY<STRUCT<trace_id:STRING, span_id:STRING,
trace_state:STRING, attributes:MAP<STRING, STRING>>>,
+ status_message STRING,
+ status_code STRING,
+ resource_attributes VARIANT,
+ scope_name STRING,
+ scope_version STRING,
+ INDEX idx_timestamp(timestamp) USING INVERTED,
+ INDEX idx_service_instance_id(service_instance_id) USING INVERTED,
+ INDEX idx_trace_id(trace_id) USING INVERTED,
+ INDEX idx_span_id(span_id) USING INVERTED,
+ INDEX idx_trace_state(trace_state) USING INVERTED,
+ INDEX idx_parent_span_id(parent_span_id) USING INVERTED,
+ INDEX idx_span_name(span_name) USING INVERTED,
+ INDEX idx_span_kind(span_kind) USING INVERTED,
+ INDEX idx_end_time(end_time) USING INVERTED,
+ INDEX idx_duration(duration) USING INVERTED,
+ INDEX idx_span_attributes(span_attributes) USING INVERTED,
+ INDEX idx_status_message(status_message) USING INVERTED,
+ INDEX idx_status_code(status_code) USING INVERTED,
+ INDEX idx_resource_attributes(resource_attributes) USING INVERTED,
+ INDEX idx_scope_name(scope_name) USING INVERTED,
+ INDEX idx_scope_version(scope_version) USING INVERTED
+)
+ENGINE = OLAP
+DUPLICATE KEY(service_name, timestamp)
+PARTITION BY RANGE(timestamp) ()
+DISTRIBUTED BY RANDOM BUCKETS 250
+PROPERTIES (
+"compression" = "zstd",
+"compaction_policy" = "time_series",
+"inverted_index_storage_format" = "V2",
+"dynamic_partition.enable" = "true",
+"dynamic_partition.create_history_partition" = "true",
+"dynamic_partition.time_unit" = "DAY",
+"dynamic_partition.start" = "-30",
+"dynamic_partition.end" = "1",
+"dynamic_partition.prefix" = "p",
+"dynamic_partition.buckets" = "250",
+"dynamic_partition.replication_num" = "2", -- Not required for compute-storage
separation
+"replication_num" = "2", -- Not required for compute-storage separation
+"storage_policy" = "log_policy_3day" -- Not required for compute-storage
separation
+);
+```
+
+## 2. Trace Collection
+
+Doris provides open and general-purpose Stream HTTP APIs that can integrate
with Trace collection systems like OpenTelemetry.
+
+### OpenTelemetry Integration
+
+1. **Application-side Integration with OpenTelemetry SDK**
+
+Here we use a Spring Boot example application integrated with the
OpenTelemetry Java SDK. The example application comes from the official
[demo](https://docs.spring.io/spring-boot/tutorial/first-application/index.html),
which returns a simple "Hello World!" string for requests to the path "/".
+Download the [OpenTelemetry Java
Agent](https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases).
The advantage of using the Java Agent is that no modifications are needed to
the existing application. For other languages and integration methods, see the
OpenTelemetry official website [Language APIs &
SDKs](https://opentelemetry.io/docs/languages/) or [Zero-code
Instrumentation](https://opentelemetry.io/docs/zero-code/).
+
+2. **Deploy and Configure OpenTelemetry Collector**
+
+Download and extract [OpenTelemetry
Collector](https://github.com/open-telemetry/opentelemetry-collector-releases/releases).
You need to download the package starting with "otelcol-contrib", which
includes the Doris Exporter.
+
+Create the `otel_demo.yaml` configuration file as follows. For more details,
refer to the Doris Exporter
[documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/dorisexporter).
+
+```yaml
+receivers:
+ otlp: # OTLP protocol, receiving data sent by the OpenTelemetry Java Agent
+ protocols:
+ grpc:
+ endpoint: 0.0.0.0:4317
+ http:
+ endpoint: 0.0.0.0:4318
+
+processors:
+ batch:
+ send_batch_size: 100000 # Number of records per batch; recommended batch data volume between 100 MB and 1 GB
+ timeout: 10s
+
+exporters:
+ doris:
+ endpoint: http://localhost:8030 # FE HTTP address
+ database: doris_db_name
+ username: doris_username
+ password: doris_password
+ table:
+ traces: doris_table_name
+ create_schema: true # Whether to auto-create schema; manual table creation
is needed if set to false
+ mysql_endpoint: localhost:9030 # FE MySQL address
+ history_days: 10
+ create_history_days: 10
+ timezone: Asia/Shanghai
+ timeout: 60s # Timeout for HTTP stream load client
+ log_response: true
+ sending_queue:
+ enabled: true
+ num_consumers: 20
+ queue_size: 1000
+ retry_on_failure:
+ enabled: true
+ initial_interval: 5s
+ max_interval: 30s
+ headers:
+ load_to_single_tablet: "true"
+```
+
+3. **Run OpenTelemetry Collector**
+
+```bash
+./otelcol-contrib --config otel_demo.yaml
+```
+
+4. **Start the Spring Boot Example Application**
+
+Before starting the application, simply add a few environment variables
without modifying any code.
+
+```bash
+export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS}
-javaagent:/your/path/to/opentelemetry-javaagent.jar" # Path to OpenTelemetry
Java Agent
+export OTEL_JAVAAGENT_LOGGING="none" # Disable Otel logs to prevent
interference with application logs
+export OTEL_SERVICE_NAME="myproject"
+export OTEL_TRACES_EXPORTER="otlp" # Send trace data using OTLP protocol
+export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" # Address of the
OpenTelemetry Collector
+
+java -jar myproject-0.0.1-SNAPSHOT.jar
+```
+
+5. **Access the Spring Boot Example Service to Generate Trace Data**
+
+Running `curl localhost:8080` will trigger a call to the `hello` service. The
OpenTelemetry Java Agent will automatically generate Trace data and send it to
the OpenTelemetry Collector, which then writes the Trace data to the Doris
table (default is `otel.otel_traces`) via the configured Doris Exporter.
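+
+To confirm the end-to-end pipeline, a quick sanity check can be run from a MySQL client. This is a hypothetical example assuming the default `otel.otel_traces` table and the `myproject` service name used above; adjust names to your configuration.
+
+```sql
+-- Count spans ingested for the example service in the last 10 minutes
+SELECT count(*) AS span_count
+FROM otel.otel_traces
+WHERE service_name = 'myproject'
+  AND timestamp >= NOW() - INTERVAL 10 MINUTE;
+```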
+
+## 3. Trace Querying
+
+Trace querying typically uses visual query interfaces such as Grafana.
+
+- Filter by time range and service name to display Trace summaries, including
latency distribution charts and detailed individual Traces.
+
+ 
+
+- Click on the link to view the Trace detail.
+
+ 
+
diff --git a/versioned_docs/version-3.0/observability/trace.md
b/versioned_docs/version-3.0/observability/trace.md
new file mode 100644
index 00000000000..62f243ec9a9
--- /dev/null
+++ b/versioned_docs/version-3.0/observability/trace.md
@@ -0,0 +1,233 @@
+---
+{
+ "title": "Trace",
+ "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Trace
+
+This article introduces storage and analysis practices for Trace data, one of the core types of observability data. For an overview of the complete observability solution, please refer to [Overview](./overview.mdx). For resource evaluation, cluster deployment, and optimization, please refer to [Log](./log.md).
+
+## 1. Table Creation
+
+Trace data has distinct characteristics in terms of writing and querying
patterns. Targeted configurations during table creation can significantly
improve performance. Create your table based on the key guidelines below:
+
+**Partitioning and Sorting**
+- Use RANGE partitioning on the time field, and enable dynamic partitioning to manage partitions automatically by day.
+- Use `service_name` and a DATETIME time field as keys; this can speed up queries for a specific service over a given time period by several times.
+
+**Bucketing**
+- The number of buckets should be approximately three times the total number
of disks in the cluster.
+- Use the RANDOM bucketing strategy. Combined with single-tablet ingestion
during writes, it improves batch write efficiency.
+
+**Compaction**
+- Use the time_series compaction strategy to reduce write amplification, which
is crucial for optimizing resources under high-throughput ingestion.
+
+**VARIANT Data Type**
+- Use the semi-structured VARIANT data type for extended Trace fields like
`span_attributes` and `resource_attributes`. This automatically splits JSON
data into sub-columns for storage, improving compression rates and reducing
storage space while also enhancing filtering and sub-column analysis
performance.
+
+**Indexing**
+- Build indexes on frequently queried fields.
+- For fields requiring full-text search, specify the parser parameter. Unicode
tokenization generally meets most needs. Enable the `support_phrase` option to
support phrase queries. If not needed, set it to false to reduce storage usage.
+
+**Storage**
+- For hot data, configure 1 replica if using cloud disks or at least 2
replicas if using physical disks.
+- Use hot-cold tiered storage configuration with `log_s3` object storage and
`log_policy_3day` policy to move data older than 3 days to S3.
+
+```sql
+CREATE DATABASE log_db;
+USE log_db;
+
+-- Not required for compute-storage separation mode
+CREATE RESOURCE "log_s3"
+PROPERTIES
+(
+ "type" = "s3",
+ "s3.endpoint" = "your_endpoint_url",
+ "s3.region" = "your_region",
+ "s3.bucket" = "your_bucket",
+ "s3.root.path" = "your_path",
+ "s3.access_key" = "your_ak",
+ "s3.secret_key" = "your_sk"
+);
+
+-- Not required for compute-storage separation mode
+CREATE STORAGE POLICY log_policy_3day
+PROPERTIES(
+ "storage_resource" = "log_s3",
+ "cooldown_ttl" = "259200"
+);
+
+CREATE TABLE trace_table
+(
+ service_name VARCHAR(200),
+ timestamp DATETIME(6),
+ service_instance_id VARCHAR(200),
+ trace_id VARCHAR(200),
+ span_id STRING,
+ trace_state STRING,
+ parent_span_id STRING,
+ span_name STRING,
+ span_kind STRING,
+ end_time DATETIME(6),
+ duration BIGINT,
+ span_attributes VARIANT,
+ events ARRAY<STRUCT<timestamp:DATETIME(6), name:STRING,
attributes:MAP<STRING, STRING>>>,
+ links ARRAY<STRUCT<trace_id:STRING, span_id:STRING,
trace_state:STRING, attributes:MAP<STRING, STRING>>>,
+ status_message STRING,
+ status_code STRING,
+ resource_attributes VARIANT,
+ scope_name STRING,
+ scope_version STRING,
+ INDEX idx_timestamp(timestamp) USING INVERTED,
+ INDEX idx_service_instance_id(service_instance_id) USING INVERTED,
+ INDEX idx_trace_id(trace_id) USING INVERTED,
+ INDEX idx_span_id(span_id) USING INVERTED,
+ INDEX idx_trace_state(trace_state) USING INVERTED,
+ INDEX idx_parent_span_id(parent_span_id) USING INVERTED,
+ INDEX idx_span_name(span_name) USING INVERTED,
+ INDEX idx_span_kind(span_kind) USING INVERTED,
+ INDEX idx_end_time(end_time) USING INVERTED,
+ INDEX idx_duration(duration) USING INVERTED,
+ INDEX idx_span_attributes(span_attributes) USING INVERTED,
+ INDEX idx_status_message(status_message) USING INVERTED,
+ INDEX idx_status_code(status_code) USING INVERTED,
+ INDEX idx_resource_attributes(resource_attributes) USING INVERTED,
+ INDEX idx_scope_name(scope_name) USING INVERTED,
+ INDEX idx_scope_version(scope_version) USING INVERTED
+)
+ENGINE = OLAP
+DUPLICATE KEY(service_name, timestamp)
+PARTITION BY RANGE(timestamp) ()
+DISTRIBUTED BY RANDOM BUCKETS 250
+PROPERTIES (
+"compression" = "zstd",
+"compaction_policy" = "time_series",
+"inverted_index_storage_format" = "V2",
+"dynamic_partition.enable" = "true",
+"dynamic_partition.create_history_partition" = "true",
+"dynamic_partition.time_unit" = "DAY",
+"dynamic_partition.start" = "-30",
+"dynamic_partition.end" = "1",
+"dynamic_partition.prefix" = "p",
+"dynamic_partition.buckets" = "250",
+"dynamic_partition.replication_num" = "2", -- Not required for compute-storage
separation
+"replication_num" = "2", -- Not required for compute-storage separation
+"storage_policy" = "log_policy_3day" -- Not required for compute-storage
separation
+);
+```
+
+## 2. Trace Collection
+
+Doris provides open and general-purpose Stream HTTP APIs that can integrate
with Trace collection systems like OpenTelemetry.
+
+### OpenTelemetry Integration
+
+1. **Application-side Integration with OpenTelemetry SDK**
+
+Here we use a Spring Boot example application integrated with the
OpenTelemetry Java SDK. The example application comes from the official
[demo](https://docs.spring.io/spring-boot/tutorial/first-application/index.html),
which returns a simple "Hello World!" string for requests to the path "/".
+Download the [OpenTelemetry Java
Agent](https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases).
The advantage of using the Java Agent is that no modifications are needed to
the existing application. For other languages and integration methods, see the
OpenTelemetry official website [Language APIs &
SDKs](https://opentelemetry.io/docs/languages/) or [Zero-code
Instrumentation](https://opentelemetry.io/docs/zero-code/).
+
+2. **Deploy and Configure OpenTelemetry Collector**
+
+Download and extract [OpenTelemetry
Collector](https://github.com/open-telemetry/opentelemetry-collector-releases/releases).
You need to download the package starting with "otelcol-contrib", which
includes the Doris Exporter.
+
+Create the `otel_demo.yaml` configuration file as follows. For more details,
refer to the Doris Exporter
[documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/dorisexporter).
+
+```yaml
+receivers:
+ otlp: # OTLP protocol, receiving data sent by the OpenTelemetry Java Agent
+ protocols:
+ grpc:
+ endpoint: 0.0.0.0:4317
+ http:
+ endpoint: 0.0.0.0:4318
+
+processors:
+ batch:
+ send_batch_size: 100000 # Number of records per batch; recommended batch data volume between 100 MB and 1 GB
+ timeout: 10s
+
+exporters:
+ doris:
+ endpoint: http://localhost:8030 # FE HTTP address
+ database: doris_db_name
+ username: doris_username
+ password: doris_password
+ table:
+ traces: doris_table_name
+ create_schema: true # Whether to auto-create schema; manual table creation
is needed if set to false
+ mysql_endpoint: localhost:9030 # FE MySQL address
+ history_days: 10
+ create_history_days: 10
+ timezone: Asia/Shanghai
+ timeout: 60s # Timeout for HTTP stream load client
+ log_response: true
+ sending_queue:
+ enabled: true
+ num_consumers: 20
+ queue_size: 1000
+ retry_on_failure:
+ enabled: true
+ initial_interval: 5s
+ max_interval: 30s
+ headers:
+ load_to_single_tablet: "true"
+```
+
+3. **Run OpenTelemetry Collector**
+
+```bash
+./otelcol-contrib --config otel_demo.yaml
+```
+
+4. **Start the Spring Boot Example Application**
+
+Before starting the application, simply add a few environment variables
without modifying any code.
+
+```bash
+export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS}
-javaagent:/your/path/to/opentelemetry-javaagent.jar" # Path to OpenTelemetry
Java Agent
+export OTEL_JAVAAGENT_LOGGING="none" # Disable Otel logs to prevent
interference with application logs
+export OTEL_SERVICE_NAME="myproject"
+export OTEL_TRACES_EXPORTER="otlp" # Send trace data using OTLP protocol
+export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" # Address of the
OpenTelemetry Collector
+
+java -jar myproject-0.0.1-SNAPSHOT.jar
+```
+
+5. **Access the Spring Boot Example Service to Generate Trace Data**
+
+Running `curl localhost:8080` will trigger a call to the `hello` service. The
OpenTelemetry Java Agent will automatically generate Trace data and send it to
the OpenTelemetry Collector, which then writes the Trace data to the Doris
table (default is `otel.otel_traces`) via the configured Doris Exporter.
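+
+To confirm the end-to-end pipeline, a quick sanity check can be run from a MySQL client. This is a hypothetical example assuming the default `otel.otel_traces` table and the `myproject` service name used above; adjust names to your configuration.
+
+```sql
+-- Count spans ingested for the example service in the last 10 minutes
+SELECT count(*) AS span_count
+FROM otel.otel_traces
+WHERE service_name = 'myproject'
+  AND timestamp >= NOW() - INTERVAL 10 MINUTE;
+```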
+
+## 3. Trace Querying
+
+Trace querying typically uses visual query interfaces such as Grafana.
+
+- Filter by time range and service name to display Trace summaries, including
latency distribution charts and detailed individual Traces.
+
+ 
+
+- Click on the link to view the Trace detail.
+
+ 
+
diff --git a/versioned_sidebars/version-2.1-sidebars.json
b/versioned_sidebars/version-2.1-sidebars.json
index bad6e2abb98..e892c038e5c 100644
--- a/versioned_sidebars/version-2.1-sidebars.json
+++ b/versioned_sidebars/version-2.1-sidebars.json
@@ -408,7 +408,8 @@
"label": "Observability",
"items": [
"observability/overview",
- "observability/log"
+ "observability/log",
+ "observability/trace"
]
},
{
diff --git a/versioned_sidebars/version-3.0-sidebars.json
b/versioned_sidebars/version-3.0-sidebars.json
index 42e0d163b96..63adfbdfe18 100644
--- a/versioned_sidebars/version-3.0-sidebars.json
+++ b/versioned_sidebars/version-3.0-sidebars.json
@@ -439,7 +439,8 @@
"label": "Observability",
"items": [
"observability/overview",
- "observability/log"
+ "observability/log",
+ "observability/trace"
]
},
{
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]