This is an automated email from the ASF dual-hosted git repository.
eldenmoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 1a18601fc1e shrink variant workload guide picture (#3510)
1a18601fc1e is described below
commit 1a18601fc1e9d9ce2568e61a45c44afc1fbe1680
Author: lihangyu <[email protected]>
AuthorDate: Tue Mar 31 23:11:23 2026 +0800
shrink variant workload guide picture (#3510)
---
.../semi-structured/variant-workload-guide.md | 14 +++++++-------
.../semi-structured/variant-workload-guide.md | 14 +++++++-------
.../semi-structured/variant-workload-guide.md | 8 ++++----
.../semi-structured/variant-workload-guide.md | 14 +++++++-------
.../variant/variant-decision-flowchart-3x.png | Bin 369549 -> 496735 bytes
.../images/variant/variant-decision-flowchart.png | Bin 380819 -> 482301 bytes
static/images/variant/variant-default-storage.png | Bin 388356 -> 500347 bytes
.../images/variant/variant-doc-mode-readpaths.png | Bin 415834 -> 518112 bytes
static/images/variant/variant-doc-mode.png | Bin 356585 -> 446346 bytes
static/images/variant/variant-sparse-sharding.png | Bin 449432 -> 600665 bytes
static/images/variant/variant-sparse-storage.png | Bin 403645 -> 534461 bytes
.../semi-structured/variant-workload-guide.md | 8 ++++----
.../semi-structured/variant-workload-guide.md | 14 +++++++-------
13 files changed, 36 insertions(+), 36 deletions(-)
diff --git
a/docs/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
b/docs/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
index 3dcf21b6eaa..9a5857aa770 100644
---
a/docs/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
+++
b/docs/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
@@ -64,7 +64,7 @@ Before reading the storage modes below, make sure these terms
are clear. Each is
**Subcolumnization.** When data is written into a `VARIANT` column, Doris
automatically discovers JSON paths and extracts hot paths as independent
columnar subcolumns for efficient analytics.
-
+<img src="/images/variant/variant-default-storage.png" alt="Default VARIANT:
Automatic Subcolumn Extraction" width="720" />
**Schema Template.** A declaration on a `VARIANT` column that pins selected
paths to stable types. Use it for key business fields that must stay typed,
indexable, and predictable. Do not try to enumerate every possible path.
@@ -72,23 +72,23 @@ Before reading the storage modes below, make sure these
terms are clear. Each is
**Sparse columns.** When wide JSON has a clear hot/cold split, sparse columns
keep hot paths in Subcolumnization while pushing cold (long-tail) paths into
shared sparse storage. Sparse storage supports sharding across multiple
physical columns for better read parallelism.
-
+<img src="/images/variant/variant-sparse-storage.png" alt="Sparse Columns:
Hot/Cold Path Separation" width="720" />
As shown above, hot paths (such as `user_id`, `page`) stay as independent
columnar subcolumns with full analytics speed, while thousands of long-tail
paths converge into shared sparse storage. The threshold is controlled by
`variant_max_subcolumns_count`.
**Sparse sharding.** When the long-tail path count is very large, a single
sparse column can become a read bottleneck. Sparse sharding distributes
long-tail paths by hash across multiple physical columns
(`variant_sparse_hash_shard_count`), so they can be scanned in parallel.
-
+<img src="/images/variant/variant-sparse-sharding.png" alt="Sparse Sharding:
Parallel Read for Long-Tail Paths" width="720" />
**DOC mode.** Delays Subcolumnization at write time and additionally stores
the original JSON as a map-format stored field (the **doc map**). This gives
fast ingest and efficient whole-document return at the cost of extra storage.
Subcolumnization still happens later during compaction.
-
+<img src="/images/variant/variant-doc-mode.png" alt="DOC Mode: Deferred
Extraction + Fast Document Return" width="700" />
As illustrated above, during write the JSON is preserved as-is into a Doc
Store for fast ingest. Subcolumns are extracted later during compaction. At
read time, path-based queries (e.g. `SELECT v['user_id']`) read from
materialized subcolumns at full columnar speed, while whole-document queries
(`SELECT v`) read directly from the Doc Store without reconstructing from
subcolumns.
DOC mode has three distinct read paths depending on whether the queried path
has been materialized:
-
+<img src="/images/variant/variant-doc-mode-readpaths.png" alt="DOC Mode: Read
Path Details" width="720" />
- **DOC Materialized**: The queried path has already been extracted into a
subcolumn (after compaction or when `variant_doc_materialization_min_rows` is
met). Reads at full columnar speed, same as default VARIANT.
- **DOC Map**: The queried path has not been materialized yet. The query falls
back to scanning the entire doc map to find the value — significantly slower on
wide JSON.
@@ -98,7 +98,7 @@ DOC mode has three distinct read paths depending on whether
the queried path has
## Recommended Decision Path
-
+<img src="/images/variant/variant-decision-flowchart.png" alt="VARIANT Mode
Decision Path" width="520" />
## Storage Modes
@@ -248,7 +248,7 @@ Watch for:
The chart below compares single-path extraction time on a 10K-path wide-column
dataset (200K rows, extracting one key, 16 CPUs, median of 3 runs).
-
+<img src="/images/variant/variant-bench-query-time.svg" alt="Wide-Column
Single-Path Extraction: Query Time" width="720" />
| Mode | Query Time | Peak Memory |
|---|---:|---:|
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
index 4c4f844629a..8689f8d52fe 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
@@ -64,7 +64,7 @@
**子列列式提取(Subcolumnization)。** 写入 `VARIANT` 列时,Doris 会自动发现 JSON
Path,并对热点路径执行子列列式提取,使其以独立子列的形式参与分析。
-
+<img src="/images/variant/variant-default-storage.png" alt="默认 VARIANT:自动子列提取"
width="720" />
**Schema Template。** 一种在 `VARIANT`
列上的声明,用来把部分路径固定为稳定类型。它适合少量关键业务字段,让这些路径的类型、索引和行为更可控;不应试图穷举所有可能路径。
@@ -72,23 +72,23 @@
**Sparse columns(稀疏列)。** 当宽 JSON 有明显的冷热分布时,Sparse
让热点路径继续保留子列列式提取(Subcolumnization)的结果,而冷门(长尾)路径进入共享的稀疏存储。稀疏存储支持分片,将对多个物理列进行分散存储以提升读并行度。
-
+<img src="/images/variant/variant-sparse-storage.png" alt="Sparse
Columns:冷热路径分离" width="720" />
如上图所示,热点路径(如 `user_id`、`page`)继续以独立列式子列的形式保持高性能分析能力,而数千个长尾路径则汇入共享稀疏存储。阈值通过
`variant_max_subcolumns_count` 控制。
**Sparse sharding(稀疏分片)。**
当长尾路径数量非常大时,单个稀疏列可能成为读取瓶颈。稀疏分片通过哈希将长尾路径分散到多个物理列(`variant_sparse_hash_shard_count`),从而可以并行扫描。
-
+<img src="/images/variant/variant-sparse-sharding.png" alt="Sparse
Sharding:长尾路径并行读取" width="720" />
**DOC mode。** 写入时延迟子列列式提取(Subcolumnization),并额外存储一份 map 格式的原始 JSON(即 **doc
map**)。这带来了快速导入和高效整条文档返回能力,代价是额外存储。后续 Compaction 时仍会完成 Subcolumnization。
-
+<img src="/images/variant/variant-doc-mode.png" alt="DOC Mode:延迟提取 + 快速文档返回"
width="700" />
如上图所示,写入时 JSON 被原样保存到 Doc Store 以实现快速导入。子列在后续 Compaction 过程中提取。读取时,按路径查询(如
`SELECT v['user_id']`)从物化子列中以列式速度读取;而整条文档查询(`SELECT v`)则直接从 Doc Store
中读取,无需从大量子列重组文档。
DOC mode 的读取路径取决于被查询的路径是否已经物化:
-
+<img src="/images/variant/variant-doc-mode-readpaths.png" alt="DOC
Mode:读取路径详情" width="720" />
- **DOC Materialized**:被查询的路径已经提取为 subcolumn(Compaction 后或
`variant_doc_materialization_min_rows` 条件满足后)。以列式速度读取,与默认 VARIANT 一样快。
- **DOC Map**:被查询的路径尚未物化。查询回退到扫描整个 doc map 来查找值 —— 在宽 JSON 上显著变慢。
@@ -98,7 +98,7 @@ DOC mode 的读取路径取决于被查询的路径是否已经物化:
## 推荐决策路径
-
+<img src="/images/variant/variant-decision-flowchart.png" alt="VARIANT 模式决策路径"
width="520" />
## 存储模式
@@ -248,7 +248,7 @@ PROPERTIES (
下图对比了 10K 路径宽列数据集上的单路径提取耗时(200K 行,提取 key5000,16 CPU,3 次取中位数)。
-
+<img src="/images/variant/variant-bench-query-time.svg" alt="宽列单路径提取:查询耗时"
width="720" />
| 模式 | 查询耗时 | 峰值内存 |
|---|---:|---:|
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
index 75ec43472b9..82720e3eea2 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
@@ -47,7 +47,7 @@
**子列列式提取(Subcolumnization)。** 写入 `VARIANT` 列时,Doris 会自动发现 JSON
Path,并对热点路径执行子列列式提取,使其以独立子列的形式参与分析。
-
+<img src="/images/variant/variant-default-storage.png" alt="默认 VARIANT:自动子列提取"
width="720" />
**Schema Template(3.1+)。** 一种在 `VARIANT`
列上的声明,用来把部分路径固定为稳定类型。它适合少量关键业务字段,让这些路径的类型、索引和行为更可控;不应试图穷举所有可能路径。
@@ -55,13 +55,13 @@
**Sparse columns(稀疏列,3.1+)。** 当宽 JSON 有明显的冷热分布时,Sparse
让热点路径继续保留子列列式提取(Subcolumnization)的结果,而冷门(长尾)路径进入共享的稀疏存储。使用
`variant_max_subcolumns_count` 控制边界。
-
+<img src="/images/variant/variant-sparse-storage.png" alt="Sparse
Columns:冷热路径分离" width="720" />
如上图所示,热点路径(如 `user_id`、`page`)继续以独立列式子列的形式保持高性能分析能力,而数千个长尾路径则汇入共享稀疏存储。阈值通过
`variant_max_subcolumns_count` 控制。
## 推荐决策路径
-
+<img src="/images/variant/variant-decision-flowchart-3x.png" alt="VARIANT
模式决策路径 (Doris 3.x)" width="600" />
如果宽 JSON 的主访问模式是整条文档返回,Doris 3.x 的 `VARIANT` 往往不是最佳匹配,因为没有 DOC mode。不建议在超宽列上把
`SELECT variant_col` 作为主查询模式。
@@ -173,7 +173,7 @@ PROPERTIES (
下图对比了 10K 路径宽列数据集上的单路径提取耗时(200K 行,提取 key5000,16 CPU,3 次取中位数)。
-
+<img src="/images/variant/variant-bench-query-time-3x.svg" alt="宽列单路径提取:查询耗时"
width="720" />
| 模式 | 查询耗时 | 峰值内存 |
|---|---:|---:|
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
index 4c4f844629a..8689f8d52fe 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
@@ -64,7 +64,7 @@
**子列列式提取(Subcolumnization)。** 写入 `VARIANT` 列时,Doris 会自动发现 JSON
Path,并对热点路径执行子列列式提取,使其以独立子列的形式参与分析。
-
+<img src="/images/variant/variant-default-storage.png" alt="默认 VARIANT:自动子列提取"
width="720" />
**Schema Template。** 一种在 `VARIANT`
列上的声明,用来把部分路径固定为稳定类型。它适合少量关键业务字段,让这些路径的类型、索引和行为更可控;不应试图穷举所有可能路径。
@@ -72,23 +72,23 @@
**Sparse columns(稀疏列)。** 当宽 JSON 有明显的冷热分布时,Sparse
让热点路径继续保留子列列式提取(Subcolumnization)的结果,而冷门(长尾)路径进入共享的稀疏存储。稀疏存储支持分片,将对多个物理列进行分散存储以提升读并行度。
-
+<img src="/images/variant/variant-sparse-storage.png" alt="Sparse
Columns:冷热路径分离" width="720" />
如上图所示,热点路径(如 `user_id`、`page`)继续以独立列式子列的形式保持高性能分析能力,而数千个长尾路径则汇入共享稀疏存储。阈值通过
`variant_max_subcolumns_count` 控制。
**Sparse sharding(稀疏分片)。**
当长尾路径数量非常大时,单个稀疏列可能成为读取瓶颈。稀疏分片通过哈希将长尾路径分散到多个物理列(`variant_sparse_hash_shard_count`),从而可以并行扫描。
-
+<img src="/images/variant/variant-sparse-sharding.png" alt="Sparse
Sharding:长尾路径并行读取" width="720" />
**DOC mode。** 写入时延迟子列列式提取(Subcolumnization),并额外存储一份 map 格式的原始 JSON(即 **doc
map**)。这带来了快速导入和高效整条文档返回能力,代价是额外存储。后续 Compaction 时仍会完成 Subcolumnization。
-
+<img src="/images/variant/variant-doc-mode.png" alt="DOC Mode:延迟提取 + 快速文档返回"
width="700" />
如上图所示,写入时 JSON 被原样保存到 Doc Store 以实现快速导入。子列在后续 Compaction 过程中提取。读取时,按路径查询(如
`SELECT v['user_id']`)从物化子列中以列式速度读取;而整条文档查询(`SELECT v`)则直接从 Doc Store
中读取,无需从大量子列重组文档。
DOC mode 的读取路径取决于被查询的路径是否已经物化:
-
+<img src="/images/variant/variant-doc-mode-readpaths.png" alt="DOC
Mode:读取路径详情" width="720" />
- **DOC Materialized**:被查询的路径已经提取为 subcolumn(Compaction 后或
`variant_doc_materialization_min_rows` 条件满足后)。以列式速度读取,与默认 VARIANT 一样快。
- **DOC Map**:被查询的路径尚未物化。查询回退到扫描整个 doc map 来查找值 —— 在宽 JSON 上显著变慢。
@@ -98,7 +98,7 @@ DOC mode 的读取路径取决于被查询的路径是否已经物化:
## 推荐决策路径
-
+<img src="/images/variant/variant-decision-flowchart.png" alt="VARIANT 模式决策路径"
width="520" />
## 存储模式
@@ -248,7 +248,7 @@ PROPERTIES (
下图对比了 10K 路径宽列数据集上的单路径提取耗时(200K 行,提取 key5000,16 CPU,3 次取中位数)。
-
+<img src="/images/variant/variant-bench-query-time.svg" alt="宽列单路径提取:查询耗时"
width="720" />
| 模式 | 查询耗时 | 峰值内存 |
|---|---:|---:|
diff --git a/static/images/variant/variant-decision-flowchart-3x.png
b/static/images/variant/variant-decision-flowchart-3x.png
index 62fe8e7ee1f..b71dd5d46d4 100644
Binary files a/static/images/variant/variant-decision-flowchart-3x.png and
b/static/images/variant/variant-decision-flowchart-3x.png differ
diff --git a/static/images/variant/variant-decision-flowchart.png
b/static/images/variant/variant-decision-flowchart.png
index 92de25e7964..e36dfff9321 100644
Binary files a/static/images/variant/variant-decision-flowchart.png and
b/static/images/variant/variant-decision-flowchart.png differ
diff --git a/static/images/variant/variant-default-storage.png
b/static/images/variant/variant-default-storage.png
index 849627396c1..f7efeae744f 100644
Binary files a/static/images/variant/variant-default-storage.png and
b/static/images/variant/variant-default-storage.png differ
diff --git a/static/images/variant/variant-doc-mode-readpaths.png
b/static/images/variant/variant-doc-mode-readpaths.png
index 7644a3bdddf..9a1b9113d48 100644
Binary files a/static/images/variant/variant-doc-mode-readpaths.png and
b/static/images/variant/variant-doc-mode-readpaths.png differ
diff --git a/static/images/variant/variant-doc-mode.png
b/static/images/variant/variant-doc-mode.png
index 90dc313dcad..7ac33b57853 100644
Binary files a/static/images/variant/variant-doc-mode.png and
b/static/images/variant/variant-doc-mode.png differ
diff --git a/static/images/variant/variant-sparse-sharding.png
b/static/images/variant/variant-sparse-sharding.png
index dae6be1db5b..72bc7bfb568 100644
Binary files a/static/images/variant/variant-sparse-sharding.png and
b/static/images/variant/variant-sparse-sharding.png differ
diff --git a/static/images/variant/variant-sparse-storage.png
b/static/images/variant/variant-sparse-storage.png
index 5169dc1ab05..8b43c011982 100644
Binary files a/static/images/variant/variant-sparse-storage.png and
b/static/images/variant/variant-sparse-storage.png differ
diff --git
a/versioned_docs/version-3.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
b/versioned_docs/version-3.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
index 1563fc80178..cbd12b9a27e 100644
---
a/versioned_docs/version-3.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
+++
b/versioned_docs/version-3.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
@@ -47,7 +47,7 @@ Before reading the storage modes below, make sure these terms
are clear. Each is
**Subcolumnization.** When data is written into a `VARIANT` column, Doris
automatically discovers JSON paths and extracts hot paths as independent
columnar subcolumns for efficient analytics.
-
+<img src="/images/variant/variant-default-storage.png" alt="Default VARIANT:
Automatic Subcolumn Extraction" width="720" />
**Schema Template (3.1+).** A declaration on a `VARIANT` column that pins
selected paths to stable types. Use it for key business fields that must stay
typed, indexable, and predictable. Do not try to enumerate every possible path.
@@ -55,13 +55,13 @@ Before reading the storage modes below, make sure these
terms are clear. Each is
**Sparse columns (3.1+).** When wide JSON has a clear hot/cold split, sparse
columns keep hot paths in Subcolumnization while pushing cold (long-tail) paths
into shared sparse storage. Use `variant_max_subcolumns_count` to control the
boundary.
-
+<img src="/images/variant/variant-sparse-storage.png" alt="Sparse Columns:
Hot/Cold Path Separation" width="720" />
As shown above, hot paths (such as `user_id`, `page`) stay as independent
columnar subcolumns with full analytics speed, while thousands of long-tail
paths converge into shared sparse storage. The threshold is controlled by
`variant_max_subcolumns_count`.
## Recommended Decision Path
-
+<img src="/images/variant/variant-decision-flowchart-3x.png" alt="VARIANT Mode
Decision Path (Doris 3.x)" width="600" />
For wide JSON where most queries return the whole document, Doris 3.x
`VARIANT` is usually not the best fit because there is no DOC mode. Avoid
making `SELECT variant_col` the main query pattern on very wide columns.
@@ -173,7 +173,7 @@ Watch for:
The chart below compares single-path extraction time on a 10K-path wide-column
dataset (200K rows, extracting one key, 16 CPUs, median of 3 runs).
-
+<img src="/images/variant/variant-bench-query-time-3x.svg" alt="Wide-Column
Single-Path Extraction: Query Time" width="720" />
| Mode | Query Time | Peak Memory |
|---|---:|---:|
diff --git
a/versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
b/versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
index 3dcf21b6eaa..9a5857aa770 100644
---
a/versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
+++
b/versioned_docs/version-4.x/sql-manual/basic-element/sql-data-types/semi-structured/variant-workload-guide.md
@@ -64,7 +64,7 @@ Before reading the storage modes below, make sure these terms
are clear. Each is
**Subcolumnization.** When data is written into a `VARIANT` column, Doris
automatically discovers JSON paths and extracts hot paths as independent
columnar subcolumns for efficient analytics.
-
+<img src="/images/variant/variant-default-storage.png" alt="Default VARIANT:
Automatic Subcolumn Extraction" width="720" />
**Schema Template.** A declaration on a `VARIANT` column that pins selected
paths to stable types. Use it for key business fields that must stay typed,
indexable, and predictable. Do not try to enumerate every possible path.
@@ -72,23 +72,23 @@ Before reading the storage modes below, make sure these
terms are clear. Each is
**Sparse columns.** When wide JSON has a clear hot/cold split, sparse columns
keep hot paths in Subcolumnization while pushing cold (long-tail) paths into
shared sparse storage. Sparse storage supports sharding across multiple
physical columns for better read parallelism.
-
+<img src="/images/variant/variant-sparse-storage.png" alt="Sparse Columns:
Hot/Cold Path Separation" width="720" />
As shown above, hot paths (such as `user_id`, `page`) stay as independent
columnar subcolumns with full analytics speed, while thousands of long-tail
paths converge into shared sparse storage. The threshold is controlled by
`variant_max_subcolumns_count`.
**Sparse sharding.** When the long-tail path count is very large, a single
sparse column can become a read bottleneck. Sparse sharding distributes
long-tail paths by hash across multiple physical columns
(`variant_sparse_hash_shard_count`), so they can be scanned in parallel.
-
+<img src="/images/variant/variant-sparse-sharding.png" alt="Sparse Sharding:
Parallel Read for Long-Tail Paths" width="720" />
**DOC mode.** Delays Subcolumnization at write time and additionally stores
the original JSON as a map-format stored field (the **doc map**). This gives
fast ingest and efficient whole-document return at the cost of extra storage.
Subcolumnization still happens later during compaction.
-
+<img src="/images/variant/variant-doc-mode.png" alt="DOC Mode: Deferred
Extraction + Fast Document Return" width="700" />
As illustrated above, during write the JSON is preserved as-is into a Doc
Store for fast ingest. Subcolumns are extracted later during compaction. At
read time, path-based queries (e.g. `SELECT v['user_id']`) read from
materialized subcolumns at full columnar speed, while whole-document queries
(`SELECT v`) read directly from the Doc Store without reconstructing from
subcolumns.
DOC mode has three distinct read paths depending on whether the queried path
has been materialized:
-
+<img src="/images/variant/variant-doc-mode-readpaths.png" alt="DOC Mode: Read
Path Details" width="720" />
- **DOC Materialized**: The queried path has already been extracted into a
subcolumn (after compaction or when `variant_doc_materialization_min_rows` is
met). Reads at full columnar speed, same as default VARIANT.
- **DOC Map**: The queried path has not been materialized yet. The query falls
back to scanning the entire doc map to find the value — significantly slower on
wide JSON.
@@ -98,7 +98,7 @@ DOC mode has three distinct read paths depending on whether
the queried path has
## Recommended Decision Path
-
+<img src="/images/variant/variant-decision-flowchart.png" alt="VARIANT Mode
Decision Path" width="520" />
## Storage Modes
@@ -248,7 +248,7 @@ Watch for:
The chart below compares single-path extraction time on a 10K-path wide-column
dataset (200K rows, extracting one key, 16 CPUs, median of 3 runs).
-
+<img src="/images/variant/variant-bench-query-time.svg" alt="Wide-Column
Single-Path Extraction: Query Time" width="720" />
| Mode | Query Time | Peak Memory |
|---|---:|---:|
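The "Sparse columns" and "Sparse sharding" paragraphs touched by this diff describe a hot/cold split followed by hash distribution of long-tail paths. A minimal sketch of that idea follows, for illustration only. It assumes that "hot" means "most frequently occurring", and it uses `zlib.crc32` as a stand-in hash. Neither assumption is confirmed by the guide as Doris's actual selection rule or hash function. The helper names `split_hot_cold` and `shard_cold_paths` are hypothetical. The parameters `max_subcolumns` and `shard_count` only mirror the roles of `variant_max_subcolumns_count` and `variant_sparse_hash_shard_count`.

```python
from collections import Counter
import zlib


def split_hot_cold(rows, max_subcolumns):
    """Return (hot_paths, cold_paths).

    Assumption: the most frequent paths, up to max_subcolumns, stay hot.
    """
    counts = Counter(path for row in rows for path in row)
    # Sort by descending frequency, then by name, so ties are deterministic.
    ranked = sorted(counts, key=lambda p: (-counts[p], p))
    return ranked[:max_subcolumns], ranked[max_subcolumns:]


def shard_cold_paths(cold_paths, shard_count):
    """Assign each cold path to one of shard_count buckets by a stable hash."""
    shards = [[] for _ in range(shard_count)]
    for path in cold_paths:
        # crc32 is a stand-in hash, chosen only because it is stable across runs.
        shards[zlib.crc32(path.encode("utf-8")) % shard_count].append(path)
    return shards


# Hypothetical flat JSON rows: two hot paths and three long-tail paths.
rows = [
    {"user_id": 1, "page": "/home", "exp_a": True},
    {"user_id": 2, "page": "/cart", "exp_b": 3},
    {"user_id": 3, "page": "/home", "exp_c": "x"},
]
hot, cold = split_hot_cold(rows, max_subcolumns=2)
shards = shard_cold_paths(cold, shard_count=2)
print(hot)     # paths that would stay as independent subcolumns
print(shards)  # long-tail paths grouped per sparse shard
```

In this sketch, each shard can be scanned independently, which is the read-parallelism benefit the guide attributes to sparse sharding.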