This is an automated email from the ASF dual-hosted git repository.
eldenmoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 15ab1ce13c modify github events sample download link, and add some FQA (#1110)
15ab1ce13c is described below
commit 15ab1ce13cd17ca8395b560272f6e2aa8f4ed08f
Author: lihangyu <[email protected]>
AuthorDate: Wed Sep 18 09:36:37 2024 +0800
modify github events sample download link, and add some FQA (#1110)
# Versions
- [x] dev
- [x] 3.0
- [x] 2.1
- [ ] 2.0
# Languages
- [x] Chinese
- [x] English
---
blog/variant-in-apache-doris-2.1.md | 5 +++--
docs/sql-manual/sql-data-types/semi-structured/VARIANT.md | 9 +++++++--
.../sql-manual/sql-data-types/semi-structured/VARIANT.md | 10 ++++++++--
.../sql-manual/sql-data-types/semi-structured/VARIANT.md | 9 +++++++--
.../sql-manual/sql-data-types/semi-structured/VARIANT.md | 7 ++++++-
.../sql-manual/sql-data-types/semi-structured/VARIANT.md | 9 +++++++--
.../sql-manual/sql-data-types/semi-structured/VARIANT.md | 9 +++++++--
.../sql-manual/sql-data-types/semi-structured/VARIANT.md | 9 +++++++--
.../sql-manual/sql-data-types/semi-structured/VARIANT.md | 9 +++++++--
9 files changed, 59 insertions(+), 17 deletions(-)
diff --git a/blog/variant-in-apache-doris-2.1.md b/blog/variant-in-apache-doris-2.1.md
index efe2fce906..3056e69f9f 100644
--- a/blog/variant-in-apache-doris-2.1.md
+++ b/blog/variant-in-apache-doris-2.1.md
@@ -157,7 +157,7 @@ properties("replication_num" = "1");
Load the `gh_2022-11-07-3.json` file, which is Github Events records of an hour. One formatted row of it looks like this:
```JSON
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -286,9 +286,10 @@ mysql> SELECT
2. Count the number of events containing the keyword `doris`.
```sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
- -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+ -> WHERE payload['comment']['body'] MATCH 'doris';
+---------+
| count() |
+---------+
diff --git a/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md b/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md
index 0b25e2bf47..83280efd71 100644
--- a/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/docs/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
Importing gh_2022-11-07-3.json, which contains one hour's worth of GitHub events data.
``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -300,9 +300,10 @@ mysql> SELECT
2. Retrieve the count of comments containing "doris".
``` sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
- -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+ -> WHERE payload['comment']['body'] MATCH 'doris';
+---------+
| count() |
+---------+
@@ -360,6 +361,10 @@ When the above types cannot be compatible, they will be transformed into JSON ty
- Not supported as primary or sort keys.
- Queries with filters or aggregations require casting. The storage layer eliminates cast operations based on storage type and the target type of the cast, speeding up queries.
+### FAQ
+1.Streamload Error: [CANCELLED][INTERNAL_ERROR] tablet error: [DATA_QUALITY_ERROR] Reached max column size limit 2048.
+Due to compaction and metadata storage limitations, the VARIANT type imposes a limit on the number of columns, with the default being 2048 columns. You can adjust the BE configuration `variant_max_merged_tablet_schema_size` accordingly, but it is not recommended to exceed 4096 columns.
+
### Keywords
VARIANT
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-data-types/semi-structured/VARIANT.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-data-types/semi-structured/VARIANT.md
index c3996a9f1b..3213e74dd5 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
导入 gh_2022-11-07-3.json,这是 github events 一个小时的数据
``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -304,9 +304,10 @@ mysql> SELECT
2. 获取评论中包含 doris 的数量
``` sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
- -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+ -> WHERE payload['comment']['body'] MATCH 'doris';
+---------+
| count() |
+---------+
@@ -364,6 +365,11 @@ VARIANT 动态列与预定义静态列几乎一样高效。处理诸如日志之
- 不支持作为主键或者排序键
- 查询过滤、聚合需要带 cast,存储层会根据存储类型和 cast 目标类型来消除 cast 操作,加速查询。
+### FAQ
+1. Stream Load 报错: [CANCELLED][INTERNAL_ERROR]tablet error: [DATA_QUALITY_ERROR]Reached max column size limit 2048。
+由于 Compaction 和元信息存储限制, VARIANT 类型会限制列数,默认 2048 列,可以适当调整 BE 配置 `variant_max_merged_tablet_schema_size` , 但是不建议超过 4096
+
+
### Keywords
VARIANT
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
index c3996a9f1b..f6a6260c58 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
导入 gh_2022-11-07-3.json,这是 github events 一个小时的数据
``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -304,9 +304,10 @@ mysql> SELECT
2. 获取评论中包含 doris 的数量
``` sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
- -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+ -> WHERE payload['comment']['body'] MATCH 'doris';
+---------+
| count() |
+---------+
@@ -364,6 +365,10 @@ VARIANT 动态列与预定义静态列几乎一样高效。处理诸如日志之
- 不支持作为主键或者排序键
- 查询过滤、聚合需要带 cast,存储层会根据存储类型和 cast 目标类型来消除 cast 操作,加速查询。
+### FAQ
+1. Stream Load 报错: [CANCELLED][INTERNAL_ERROR]tablet error: [DATA_QUALITY_ERROR]Reached max column size limit 2048。
+由于 Compaction 和元信息存储限制, VARIANT 类型会限制列数,默认 2048 列,可以适当调整 BE 配置 `variant_max_merged_tablet_schema_size` , 但是不建议超过 4096
+
### Keywords
VARIANT
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
index c3996a9f1b..cfc687415b 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
导入 gh_2022-11-07-3.json,这是 github events 一个小时的数据
``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -304,6 +304,7 @@ mysql> SELECT
2. 获取评论中包含 doris 的数量
``` sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
-> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
@@ -364,6 +365,10 @@ VARIANT 动态列与预定义静态列几乎一样高效。处理诸如日志之
- 不支持作为主键或者排序键
- 查询过滤、聚合需要带 cast,存储层会根据存储类型和 cast 目标类型来消除 cast 操作,加速查询。
+### FAQ
+1. Stream Load 报错: [CANCELLED][INTERNAL_ERROR]tablet error: [DATA_QUALITY_ERROR]Reached max column size limit 2048。
+由于 Compaction 和元信息存储限制, VARIANT 类型会限制列数,默认 2048 列,可以适当调整 BE 配置 `variant_max_merged_tablet_schema_size` , 但是不建议超过 4096
+
### Keywords
VARIANT
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
index c3996a9f1b..f6a6260c58 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
导入 gh_2022-11-07-3.json,这是 github events 一个小时的数据
``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -304,9 +304,10 @@ mysql> SELECT
2. 获取评论中包含 doris 的数量
``` sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
- -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+ -> WHERE payload['comment']['body'] MATCH 'doris';
+---------+
| count() |
+---------+
@@ -364,6 +365,10 @@ VARIANT 动态列与预定义静态列几乎一样高效。处理诸如日志之
- 不支持作为主键或者排序键
- 查询过滤、聚合需要带 cast,存储层会根据存储类型和 cast 目标类型来消除 cast 操作,加速查询。
+### FAQ
+1. Stream Load 报错: [CANCELLED][INTERNAL_ERROR]tablet error: [DATA_QUALITY_ERROR]Reached max column size limit 2048。
+由于 Compaction 和元信息存储限制, VARIANT 类型会限制列数,默认 2048 列,可以适当调整 BE 配置 `variant_max_merged_tablet_schema_size` , 但是不建议超过 4096
+
### Keywords
VARIANT
diff --git a/versioned_docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md b/versioned_docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
index 0b25e2bf47..83280efd71 100644
--- a/versioned_docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/versioned_docs/version-2.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
Importing gh_2022-11-07-3.json, which contains one hour's worth of GitHub events data.
``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -300,9 +300,10 @@ mysql> SELECT
2. Retrieve the count of comments containing "doris".
``` sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
- -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+ -> WHERE payload['comment']['body'] MATCH 'doris';
+---------+
| count() |
+---------+
@@ -360,6 +361,10 @@ When the above types cannot be compatible, they will be transformed into JSON ty
- Not supported as primary or sort keys.
- Queries with filters or aggregations require casting. The storage layer eliminates cast operations based on storage type and the target type of the cast, speeding up queries.
+### FAQ
+1.Streamload Error: [CANCELLED][INTERNAL_ERROR] tablet error: [DATA_QUALITY_ERROR] Reached max column size limit 2048.
+Due to compaction and metadata storage limitations, the VARIANT type imposes a limit on the number of columns, with the default being 2048 columns. You can adjust the BE configuration `variant_max_merged_tablet_schema_size` accordingly, but it is not recommended to exceed 4096 columns.
+
### Keywords
VARIANT
diff --git a/versioned_docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md b/versioned_docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
index 0b25e2bf47..83280efd71 100644
--- a/versioned_docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/versioned_docs/version-2.1/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
Importing gh_2022-11-07-3.json, which contains one hour's worth of GitHub events data.
``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -300,9 +300,10 @@ mysql> SELECT
2. Retrieve the count of comments containing "doris".
``` sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
- -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+ -> WHERE payload['comment']['body'] MATCH 'doris';
+---------+
| count() |
+---------+
@@ -360,6 +361,10 @@ When the above types cannot be compatible, they will be transformed into JSON ty
- Not supported as primary or sort keys.
- Queries with filters or aggregations require casting. The storage layer eliminates cast operations based on storage type and the target type of the cast, speeding up queries.
+### FAQ
+1.Streamload Error: [CANCELLED][INTERNAL_ERROR] tablet error: [DATA_QUALITY_ERROR] Reached max column size limit 2048.
+Due to compaction and metadata storage limitations, the VARIANT type imposes a limit on the number of columns, with the default being 2048 columns. You can adjust the BE configuration `variant_max_merged_tablet_schema_size` accordingly, but it is not recommended to exceed 4096 columns.
+
### Keywords
VARIANT
diff --git a/versioned_docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md b/versioned_docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
index 0b25e2bf47..83280efd71 100644
--- a/versioned_docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
+++ b/versioned_docs/version-3.0/sql-manual/sql-data-types/semi-structured/VARIANT.md
@@ -171,7 +171,7 @@ properties("replication_num" = "1");
Importing gh_2022-11-07-3.json, which contains one hour's worth of GitHub events data.
``` shell
-wget http://doris-build-hk-1308700295.cos.ap-hongkong.myqcloud.com/regression/variant/gh_2022-11-07-3.json
+wget https://qa-build.oss-cn-beijing.aliyuncs.com/regression/variant/gh_2022-11-07-3.json
curl --location-trusted -u root: -T gh_2022-11-07-3.json -H "read_json_by_line:true" -H "format:json" http://127.0.0.1:18148/api/test_variant/github_events/_stream_load
@@ -300,9 +300,10 @@ mysql> SELECT
2. Retrieve the count of comments containing "doris".
``` sql
+-- implicit cast `payload['comment']['body']` to string type
mysql> SELECT
-> count() FROM github_events
- -> WHERE cast(payload['comment']['body'] as text) MATCH 'doris';
+ -> WHERE payload['comment']['body'] MATCH 'doris';
+---------+
| count() |
+---------+
@@ -360,6 +361,10 @@ When the above types cannot be compatible, they will be transformed into JSON ty
- Not supported as primary or sort keys.
- Queries with filters or aggregations require casting. The storage layer eliminates cast operations based on storage type and the target type of the cast, speeding up queries.
+### FAQ
+1.Streamload Error: [CANCELLED][INTERNAL_ERROR] tablet error: [DATA_QUALITY_ERROR] Reached max column size limit 2048.
+Due to compaction and metadata storage limitations, the VARIANT type imposes a limit on the number of columns, with the default being 2048 columns. You can adjust the BE configuration `variant_max_merged_tablet_schema_size` accordingly, but it is not recommended to exceed 4096 columns.
+
### Keywords
VARIANT