This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 89053bc011e [opt](iceberg) add ignore_iceberg_dangling_delete (#3191)
89053bc011e is described below
commit 89053bc011e0270b99c85dd1e2e031ba31fe5fe4
Author: Mingyu Chen (Rayner)
AuthorDate: Mon Dec 15 11:38:13 2025 +0800
[opt](iceberg) add ignore_iceberg_dangling_delete (#3191)
## Versions
- [x] dev
- [x] 4.x
- [x] 3.x
- [ ] 2.1
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
docs/lakehouse/catalogs/iceberg-catalog.mdx | 12 +++++++++++-
.../current/lakehouse/catalogs/iceberg-catalog.mdx | 10 ++++++++++
.../version-2.1/lakehouse/catalogs/iceberg-catalog.mdx | 10 ++++++++++
.../version-3.x/lakehouse/catalogs/iceberg-catalog.mdx | 10 ++++++++++
.../version-4.x/lakehouse/catalogs/iceberg-catalog.mdx | 10 ++++++++++
.../version-2.1/lakehouse/catalogs/iceberg-catalog.mdx | 12 +++++++++++-
.../version-3.x/lakehouse/catalogs/iceberg-catalog.mdx | 12 +++++++++++-
.../version-4.x/lakehouse/catalogs/iceberg-catalog.mdx | 12 +++++++++++-
8 files changed, 84 insertions(+), 4 deletions(-)
diff --git a/docs/lakehouse/catalogs/iceberg-catalog.mdx b/docs/lakehouse/catalogs/iceberg-catalog.mdx
index 90d413b0a91..80e24bd3e70 100644
--- a/docs/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/docs/lakehouse/catalogs/iceberg-catalog.mdx
@@ -1273,7 +1273,7 @@ For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether t
### View
-> Since version 3.1.0
+> Since 3.1.0
Supports querying Iceberg views. View queries work the same way as regular table queries. Please note the following:
@@ -2314,6 +2314,16 @@ You can use the following SQL to analyze the data distribution and delete file c
+--------------------+---------+-------------+---------+
```
+### Dangling Delete
+
+In some cases, after the `rewrite_data_files` action runs, references to certain Position Delete files may remain in the Snapshot metadata (Dangling Delete). If the row count information from the metadata is used directly in this situation, the result may be incorrect.
+
+Therefore, by default, if a Position Delete file is found for a `COUNT(*)` query, COUNT pushdown optimization is disabled and the files are read directly to obtain the actual `COUNT(*)` result. This approach is slower.
+
+If you can ensure that there are no Dangling Deletes, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. The variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, improving query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
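For illustration only (this snippet is not part of the commit above): a minimal sketch of how the session variable described in this section might be used. The catalog, database, and table names are hypothetical placeholders, and the variable is assumed to be settable with the standard Doris `SET` statement, like other session variables.

```sql
-- Hypothetical table: iceberg_ctl.sales_db.orders (all three names are placeholders).
-- Default behavior: the variable is false, so if Position Delete files are found,
-- COUNT(*) reads the files instead of relying on metadata row counts.
SELECT COUNT(*) FROM iceberg_ctl.sales_db.orders;

-- Only if you are sure the table has no Dangling Deletes:
SET ignore_iceberg_dangling_delete = true;
SELECT COUNT(*) FROM iceberg_ctl.sales_db.orders;

-- Restore the default for the rest of the session.
SET ignore_iceberg_dangling_delete = false;
```

If the two counts differ, the table likely has Dangling Deletes and the variable should stay at `false`.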
## Appendix
### `rewrite_data_files` File Selection Strategy
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
index cff5910a967..cc217d86ee1 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
@@ -2325,6 +2325,16 @@ EXECUTE set_current_snapshot ("ref" = "v1.0");
+--------------------+---------+-------------+---------+
```
+### Dangling Delete
+
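+In some cases, after the `rewrite_data_files` action runs, references to certain Position Delete files may remain in the Snapshot metadata (Dangling Delete). If the row count information from the metadata is used directly in this situation, the result may be [incorrect](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files).
+
+Therefore, by default, if a Position Delete file is found for a `COUNT(*)` query, COUNT pushdown optimization is disabled and the files are read directly to obtain the actual `COUNT(*)` result. This approach is slower.
+
+If you can ensure that there are no Dangling Deletes, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. The variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, improving query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
## Appendix
### `rewrite_data_files` File Selection Strategy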
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
index cff5910a967..cc217d86ee1 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
@@ -2325,6 +2325,16 @@ EXECUTE set_current_snapshot ("ref" = "v1.0");
+--------------------+---------+-------------+---------+
```
+### Dangling Delete
+
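+In some cases, after the `rewrite_data_files` action runs, references to certain Position Delete files may remain in the Snapshot metadata (Dangling Delete). If the row count information from the metadata is used directly in this situation, the result may be [incorrect](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files).
+
+Therefore, by default, if a Position Delete file is found for a `COUNT(*)` query, COUNT pushdown optimization is disabled and the files are read directly to obtain the actual `COUNT(*)` result. This approach is slower.
+
+If you can ensure that there are no Dangling Deletes, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. The variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, improving query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
## Appendix
### `rewrite_data_files` File Selection Strategy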
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
index cff5910a967..cc217d86ee1 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -2325,6 +2325,16 @@ EXECUTE set_current_snapshot ("ref" = "v1.0");
+--------------------+---------+-------------+---------+
```
+### Dangling Delete
+
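+In some cases, after the `rewrite_data_files` action runs, references to certain Position Delete files may remain in the Snapshot metadata (Dangling Delete). If the row count information from the metadata is used directly in this situation, the result may be [incorrect](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files).
+
+Therefore, by default, if a Position Delete file is found for a `COUNT(*)` query, COUNT pushdown optimization is disabled and the files are read directly to obtain the actual `COUNT(*)` result. This approach is slower.
+
+If you can ensure that there are no Dangling Deletes, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. The variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, improving query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
## Appendix
### `rewrite_data_files` File Selection Strategy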
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
index cff5910a967..cc217d86ee1 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -2325,6 +2325,16 @@ EXECUTE set_current_snapshot ("ref" = "v1.0");
+--------------------+---------+-------------+---------+
```
+### Dangling Delete
+
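+In some cases, after the `rewrite_data_files` action runs, references to certain Position Delete files may remain in the Snapshot metadata (Dangling Delete). If the row count information from the metadata is used directly in this situation, the result may be [incorrect](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files).
+
+Therefore, by default, if a Position Delete file is found for a `COUNT(*)` query, COUNT pushdown optimization is disabled and the files are read directly to obtain the actual `COUNT(*)` result. This approach is slower.
+
+If you can ensure that there are no Dangling Deletes, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. The variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, improving query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
## Appendix
### `rewrite_data_files` File Selection Strategy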
diff --git a/versioned_docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx b/versioned_docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
index 90d413b0a91..80e24bd3e70 100644
--- a/versioned_docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/versioned_docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
@@ -1273,7 +1273,7 @@ For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether t
### View
-> Since version 3.1.0
+> Since 3.1.0
Supports querying Iceberg views. View queries work the same way as regular table queries. Please note the following:
@@ -2314,6 +2314,16 @@ You can use the following SQL to analyze the data distribution and delete file c
+--------------------+---------+-------------+---------+
```
+### Dangling Delete
+
+In some cases, after the `rewrite_data_files` action runs, references to certain Position Delete files may remain in the Snapshot metadata (Dangling Delete). If the row count information from the metadata is used directly in this situation, the result may be incorrect.
+
+Therefore, by default, if a Position Delete file is found for a `COUNT(*)` query, COUNT pushdown optimization is disabled and the files are read directly to obtain the actual `COUNT(*)` result. This approach is slower.
+
+If you can ensure that there are no Dangling Deletes, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. The variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, improving query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
## Appendix
### `rewrite_data_files` File Selection Strategy
diff --git a/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx b/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
index 90d413b0a91..80e24bd3e70 100644
--- a/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -1273,7 +1273,7 @@ For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether t
### View
-> Since version 3.1.0
+> Since 3.1.0
Supports querying Iceberg views. View queries work the same way as regular table queries. Please note the following:
@@ -2314,6 +2314,16 @@ You can use the following SQL to analyze the data distribution and delete file c
+--------------------+---------+-------------+---------+
```
+### Dangling Delete
+
+In some cases, after the `rewrite_data_files` action runs, references to certain Position Delete files may remain in the Snapshot metadata (Dangling Delete). If the row count information from the metadata is used directly in this situation, the result may be incorrect.
+
+Therefore, by default, if a Position Delete file is found for a `COUNT(*)` query, COUNT pushdown optimization is disabled and the files are read directly to obtain the actual `COUNT(*)` result. This approach is slower.
+
+If you can ensure that there are no Dangling Deletes, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. The variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, improving query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
## Appendix
### `rewrite_data_files` File Selection Strategy
diff --git a/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx b/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
index 90d413b0a91..80e24bd3e70 100644
--- a/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -1273,7 +1273,7 @@ For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether t
### View
-> Since version 3.1.0
+> Since 3.1.0
Supports querying Iceberg views. View queries work the same way as regular table queries. Please note the following:
@@ -2314,6 +2314,16 @@ You can use the following SQL to analyze the data distribution and delete file c
+--------------------+---------+-------------+---------+
```
+### Dangling Delete
+
+In some cases, after the `rewrite_data_files` action runs, references to certain Position Delete files may remain in the Snapshot metadata (Dangling Delete). If the row count information from the metadata is used directly in this situation, the result may be incorrect.
+
+Therefore, by default, if a Position Delete file is found for a `COUNT(*)` query, COUNT pushdown optimization is disabled and the files are read directly to obtain the actual `COUNT(*)` result. This approach is slower.
+
+If you can ensure that there are no Dangling Deletes, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. The variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, improving query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
## Appendix
### `rewrite_data_files` File Selection Strategy