This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 89053bc011e [opt](iceberg) add ignore_iceberg_dangling_delete (#3191)
89053bc011e is described below

commit 89053bc011e0270b99c85dd1e2e031ba31fe5fe4
Author: Mingyu Chen (Rayner) <[email protected]>
AuthorDate: Mon Dec 15 11:38:13 2025 +0800

    [opt](iceberg) add ignore_iceberg_dangling_delete (#3191)
    
    ## Versions
    
    - [x] dev
    - [x] 4.x
    - [x] 3.x
    - [ ] 2.1
    
    ## Languages
    
    - [x] Chinese
    - [x] English
    
    ## Docs Checklist
    
    - [ ] Checked by AI
    - [ ] Test Cases Built
---
 docs/lakehouse/catalogs/iceberg-catalog.mdx                  | 12 +++++++++++-
 .../current/lakehouse/catalogs/iceberg-catalog.mdx           | 10 ++++++++++
 .../version-2.1/lakehouse/catalogs/iceberg-catalog.mdx       | 10 ++++++++++
 .../version-3.x/lakehouse/catalogs/iceberg-catalog.mdx       | 10 ++++++++++
 .../version-4.x/lakehouse/catalogs/iceberg-catalog.mdx       | 10 ++++++++++
 .../version-2.1/lakehouse/catalogs/iceberg-catalog.mdx       | 12 +++++++++++-
 .../version-3.x/lakehouse/catalogs/iceberg-catalog.mdx       | 12 +++++++++++-
 .../version-4.x/lakehouse/catalogs/iceberg-catalog.mdx       | 12 +++++++++++-
 8 files changed, 84 insertions(+), 4 deletions(-)

diff --git a/docs/lakehouse/catalogs/iceberg-catalog.mdx b/docs/lakehouse/catalogs/iceberg-catalog.mdx
index 90d413b0a91..80e24bd3e70 100644
--- a/docs/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/docs/lakehouse/catalogs/iceberg-catalog.mdx
@@ -1273,7 +1273,7 @@ For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether t
 
 ### View
 
-> Since version 3.1.0
+> Since 3.1.0
 
 Supports querying Iceberg views. View queries work the same way as regular table queries. Please note the following:
 
@@ -2314,6 +2314,16 @@ You can use the following SQL to analyze the data distribution and delete file c
   +--------------------+---------+-------------+---------+
   ```
 
+### Dangling Delete
+
+In some cases, after the `rewrite_data_files` action runs, references to certain Position Deletes may not have been removed from the Snapshot metadata (Dangling Delete). If the row count information in the metadata is used directly in this situation, the result may be incorrect.
+
+Therefore, by default, if a Position Delete file is found during a `COUNT(*)` query, the COUNT pushdown optimization is not enabled; instead, the files are read directly to obtain the actual `COUNT(*)` result. However, this approach is time-consuming.
+
+If you can ensure that there is no Dangling Delete issue, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. This variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, which improves query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
 ## Appendix
 
 ### `rewrite_data_files` File Selection Strategy
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
index cff5910a967..cc217d86ee1 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.mdx
@@ -2325,6 +2325,16 @@ EXECUTE set_current_snapshot ("ref" = "v1.0");
   +--------------------+---------+-------------+---------+
   ```
 
+### Dangling Delete
+
+某些情况下,在执行完 `rewrite_data_files` 方法后,某些 Position Delete 的引用可能没有从 Snapshot 元数据中删除(Dangling Delete)。此时如果直接使用元数据中的行数信息,结果可能是[错误](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files)的。
+
+因此,在默认情况下,对于 `COUNT(*)` 查询,如果发现存在 Position Delete 文件,则不启用 COUNT 下推优化,而是直接读取文件获取真实的 `COUNT(*)` 结果。但这种方式耗时较长。
+
+如果用户能确保没有 Dangling Delete 问题,则可以通过 Doris 会话变量 `ignore_iceberg_dangling_delete` 来跳过这个检查。该变量默认为 `false`。当设置为 `true` 时,则会直接通过元数据中的行数信息,返回 `COUNT(*)` 的结果,提升查询效率。
+
+该功能自 3.1.4 和 4.0.3 版本支持。
+
 ## 附录
 
 ### `rewrite_data_files` 文件选择策略
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
index cff5910a967..cc217d86ee1 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
@@ -2325,6 +2325,16 @@ EXECUTE set_current_snapshot ("ref" = "v1.0");
   +--------------------+---------+-------------+---------+
   ```
 
+### Dangling Delete
+
+某些情况下,在执行完 `rewrite_data_files` 方法后,某些 Position Delete 的引用可能没有从 Snapshot 元数据中删除(Dangling Delete)。此时如果直接使用元数据中的行数信息,结果可能是[错误](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files)的。
+
+因此,在默认情况下,对于 `COUNT(*)` 查询,如果发现存在 Position Delete 文件,则不启用 COUNT 下推优化,而是直接读取文件获取真实的 `COUNT(*)` 结果。但这种方式耗时较长。
+
+如果用户能确保没有 Dangling Delete 问题,则可以通过 Doris 会话变量 `ignore_iceberg_dangling_delete` 来跳过这个检查。该变量默认为 `false`。当设置为 `true` 时,则会直接通过元数据中的行数信息,返回 `COUNT(*)` 的结果,提升查询效率。
+
+该功能自 3.1.4 和 4.0.3 版本支持。
+
 ## 附录
 
 ### `rewrite_data_files` 文件选择策略
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
index cff5910a967..cc217d86ee1 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -2325,6 +2325,16 @@ EXECUTE set_current_snapshot ("ref" = "v1.0");
   +--------------------+---------+-------------+---------+
   ```
 
+### Dangling Delete
+
+某些情况下,在执行完 `rewrite_data_files` 方法后,某些 Position Delete 的引用可能没有从 Snapshot 元数据中删除(Dangling Delete)。此时如果直接使用元数据中的行数信息,结果可能是[错误](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files)的。
+
+因此,在默认情况下,对于 `COUNT(*)` 查询,如果发现存在 Position Delete 文件,则不启用 COUNT 下推优化,而是直接读取文件获取真实的 `COUNT(*)` 结果。但这种方式耗时较长。
+
+如果用户能确保没有 Dangling Delete 问题,则可以通过 Doris 会话变量 `ignore_iceberg_dangling_delete` 来跳过这个检查。该变量默认为 `false`。当设置为 `true` 时,则会直接通过元数据中的行数信息,返回 `COUNT(*)` 的结果,提升查询效率。
+
+该功能自 3.1.4 和 4.0.3 版本支持。
+
 ## 附录
 
 ### `rewrite_data_files` 文件选择策略
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
index cff5910a967..cc217d86ee1 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -2325,6 +2325,16 @@ EXECUTE set_current_snapshot ("ref" = "v1.0");
   +--------------------+---------+-------------+---------+
   ```
 
+### Dangling Delete
+
+某些情况下,在执行完 `rewrite_data_files` 方法后,某些 Position Delete 的引用可能没有从 Snapshot 元数据中删除(Dangling Delete)。此时如果直接使用元数据中的行数信息,结果可能是[错误](https://iceberg.apache.org/docs/nightly/spark-procedures/#rewrite_position_delete_files)的。
+
+因此,在默认情况下,对于 `COUNT(*)` 查询,如果发现存在 Position Delete 文件,则不启用 COUNT 下推优化,而是直接读取文件获取真实的 `COUNT(*)` 结果。但这种方式耗时较长。
+
+如果用户能确保没有 Dangling Delete 问题,则可以通过 Doris 会话变量 `ignore_iceberg_dangling_delete` 来跳过这个检查。该变量默认为 `false`。当设置为 `true` 时,则会直接通过元数据中的行数信息,返回 `COUNT(*)` 的结果,提升查询效率。
+
+该功能自 3.1.4 和 4.0.3 版本支持。
+
 ## 附录
 
 ### `rewrite_data_files` 文件选择策略
diff --git a/versioned_docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx b/versioned_docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
index 90d413b0a91..80e24bd3e70 100644
--- a/versioned_docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/versioned_docs/version-2.1/lakehouse/catalogs/iceberg-catalog.mdx
@@ -1273,7 +1273,7 @@ For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether t
 
 ### View
 
-> Since version 3.1.0
+> Since 3.1.0
 
 Supports querying Iceberg views. View queries work the same way as regular table queries. Please note the following:
 
@@ -2314,6 +2314,16 @@ You can use the following SQL to analyze the data distribution and delete file c
   +--------------------+---------+-------------+---------+
   ```
 
+### Dangling Delete
+
+In some cases, after the `rewrite_data_files` action runs, references to certain Position Deletes may not have been removed from the Snapshot metadata (Dangling Delete). If the row count information in the metadata is used directly in this situation, the result may be incorrect.
+
+Therefore, by default, if a Position Delete file is found during a `COUNT(*)` query, the COUNT pushdown optimization is not enabled; instead, the files are read directly to obtain the actual `COUNT(*)` result. However, this approach is time-consuming.
+
+If you can ensure that there is no Dangling Delete issue, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. This variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, which improves query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
 ## Appendix
 
 ### `rewrite_data_files` File Selection Strategy
diff --git a/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx b/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
index 90d413b0a91..80e24bd3e70 100644
--- a/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/versioned_docs/version-3.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -1273,7 +1273,7 @@ For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether t
 
 ### View
 
-> Since version 3.1.0
+> Since 3.1.0
 
 Supports querying Iceberg views. View queries work the same way as regular table queries. Please note the following:
 
@@ -2314,6 +2314,16 @@ You can use the following SQL to analyze the data distribution and delete file c
   +--------------------+---------+-------------+---------+
   ```
 
+### Dangling Delete
+
+In some cases, after the `rewrite_data_files` action runs, references to certain Position Deletes may not have been removed from the Snapshot metadata (Dangling Delete). If the row count information in the metadata is used directly in this situation, the result may be incorrect.
+
+Therefore, by default, if a Position Delete file is found during a `COUNT(*)` query, the COUNT pushdown optimization is not enabled; instead, the files are read directly to obtain the actual `COUNT(*)` result. However, this approach is time-consuming.
+
+If you can ensure that there is no Dangling Delete issue, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. This variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, which improves query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
 ## Appendix
 
 ### `rewrite_data_files` File Selection Strategy
diff --git a/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx b/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
index 90d413b0a91..80e24bd3e70 100644
--- a/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
+++ b/versioned_docs/version-4.x/lakehouse/catalogs/iceberg-catalog.mdx
@@ -1273,7 +1273,7 @@ For the `FOR VERSION AS OF` syntax, Doris will automatically determine whether t
 
 ### View
 
-> Since version 3.1.0
+> Since 3.1.0
 
 Supports querying Iceberg views. View queries work the same way as regular table queries. Please note the following:
 
@@ -2314,6 +2314,16 @@ You can use the following SQL to analyze the data distribution and delete file c
   +--------------------+---------+-------------+---------+
   ```
 
+### Dangling Delete
+
+In some cases, after the `rewrite_data_files` action runs, references to certain Position Deletes may not have been removed from the Snapshot metadata (Dangling Delete). If the row count information in the metadata is used directly in this situation, the result may be incorrect.
+
+Therefore, by default, if a Position Delete file is found during a `COUNT(*)` query, the COUNT pushdown optimization is not enabled; instead, the files are read directly to obtain the actual `COUNT(*)` result. However, this approach is time-consuming.
+
+If you can ensure that there is no Dangling Delete issue, you can skip this check by setting the Doris session variable `ignore_iceberg_dangling_delete`. This variable defaults to `false`. When it is set to `true`, Doris returns the `COUNT(*)` result directly from the row count information in the metadata, which improves query efficiency.
+
+This feature is supported from versions 3.1.4 and 4.0.3.
+
 ## Appendix
 
 ### `rewrite_data_files` File Selection Strategy
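
For readers of this commit, the session variable documented in the new Dangling Delete section could be exercised roughly as follows. This is only a sketch: the catalog, database, and table names (`iceberg_catalog.db.orders`) are hypothetical placeholders that do not appear in the commit, and it assumes the table is known to have no Dangling Deletes.

```sql
-- Assumption: `iceberg_catalog.db.orders` is a hypothetical Iceberg table.
-- Default behavior (variable is false): if Position Delete files are found,
-- COUNT pushdown is not used and the files are read to compute the count.
SELECT COUNT(*) FROM iceberg_catalog.db.orders;

-- Only when you are sure the table has no Dangling Deletes:
-- let Doris answer COUNT(*) from the row counts in the metadata.
SET ignore_iceberg_dangling_delete = true;
SELECT COUNT(*) FROM iceberg_catalog.db.orders;

-- Restore the default for the rest of the session.
SET ignore_iceberg_dangling_delete = false;
```

Because the variable is session-scoped, resetting it afterwards keeps later `COUNT(*)` queries in the same session on the safe default path.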

