[ 
https://issues.apache.org/jira/browse/FLINK-39019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

suhwan updated FLINK-39019:
---------------------------
    Description: 
*Summary*

Implement deletion vector support in Flink CDC Iceberg connector to leverage 
format-version 3 performance benefits for CDC workloads.

 

*Content*

[https://iceberg.apache.org/spec/#deletion-vectors]

Iceberg format-version 3 introduces binary deletion vectors, which replace 
positional delete files with compact bitmaps stored in Puffin files.

(The current implementation uses equality delete files via 
`RowDataTaskWriterFactory`.)

 

In most cases (especially CDC), deletion vectors are more efficient than 
positional delete files.

(I referred to 
[https://opensource.googleblog.com/2025/08/whats-new-in-iceberg-v3.html] and 
[https://www.dremio.com/blog/apache-iceberg-v3/].)

 

*Proposed Changes*
 # Check `table.properties().get("format-version")`
 ## For v3 tables with id fields -> use deletion vectors
 ## For all other tables -> use equality delete files
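
A minimal sketch of the check proposed above, in plain Java with no Iceberg or Flink dependency. `DeleteModeSelector`, `DeleteMode`, and the `hasIdFields` flag are hypothetical names used only for illustration; they are not existing connector classes. Two assumptions are baked in: a missing or unparsable `format-version` falls back to equality deletes, and versions above 3 are treated like v3. Note also that `format-version` may not be present in a real table's `properties()` map; if so, the version would have to be read from table metadata instead, and the map here only stands in for that lookup.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of Flink CDC or Iceberg.
final class DeleteModeSelector {

    /** The two write paths named in the proposal. */
    enum DeleteMode { DELETION_VECTOR, EQUALITY_DELETE }

    /** Deletion vectors are introduced in format-version 3. */
    static final int DELETION_VECTOR_MIN_VERSION = 3;

    /**
     * Picks the delete strategy from the table properties and from whether
     * the table declares id fields (passed in as a flag in this sketch).
     */
    static DeleteMode select(Map<String, String> tableProperties, boolean hasIdFields) {
        int formatVersion = parseFormatVersion(tableProperties.get("format-version"));
        // Assumption: versions newer than 3 also support deletion vectors.
        boolean useVectors = formatVersion >= DELETION_VECTOR_MIN_VERSION && hasIdFields;
        return useVectors ? DeleteMode.DELETION_VECTOR : DeleteMode.EQUALITY_DELETE;
    }

    /** Returns 0 for a missing or malformed value, which selects equality deletes. */
    static int parseFormatVersion(String raw) {
        if (raw == null) {
            return 0;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return 0;
        }
    }
}
```

Keeping the decision in one pure function would let the writer factory stay unchanged for existing v1/v2 tables, since every case other than "v3 with id fields" falls through to the current equality-delete path.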

  was:
*Summary*

Implement deletion vector support in Flink CDC Iceberg connector to leverage 
format-version 3 performance benefits for CDC workloads.

 

*Content*

https://iceberg.apache.org/spec/#deletion-vectors

Iceberg format-version 3 introduces binary deletion vectors which replace 
positional delete files with compact bitmaps stored in Puffin files

(Current implementation uses equality delete files via 
`RowDataTaskWriterFactory`)

 

*Proposed Changes*
 # Check `table.properties().get("format-version")`
 ## For v3 tables with id fields -> use deletion vector
 ## etc tables -> use equality delete files


> [Iceberg] Support deletion vectors for format-version 3 tables
> --------------------------------------------------------------
>
>                 Key: FLINK-39019
>                 URL: https://issues.apache.org/jira/browse/FLINK-39019
>             Project: Flink
>          Issue Type: New Feature
>          Components: Flink CDC
>            Reporter: suhwan
>            Priority: Not a Priority
>              Labels: iceberg
>
> *Summary*
> Implement deletion vector support in Flink CDC Iceberg connector to leverage 
> format-version 3 performance benefits for CDC workloads.
>  
> *Content*
> [https://iceberg.apache.org/spec/#deletion-vectors]
> Iceberg format-version 3 introduces binary deletion vectors, which replace 
> positional delete files with compact bitmaps stored in Puffin files.
> (The current implementation uses equality delete files via 
> `RowDataTaskWriterFactory`.)
>  
> In most cases (especially CDC), deletion vectors are more efficient than 
> positional delete files.
> (I referred to 
> [https://opensource.googleblog.com/2025/08/whats-new-in-iceberg-v3.html] and 
> [https://www.dremio.com/blog/apache-iceberg-v3/].)
>  
> *Proposed Changes*
>  # Check `table.properties().get("format-version")`
>  ## For v3 tables with id fields -> use deletion vectors
>  ## For all other tables -> use equality delete files



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
