Hi Iceberg Community,
We would like to start a discussion about introducing native primary-key
table support in Apache Iceberg.
Background
==========
Apache Iceberg has become a widely adopted table format for large-scale
analytic datasets and provides strong support for schema evolution,
partition evolution, row-level operations, and incremental processing.
At the same time, an increasing number of users are building CDC-driven and
operational analytics workloads where data is naturally organized around
primary keys and continuously updated through inserts, updates, and deletes.
While Iceberg provides important building blocks such as identifier fields,
equality deletes, position deletes, and MERGE operations, there is
currently no standardized primary-key table abstraction within the Iceberg
specification.
Motivation
==========
Many modern data lake workloads rely on:
* Database CDC ingestion
* Streaming upsert pipelines
* Data synchronization between transactional systems and data lakes
* Near real-time operational analytics
* Incremental changelog consumption
These workloads often require:
* Primary-key based update semantics
* Efficient handling of high-frequency updates and deletes
* Storage layouts optimized for mutable data
* Efficient compaction strategies
* Standardized changelog generation and consumption
Today, users typically implement these capabilities through engine-specific
solutions or custom ingestion frameworks, which can lead to inconsistent
behavior across engines and increased operational complexity.
Existing Iceberg Capabilities and Gaps
======================================
Iceberg already provides several important capabilities for mutable
datasets:
* Identifier fields
* Equality deletes
* Position deletes
* MERGE INTO support through compute engines
* Incremental snapshot processing
However, these features primarily serve as low-level primitives and do not
provide a complete primary-key table model.
For example:
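* Identifier fields define row identity, but Iceberg does not enforce their
  uniqueness or define how writers resolve rows that share a key.
* MERGE operations are engine-specific and may behave differently across
  engines.
* Equality deletes can become expensive for heavy CDC workloads, because
  readers must apply them to data files at read time until compaction
  rewrites the affected files.
* There is currently no standard mechanism for organizing data around
  primary keys or exposing changelog semantics.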
As a result, users building CDC and streaming upsert workloads often need
significant custom infrastructure on top of Iceberg.
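To make the equality-delete cost concrete, here is a small, simplified
sketch in Python. It is a toy in-memory model, not Iceberg code: the
DataRow and EqualityDelete classes and the upsert and read_live_rows
helpers are hypothetical names invented for illustration. The one rule it
borrows from the spec is that an equality delete applies only to rows
written by a strictly earlier commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRow:
    key: int
    value: str
    seq: int  # sequence number of the commit that wrote the row


@dataclass(frozen=True)
class EqualityDelete:
    key: int
    seq: int  # sequence number of the commit that wrote the delete


def upsert(rows, deletes, key, value, seq):
    """Write an upsert as 'delete by key' plus 'insert', both in commit `seq`."""
    deletes.append(EqualityDelete(key, seq))
    rows.append(DataRow(key, value, seq))


def read_live_rows(rows, deletes):
    """Drop every row that has a same-key delete from a strictly later commit."""
    newest_delete = {}
    for d in deletes:
        newest_delete[d.key] = max(d.seq, newest_delete.get(d.key, -1))
    return [r for r in rows if newest_delete.get(r.key, -1) <= r.seq]


rows, deletes = [], []
for seq, amount in enumerate(["10.00", "12.50", "15.00"], start=1):
    upsert(rows, deletes, key=1, value=amount, seq=seq)

live = read_live_rows(rows, deletes)
print(live)          # only the row written by the last commit survives
print(len(deletes))  # one delete per upsert accumulates until compaction
```

In this model every read has to consult all accumulated deletes, which is
the overhead a primary-key aware layout and compaction strategy would aim
to bound.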
Industry Context
================
Several lakehouse systems have introduced native support for
primary-key-oriented workloads.
For example, Apache Paimon provides primary-key tables with built-in
support for upserts, changelog production, and storage layouts optimized
for mutable data. These capabilities have proven useful for streaming and
CDC scenarios.
At the same time, many organizations have already standardized on Iceberg
as their table format and would benefit from similar capabilities without
requiring adoption of a separate table format.
This raises the question of whether a standardized primary-key table
abstraction should be part of Iceberg itself.
Initial Proposal
================
We would like to discuss introducing a first-class primary-key table
abstraction in Iceberg.
Conceptually, users could define tables such as:
    CREATE TABLE orders (
        order_id    BIGINT PRIMARY KEY,
        customer_id BIGINT,
        amount      DECIMAL(18,2),
        updated_at  TIMESTAMP
    );
The intent is not to provide OLTP-style uniqueness enforcement or database
constraints.
Instead, the goal is to provide a standard storage and processing model for
mutable datasets organized around primary keys.
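As a rough illustration of the write semantics we have in mind, the sketch
below applies a stream of CDC events to a table keyed by order_id, with the
last event per key winning. This is only one possible semantics, shown
under our own assumptions: the apply_changes function, the event shape, and
the "insert"/"update"/"delete" operation names are hypothetical and not
part of any existing Iceberg API.

```python
def apply_changes(table, events):
    """Apply (op, row) events in order; the last event per key wins."""
    for op, row in events:
        key = row["order_id"]
        if op == "delete":
            table.pop(key, None)  # deleting a missing key is a no-op
        else:
            table[key] = row      # "insert" and "update" both upsert
    return table


events = [
    ("insert", {"order_id": 1, "customer_id": 7, "amount": "10.00"}),
    ("insert", {"order_id": 2, "customer_id": 8, "amount": "99.00"}),
    ("update", {"order_id": 1, "customer_id": 7, "amount": "12.50"}),
    ("delete", {"order_id": 2}),
]

orders = apply_changes({}, events)
print(orders)  # order 1 holds its updated amount; order 2 is gone
```

Note that nothing here rejects a duplicate insert: the second write simply
replaces the first, which is the difference between upsert semantics and
OLTP-style uniqueness enforcement.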
Potential capabilities could include:
* Primary-key metadata stored as part of table metadata
* Standardized primary-key write semantics
* Primary-key aware compaction and maintenance
* Efficient changelog generation for downstream consumers
* Optimized storage organization for mutable workloads
* Consistent behavior across engines
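
To illustrate what changelog generation could mean for a keyed table, the
sketch below diffs two snapshots, each modeled as a dict from primary key
to row, into change records. It is a naive full comparison meant only to
pin down the expected output, not an efficient design. The changelog
function is hypothetical, and the record type names (INSERT, DELETE,
UPDATE_BEFORE, UPDATE_AFTER) are used here as an illustrative assumption.

```python
def changelog(before, after):
    """Diff two {key: row} snapshots into change records, ordered by key."""
    changes = []
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old is None:
            changes.append(("INSERT", new))
        elif new is None:
            changes.append(("DELETE", old))
        elif old != new:
            # An update is emitted as a before/after pair for the same key.
            changes.append(("UPDATE_BEFORE", old))
            changes.append(("UPDATE_AFTER", new))
    return changes


snapshot_1 = {1: {"amount": "10.00"}, 2: {"amount": "99.00"}}
snapshot_2 = {1: {"amount": "12.50"}, 3: {"amount": "5.00"}}

for record in changelog(snapshot_1, snapshot_2):
    print(record)
```

With primary-key metadata in the table format, every engine could agree on
which rows pair up as an update rather than as an unrelated delete and
insert.
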
The feature would be optional and would not affect existing Iceberg tables
or workloads.
Open Questions
==============
We would appreciate feedback from the community on the following topics:
1. Is a native primary-key table abstraction within the scope and vision of
   Iceberg?
2. Are existing Iceberg features sufficient to address these use cases?
3. What are the advantages or disadvantages of introducing primary-key
   semantics at the table-format level?
4. Should Iceberg standardize changelog and mutable-data handling for CDC
   workloads?
5. What compatibility or interoperability concerns should be considered?
6. Would the community be interested in reviewing a detailed design
   proposal if there is agreement on the problem statement?
At Huawei, we have been experimenting with primary-key table semantics in
production environments for CDC-driven and mutable-data workloads. The
experience has highlighted both the demand for these capabilities and the
challenges of building them consistently on top of existing primitives.
Based on this experience, we would like to discuss whether a standardized
approach belongs in Iceberg.
If there is interest from the community, we would be happy to share a
detailed design proposal covering metadata representation, write/read
semantics, compaction strategies, changelog support, and engine
integrations.
Looking forward to hearing the community's thoughts.
Thank you for your consideration,
Chandra Sekhar