Hi folks,

I have started working on multi-argument transforms, and you have
probably seen Fokko's proposal for handling source-id/source-ids in a
backward-compatible way.

While working on the changes in iceberg-core/iceberg-java, I have been
wondering whether we should introduce Iceberg Features in the metadata.
Let me explain what I have in mind.
In Table Spec V3, we have new functionality: new types (nanosecond
timestamps, variant, ...), default values, row lineage, etc.
For readers/writers, there are two ways to know whether a given
capability is available:
1. Reading the table spec version (v2, v3)
2. Checking whether the metadata contains certain fields (for instance,
for multi-argument transforms, source-id / source-ids).
This means that we already have to "parse" the metadata and will likely
have to implement "complex" logic.
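
To make the two checks above concrete, here is a rough sketch of what a
reader ends up doing today. It assumes the metadata and a partition field
have already been parsed into plain maps; the class and method names
(MetadataProbe, isAtLeastV3, sourceIds) are made up for illustration and
are not iceberg-core APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of iceberg-core.
final class MetadataProbe {

    // Way 1: decide from the table spec version ("format-version" in table metadata).
    static boolean isAtLeastV3(Map<String, Object> metadata) {
        Object version = metadata.get("format-version");
        return version instanceof Number && ((Number) version).intValue() >= 3;
    }

    // Way 2: decide from which fields are present. A partition field may carry
    // the single "source-id" or the multi-argument "source-ids".
    static List<Integer> sourceIds(Map<String, Object> field) {
        Object many = field.get("source-ids");
        if (many instanceof List) {
            List<Integer> ids = new ArrayList<>();
            for (Object id : (List<?>) many) {
                ids.add(((Number) id).intValue());
            }
            return ids;
        }
        Object one = field.get("source-id");
        if (one instanceof Number) {
            return List.of(((Number) one).intValue());
        }
        throw new IllegalArgumentException(
            "partition field has neither source-id nor source-ids");
    }
}
```

Every capability that is detected by field presence needs its own branch
like the one in sourceIds, which is the kind of logic I would like to
avoid spreading around.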

In addition to the table spec version, I wonder whether we should
introduce Iceberg Features in the metadata, clearly listing/describing
the supported features, decoupled from the table spec version:

"features": ["row_lineage","variant","default_value"]

Readers/writers can just check the features to know how to behave. This
would give us more flexibility in supporting features, without binding
them to the table spec version.
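
As a minimal sketch of what that check could look like: the "features"
key is only the proposal above, and the FeatureCheck class, its
unsupported method, and the choice to treat a missing "features" list as
empty are my assumptions for illustration, not existing iceberg-core code.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of iceberg-core.
final class FeatureCheck {

    // Returns the features listed in the metadata that this reader/writer
    // does not implement. An empty result means the table can be handled.
    static Set<String> unsupported(Map<String, Object> metadata, Set<String> supported) {
        Set<String> missing = new TreeSet<>();
        Object raw = metadata.get("features");
        // Assumption: metadata without a "features" list declares no features.
        if (raw instanceof List) {
            for (Object feature : (List<?>) raw) {
                String name = String.valueOf(feature);
                if (!supported.contains(name)) {
                    missing.add(name);
                }
            }
        }
        return missing;
    }
}
```

A reader would call unsupported(...) once when loading the table and fail
fast (or fall back) if the result is not empty, instead of probing
individual fields.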

Afaik, Delta has something similar.

Long term, this could be extended to the Data File format API proposed
by Peter, e.g. for features related to data files (that would be a
different layer, but a similar idea).

Thoughts?

Regards
JB
