wombatu-kun commented on issue #16756:
URL: https://github.com/apache/iceberg/issues/16756#issuecomment-4687222121

   Strong +1 on the framing. The reason #3681 / #4994 / #4625 stalled is that 
each one asked Iceberg to *understand* these definitions in some form, and 
Approach 4 sidesteps that completely: the format never reads, validates, or 
evolves them, so ownership stays in Flink where watermark and 
computed-column semantics actually live. That preserves engine neutrality, 
which was the core objection in all three.
   
   One concrete point that strengthens the motivation: today this isn't even a 
silent drop. `FlinkCatalog.validateFlinkTable()` hard-rejects both up front - 
`UnsupportedOperationException("Creating table with computed columns is not 
supported yet.")` and the matching `"... watermark specs ..."` - and 
`FlinkSchemaUtil.toResolvedSchema()` rebuilds the Flink schema with 
`Collections.emptyList()` for watermarks. So the two-table workaround is a 
direct consequence of that, and Approach 4 is purely additive: relax the 
validation, serialize the specs on `createTable`, restore them in `getTable` -> 
`toCatalogTableWithProps`. No iceberg-core or spec change, and the Flink 
metadata stays invisible to engines that don't read the namespace.
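
   To make the round trip concrete, here is a minimal, Flink-free sketch of "serialize on create, restore on load". Everything in it is an assumption for illustration: the class name, the `flink.watermark.` / `flink.computed-column.` key prefixes, and the flat key layout are hypothetical, and the real change would go through `FlinkCatalog` and `FlinkSchemaUtil` rather than plain maps. The point is only that the expressions travel as opaque strings that nothing on the Iceberg side parses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: not an Iceberg or Flink API. */
public class FlinkSpecRoundTrip {
  // Hypothetical key prefixes; the issue does not fix any names.
  static final String WATERMARK = "flink.watermark.";
  static final String COMPUTED = "flink.computed-column.";

  /** Create path: copy the Flink SQL expressions into table properties as opaque strings. */
  static Map<String, String> toProperties(
      Map<String, String> watermarks, Map<String, String> computedColumns) {
    Map<String, String> props = new LinkedHashMap<>();
    watermarks.forEach((column, expr) -> props.put(WATERMARK + column, expr));
    computedColumns.forEach((column, expr) -> props.put(COMPUTED + column, expr));
    return props;
  }

  /** Load path: collect every entry under a prefix back into column -> expression. */
  static Map<String, String> fromProperties(Map<String, String> props, String prefix) {
    Map<String, String> specs = new LinkedHashMap<>();
    props.forEach((key, value) -> {
      if (key.startsWith(prefix)) {
        specs.put(key.substring(prefix.length()), value);
      }
    });
    return specs;
  }

  public static void main(String[] args) {
    Map<String, String> watermarks = Map.of("ts", "`ts` - INTERVAL '5' SECOND");
    Map<String, String> computed = Map.of("day", "CAST(`ts` AS DATE)");

    Map<String, String> props = toProperties(watermarks, computed);

    // Both checks compare the restored specs with what was stored.
    System.out.println(fromProperties(props, WATERMARK).equals(watermarks));
    System.out.println(fromProperties(props, COMPUTED).equals(computed));
  }
}
```

   The flat keys are only a placeholder here; how the specs are actually encoded is the open storage question.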
   
   On the open storage question, FlinkCatalog already owns a reserved-property 
mechanism (`isReservedProperty` over `connector` / `src-catalog` / `location` 
from `FlinkCreateTableOptions`) that it writes on create and hides on load. 
Folding the Flink metadata into a single structured, Flink-namespaced property 
through that same seam would give Approach 4 the structure it wants while 
avoiding a separate metadata-file lifecycle (commit atomicity, orphan cleanup). 
The "unstructured / mixed with Iceberg metadata" con you list for Approach 2 is 
really about ad-hoc flat keys; a single well-defined JSON blob owned and hidden 
by FlinkCatalog doesn't have that problem. A dedicated file stays a clean 
fallback if the blob ever outgrows properties.
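
   A rough sketch of the "write on create, hide on load" seam with one extra reserved key, in plain Java. This is a simplified model, not the actual `FlinkCatalog` code: the class and method names, the `flink.metadata` key, and the JSON layout inside the blob are all hypothetical. Only the three existing reserved names are taken from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a simplified model of a reserved-property filter. */
public class ReservedPropertyFilter {
  // Hypothetical key for the Flink-owned blob; the issue does not settle on a name.
  static final String FLINK_METADATA = "flink.metadata";

  // The first three names are the existing reserved properties mentioned above;
  // FLINK_METADATA is the assumed addition.
  static final Set<String> RESERVED =
      Set.of("connector", "src-catalog", "location", FLINK_METADATA);

  static boolean isReserved(String key) {
    return RESERVED.contains(key);
  }

  /** Load path: return only the properties a user should see on the Flink table. */
  static Map<String, String> visibleProperties(Map<String, String> stored) {
    Map<String, String> visible = new LinkedHashMap<>();
    stored.forEach((key, value) -> {
      if (!isReserved(key)) {
        visible.put(key, value);
      }
    });
    return visible;
  }

  public static void main(String[] args) {
    Map<String, String> stored = new LinkedHashMap<>();
    stored.put("write.format.default", "parquet");
    // Hypothetical blob layout; the catalog treats it as an opaque string.
    stored.put(FLINK_METADATA,
        "{\"watermarks\":[{\"rowtime\":\"ts\",\"expr\":\"`ts` - INTERVAL '5' SECOND\"}]}");

    // The blob is still stored, but it is filtered out of the user-facing view.
    System.out.println(visibleProperties(stored).keySet());
  }
}
```

   Because the blob lives under one key, hiding it is a single entry in the reserved set, and it is committed together with the other table properties rather than as a separate file.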
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

