Hi Manu,

The Iceberg Schema defines an `identifierFieldIds` method [1], and Flink uses it as the primary key. Are you saying there is no way to set it in Spark and Trino?

Thanks,
Peter

[1] https://github.com/apache/iceberg/blob/9a00f7477dedac4501fb2de9e1e6d7aa83dc20b7/api/src/main/java/org/apache/iceberg/Schema.java#L280

Manu Zhang <owenzhang1...@gmail.com> wrote (on Thu, Jan 4, 2024, 16:45):

> Hi all,
>
> Currently, we support upserting a Flink created table with Flink SQL where
> primary keys are required as equality fields. They are not required in Java
> API.
>
> However, if the table is created by Spark, where there's no primary key,
> we cannot upsert with Flink SQL. Hence, I proposed
> https://github.com/apache/iceberg/pull/8195 to support specifying
> equality columns with Flink SQL write options.
>
> @pvary <https://github.com/pvary> suggested it would be better to
> support primary keys in Spark, Trino, etc. Since these engines don't have
> primary keys in their table definitions, a workaround is to put primary key
> columns in table properties. Maybe there are other options I've missed.
>
> Flink SQL sinking to Spark tables for analysis is a typical pipeline in
> our datalake. I'd like to hear your thoughts on best supporting this case.
>
> Happy New Year!
> Manu
>