It looks like there's no way to explicitly add a required column in DDL. Any suggestions?

Much appreciated,
Manu

On Tue, Jan 9, 2024 at 3:37 PM Manu Zhang <owenzhang1...@gmail.com> wrote:

> Thanks Peter and Ryan for the info.
>
> As identifier fields need to be "required", how can I alter an optional
> column to be required in Spark SQL?
>
> Thanks,
> Manu
>
> On Fri, Jan 5, 2024 at 12:50 AM Ryan Blue <b...@tabular.io> wrote:
>
>> You can set the primary key fields in Spark using `ALTER TABLE`:
>>
>> `ALTER TABLE t SET IDENTIFIER FIELDS id`
>>
>> Spark doesn't support any primary key syntax, so you have to do this as a
>> separate step.
>>
>> On Thu, Jan 4, 2024 at 8:46 AM Péter Váry <peter.vary.apa...@gmail.com>
>> wrote:
>>
>>> Hi Manu,
>>>
>>> The Iceberg Schema defines `identifierFieldIds` method [1], and Flink
>>> uses that as the primary key.
>>> Are you saying there is no way to set it in Spark and Trino?
>>>
>>> Thanks,
>>> Peter
>>>
>>> [1]
>>> https://github.com/apache/iceberg/blob/9a00f7477dedac4501fb2de9e1e6d7aa83dc20b7/api/src/main/java/org/apache/iceberg/Schema.java#L280
>>>
>>> Manu Zhang <owenzhang1...@gmail.com> wrote (on Thu, Jan 4, 2024 at
>>> 16:45):
>>>
>>>> Hi all,
>>>>
>>>> Currently, we support upserting a Flink created table with Flink SQL
>>>> where primary keys are required as equality fields. They are not required
>>>> in Java API.
>>>>
>>>> However, if the table is created by Spark, where there's no primary
>>>> key, we cannot upsert with Flink SQL. Hence, I proposed
>>>> https://github.com/apache/iceberg/pull/8195 to support specifying
>>>> equality columns with Flink SQL write options.
>>>>
>>>> @pvary <https://github.com/pvary> suggested it would be better to
>>>> support primary keys in Spark, Trino, etc. Since these engines don't have
>>>> primary keys in their table definitions, a workaround is to put primary key
>>>> columns in table properties. Maybe there are other options I've missed.
>>>>
>>>> Flink SQL sinking to Spark tables for analysis is a typical pipeline in
>>>> our datalake. I'd like to hear your thoughts on best supporting this case.
>>>>
>>>> Happy New Year!
>>>> Manu
>>>>
>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>