Thanks, xianjin. It's working now. I also created a PR to enhance the documentation: https://github.com/apache/iceberg/pull/9478
Thanks,
Manu

On Thu, Jan 11, 2024 at 11:08 AM xianjin <xian...@apache.org> wrote:

> You can create an Iceberg table with a required field, for example:
>
> create table test_table (id bigint not null, data string) using iceberg
>
> However, you cannot change an optional field to required after creation.
> See this issue for more details:
> https://github.com/apache/iceberg/issues/3617
>
> Manu Zhang <owenzhang1...@gmail.com> wrote on Thu, Jan 11, 2024 at 10:08:
>
>> It looks like there's no way to explicitly add a required column in DDL.
>> Any suggestions?
>>
>> Much appreciated,
>> Manu
>>
>> On Tue, Jan 9, 2024 at 3:37 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>
>>> Thanks, Peter and Ryan, for the info.
>>>
>>> As identifier fields need to be "required", how can I alter an optional
>>> column to be required in Spark SQL?
>>>
>>> Thanks,
>>> Manu
>>>
>>> On Fri, Jan 5, 2024 at 12:50 AM Ryan Blue <b...@tabular.io> wrote:
>>>
>>>> You can set the primary key fields in Spark using `ALTER TABLE`:
>>>>
>>>> `ALTER TABLE t SET IDENTIFIER FIELDS id`
>>>>
>>>> Spark doesn't support any primary key syntax, so you have to do this
>>>> as a separate step.
>>>>
>>>> On Thu, Jan 4, 2024 at 8:46 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>
>>>>> Hi Manu,
>>>>>
>>>>> The Iceberg Schema defines the `identifierFieldIds` method [1], and
>>>>> Flink uses that as the primary key. Are you saying there is no way
>>>>> to set it in Spark and Trino?
>>>>>
>>>>> Thanks,
>>>>> Peter
>>>>>
>>>>> [1] https://github.com/apache/iceberg/blob/9a00f7477dedac4501fb2de9e1e6d7aa83dc20b7/api/src/main/java/org/apache/iceberg/Schema.java#L280
>>>>>
>>>>> Manu Zhang <owenzhang1...@gmail.com> wrote on Thu, Jan 4, 2024 at 16:45:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Currently, we support upserting a Flink-created table with Flink SQL,
>>>>>> where primary keys are required as equality fields. They are not
>>>>>> required in the Java API.
>>>>>>
>>>>>> However, if the table is created by Spark, where there's no primary
>>>>>> key, we cannot upsert with Flink SQL. Hence, I proposed
>>>>>> https://github.com/apache/iceberg/pull/8195 to support specifying
>>>>>> equality columns with Flink SQL write options.
>>>>>>
>>>>>> @pvary <https://github.com/pvary> suggested it would be better to
>>>>>> support primary keys in Spark, Trino, etc. Since these engines don't
>>>>>> have primary keys in their table definitions, a workaround is to put
>>>>>> primary key columns in table properties. Maybe there are other
>>>>>> options I've missed.
>>>>>>
>>>>>> Flink SQL sinking to Spark tables for analysis is a typical pipeline
>>>>>> in our data lake. I'd like to hear your thoughts on how best to
>>>>>> support this case.
>>>>>>
>>>>>> Happy New Year!
>>>>>> Manu
>>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Tabular
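
Pulling the Spark SQL advice from the thread together, a minimal sketch of the whole flow (untested; table and column names are placeholders, and `SET IDENTIFIER FIELDS` assumes the Iceberg Spark SQL extensions are enabled, e.g. spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions):

    -- Declare the key column as required (NOT NULL) at creation time:
    -- identifier fields must be required, and an optional column cannot
    -- be made required later (https://github.com/apache/iceberg/issues/3617).
    CREATE TABLE test_table (id BIGINT NOT NULL, data STRING) USING iceberg;

    -- Spark has no PRIMARY KEY syntax, so set the identifier fields in a
    -- separate step:
    ALTER TABLE test_table SET IDENTIFIER FIELDS id;

    -- Only the opposite change is supported: a required, non-identifier
    -- column can be relaxed to optional, e.g.
    --   ALTER TABLE test_table ALTER COLUMN some_col DROP NOT NULL;
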
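And a rough sketch of the Flink SQL upsert path discussed at the start of the thread, which works when the primary key is declared through Flink at creation time (syntax follows the Iceberg Flink docs; the catalog and table names are placeholders, and upsert requires a format-version 2 table):

    -- A primary key declared in Flink becomes the table's identifier
    -- fields, which are used as the equality columns for upsert:
    CREATE TABLE my_catalog.db.sample (
        id   BIGINT,
        data STRING,
        PRIMARY KEY (id) NOT ENFORCED
    ) WITH ('format-version' = '2', 'write.upsert.enabled' = 'true');

    -- Upsert can also be switched on per write with a write option:
    INSERT INTO my_catalog.db.sample /*+ OPTIONS('upsert-enabled' = 'true') */
    SELECT id, data FROM source_table;

For a Spark-created table with no primary key, neither path applies today, which is the gap that https://github.com/apache/iceberg/pull/8195 (specifying equality columns via write options) aims to close.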