Hello Iceberg dev team, we have been trying to set up a CDC pipeline on Kafka Connect to push data to our data lake in AWS, with Glue as the catalog and also the Glue Schema Registry.
We use topic renaming (routing) in the MySQL Debezium source connector so that each tenant's changes land in a single topic. As a sample test we ran a single table with primary-key column id. The schema registered in Glue correctly identifies PK=id, but in the Iceberg tables on S3 the schema does not have the PK identified.

Even with the following two properties set, as described in https://github.com/apache/iceberg/blob/c4ba60d27b02d8618621ad701e52d51b9a98d0d5/docs/docs/kafka-connect.md:

iceberg.tables.default-id-columns=id
or
iceberg.table.<*table-name*>.id-columns=id

data written to the Iceberg table is always in append mode; an update event does NOT result in an upsert.

Could you please let us know whether Iceberg supports updates and deletes in CDC pipelines, and share any guidance on how to set up the source and sink connectors? We have already spent a lot of time with AI tools.

rajans
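P.S. For reference, here is roughly the sink configuration we tried, with the two id-columns properties from the linked doc. The connector name, topic, table name, and S3 warehouse path below are placeholders for our actual tenant values, not our literal config:

```json
{
  "name": "iceberg-sink",
  "config": {
    "connector.class": "org.apache.iceberg.connect.IcebergSinkConnector",
    "topics": "tenant1.inventory.customers",
    "iceberg.tables": "db.customers",
    "iceberg.tables.default-id-columns": "id",
    "iceberg.catalog.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
    "iceberg.catalog.warehouse": "s3://our-bucket/warehouse",
    "iceberg.catalog.io-impl": "org.apache.iceberg.aws.s3.S3FileIO"
  }
}
```

With this config, records (including Debezium update events) are always appended as new rows to the Iceberg table.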
