Hi all,
In Flink SQL with upsert mode enabled, I have observed that when I INSERT a
record whose equality field id has a new value, an equality delete file is also
created with the corresponding entry. For example, I executed the following
commands in Flink SQL with Apache Iceberg:
CREATE TABLE `hadoop_catalog`.`testdb`.`upsert_test1` (
    `id` INT UNIQUE COMMENT 'unique id',
    `data` STRING NOT NULL,
    PRIMARY KEY(`id`) NOT ENFORCED
) WITH ('format-version'='2', 'write.upsert.enabled'='true');
Then I inserted a record:
INSERT INTO upsert_test1 VALUES (7, 'new value');
This resulted in two files. The data file contains:
{"id":7,"data":"new value"}
It also created an equality delete file containing:
{"id":7}
I expected a delete file entry to be created for UPDATE / DELETE but not for
INSERT, since the extra equality deletes might degrade read performance for CDC
tables. Is it expected that fresh INSERTs also produce equality delete entries?
If so, what is the benefit of writing an equality delete entry for an INSERT?
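To make the behaviour I am describing concrete, here is a toy Python model of it. This is only a sketch and not the Iceberg or Flink API: the ToyTable class and its upsert and scan methods are hypothetical names. It assumes that the writer emits an equality delete for every upserted key without checking whether the key already exists, and that (as I understand the Iceberg v2 spec) an equality delete only applies to data written with a strictly lower sequence number.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy stand-in for an Iceberg v2 table in upsert mode (hypothetical, not the real API)."""
    data_rows: list = field(default_factory=list)   # (sequence number, row)
    eq_deletes: list = field(default_factory=list)  # (sequence number, deleted id)
    seq: int = 0

    def upsert(self, row):
        # Each commit gets a new sequence number.
        self.seq += 1
        # Assumption: the writer always emits an equality delete for the key,
        # whether or not a row with that key already exists, then the new row.
        self.eq_deletes.append((self.seq, row["id"]))
        self.data_rows.append((self.seq, row))

    def scan(self):
        # Assumption: a delete hides only rows with a strictly lower sequence
        # number, so a row survives the delete written in its own commit.
        live = []
        for row_seq, row in self.data_rows:
            deleted = any(
                del_seq > row_seq and key == row["id"]
                for del_seq, key in self.eq_deletes
            )
            if not deleted:
                live.append(row)
        return live


table = ToyTable()
table.upsert({"id": 7, "data": "new value"})
first_scan = table.scan()

table.upsert({"id": 7, "data": "updated value"})
second_scan = table.scan()

print(first_scan)
print(second_scan)
print(len(table.eq_deletes))
```

In this model the fresh INSERT is still readable, but every scan has to compare each row against the accumulated delete entries, which is the read cost I am concerned about.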
Regards,
Aditya