Hey, I had the chance to explore this area of eq-deletes recently myself too. This behavior is by design in Flink. The reason it writes an eq-delete for each insert (only in upsert mode, though) is to guarantee the uniqueness of the primary key: it first drops any previous row with the same PK via an eq-delete, then adds the new row in a new data file. Unfortunately this happens unconditionally; no read is performed first, so even if no row with the given PK existed before, Flink writes the eq-delete anyway. And yes, this can hurt read performance, so the advised best practice is to compact your table frequently.
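For the compaction itself, here is a minimal sketch, assuming you run table maintenance from Spark and that the catalog is registered there as hadoop_catalog, as in your example (Flink also ships a rewriteDataFiles action in its Java API, if you'd rather stay in Flink). Iceberg's rewrite_data_files procedure compacts the data files and applies the accumulated equality deletes while rewriting:

-- Hypothetical maintenance job for the table from your example.
-- 'delete-file-threshold' => '1' makes every data file with at least
-- one associated delete file eligible for rewriting.
CALL hadoop_catalog.system.rewrite_data_files(
  table => 'testdb.upsert_test1',
  options => map('delete-file-threshold', '1')
);

Note that older snapshots will still reference the delete files, so pairing this with a periodic expire_snapshots call keeps the file listing from growing.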
Gabor

On Wed, Apr 17, 2024 at 6:45 PM Aditya Gupta <adigu...@linkedin.com.invalid> wrote:

> Hi all,
>
> In Flink SQL, in UPSERT mode, I have observed that if I INSERT a new
> record with a new equality field Id, then an equality delete file is also
> created with the corresponding entry. For example, I executed the
> following commands in Flink SQL with Apache Iceberg:
>
> CREATE TABLE `hadoop_catalog`.`testdb`.`upsert_test1` (
>   `id` INT UNIQUE COMMENT 'unique id',
>   `data` STRING NOT NULL,
>   PRIMARY KEY(`id`) NOT ENFORCED
> ) WITH ('format-version'='2', 'write.upsert.enabled'='true');
>
> Now I inserted a record:
>
> INSERT INTO upsert_test1 VALUES (7, 'new value');
>
> It resulted in 2 files.
>
> Data file content:
>
> {"id":7,"data":"new value"}
>
> But it also created an equality delete file:
>
> {"id":7}
>
> I expect that it will create a delete file entry for UPDATE / DELETE but
> not for INSERT, as it might lead to performance degradation for reads for
> CDC tables, right?
>
> Is it expected that fresh INSERTs will also have equality delete entries?
> If yes, what is the benefit of having an equality delete entry for
> INSERTs?
>
> Regards,
>
> Aditya