t3hw commented on PR #14797:
URL: https://github.com/apache/iceberg/pull/14797#issuecomment-3747006047

   Pasting my comment from another discussion about this PR:
   
   Original commenter said:
   The primary issue with using DVs for this is identifying the position of a
deleted row in the dataset. If you receive a delete record for an existing row,
you need to scan the table to locate the file+position in order to create the
DV. That results in lots of scan behavior and data materialization on the
worker, which really isn't practical/scalable.
   That is the main issue equality deletes address, but equality deletes also
introduce a significant maintenance burden, and that maintenance isn't
available natively in KC. This means that in order to use this approach with
KC, you also need Spark or an equivalent engine to aggressively maintain the
deletes.
   Based on the PR, I'm not clear how these issues are addressed.
   
   And the response:
   My implementation of BaseDeltaWriter inherits the insertedRowMap from the
BaseEqualityDeltaWriter class. It deletes using the DV writer by default,
falling back to equality deletes when the key is not found in the map. The map
is cleared when the batch is closed and the file gets written.
   I believe RisingWave is using a similar approach.
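
   To make that routing concrete, here is a minimal sketch of the idea. It is
not the PR's actual code: the class, interface, and method names below are
illustrative stand-ins for the real writers, and the equality key is treated
as an opaque object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch of the delete routing described above (not the PR's
// actual classes): rows written in the current batch are tracked in an
// in-memory map keyed on the equality fields, so a delete for one of them
// can be expressed as a position delete (DV), while anything outside the
// batch falls back to an equality delete.
final class DeltaRoutingSketch {
  // hypothetical stand-in for "data file path + row offset"
  record PathOffset(String path, long offset) {}

  interface DataWriter { PathOffset write(Object row); }
  interface PositionDeleteWriter { void delete(String path, long offset); }
  interface EqualityDeleteWriter { void delete(Object equalityKey); }

  private final Map<Object, PathOffset> insertedRowMap = new HashMap<>();
  private final DataWriter dataWriter;
  private final PositionDeleteWriter dvWriter;
  private final EqualityDeleteWriter eqDeleteWriter;

  DeltaRoutingSketch(DataWriter data, PositionDeleteWriter dv,
                     EqualityDeleteWriter eq) {
    this.dataWriter = Objects.requireNonNull(data);
    this.dvWriter = Objects.requireNonNull(dv);
    this.eqDeleteWriter = Objects.requireNonNull(eq);
  }

  void insert(Object equalityKey, Object row) {
    // remember where this key landed within the current batch
    insertedRowMap.put(equalityKey, dataWriter.write(row));
  }

  void delete(Object equalityKey) {
    PathOffset pos = insertedRowMap.remove(equalityKey);
    if (pos != null) {
      dvWriter.delete(pos.path(), pos.offset()); // in-batch row: DV
    } else {
      eqDeleteWriter.delete(equalityKey);        // older row: equality delete
    }
  }

  void closeBatch() {
    insertedRowMap.clear(); // cleared when the batch closes and files are written
  }
}
```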
   
   If this gets accepted, the documentation should be updated to let users
know that compaction is highly recommended.
   The implementation includes `PartitionedDeltaWriter`, which routes records
to partition-specific writers. This means equality deletes are scoped to their
partition, so query engines only need to scan files within that partition, not
the entire table. For well-partitioned tables (e.g., by date), this
significantly reduces the read-time cost of equality deletes.
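
   As a rough sketch of that routing (again with illustrative names, not the
PR's code), records are fanned out to one delta writer per partition key, so
each writer's delete files only ever refer to that partition's data:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Rough sketch of per-partition fan-out (illustrative names, not the PR's
// code): each partition key gets its own writer, so the equality-delete
// files a writer emits are scoped to that partition's data files.
final class PartitionedRouterSketch<K, R> {
  interface PartitionWriter<T> extends AutoCloseable {
    void write(T record);
  }

  private final Function<R, K> partitionKeyFn;            // e.g. record -> event date
  private final Function<K, PartitionWriter<R>> factory;  // one writer per partition
  private final Map<K, PartitionWriter<R>> writers = new HashMap<>();

  PartitionedRouterSketch(Function<R, K> partitionKeyFn,
                          Function<K, PartitionWriter<R>> factory) {
    this.partitionKeyFn = partitionKeyFn;
    this.factory = factory;
  }

  void write(R record) {
    K key = partitionKeyFn.apply(record);
    writers.computeIfAbsent(key, factory).write(record);
  }

  void close() throws Exception {
    for (PartitionWriter<R> writer : writers.values()) {
      writer.close(); // flushes that partition's data and delete files
    }
    writers.clear();
  }
}
```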
   Another approach would be to keep a separate index mapping the id field to
the Parquet file, but running compactions is required regardless, so this
would largely be wasted effort: the index would need to be replicated,
persisted, or managed outside of Kafka Connect, and it would lose its state
and have to be rebuilt whenever a compaction is triggered.
   I agree the PR description was misleading about "using DVs for CDC"; DVs
only help with in-batch deduplication. I'll update the description to be
clearer.
   
   Given that equality deletes + compaction is how Flink handles CDC (and is 
the industry standard), would this approach be acceptable if we:
   1. Update the PR description to accurately describe the behavior
   2. Document that periodic compaction is required for production use
   3. Recommend proper table partitioning to minimize equality delete scan 
overhead
   4. Provide example compaction commands in the docs (see the sketch after
this list)
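
   As a starting point for item 4, here is the kind of maintenance job the
docs could show. This is not part of the PR; it is a minimal Java sketch using
Iceberg's Spark actions API, with a placeholder table name (`db.events`), and
it assumes that rewriting data files which have attached delete files is how
the accumulated deletes get folded back into the data:

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.spark.Spark3Util;
import org.apache.iceberg.spark.actions.SparkActions;
import org.apache.spark.sql.SparkSession;

// Hedged example of a periodic compaction job (table name is a placeholder).
// Rewriting data files that carry delete files applies those deletes, so
// readers no longer pay the equality-delete merge cost for the rewritten
// files. Exact options depend on the Iceberg/Spark versions in use.
public class CompactionExample {
  public static void main(String[] args) throws Exception {
    SparkSession spark = SparkSession.builder()
        .appName("iceberg-compaction")
        .getOrCreate();

    Table table = Spark3Util.loadIcebergTable(spark, "db.events");

    SparkActions.get(spark)
        .rewriteDataFiles(table)
        // rewrite any data file that has at least one delete file attached
        .option("delete-file-threshold", "1")
        .execute();

    spark.stop();
  }
}
```

   The same can be expressed in Spark SQL via the `rewrite_data_files`
procedure, typically paired with `expire_snapshots` so the old delete files
eventually get removed.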
   
   

