aokolnychyi commented on issue #3941:
URL: https://github.com/apache/iceberg/issues/3941#issuecomment-1065427093


   > Constructing pre/post images may be necessary. If the downstream is an 
external system that can handle the insert as upsert, the update_before(delete) 
can be dropped.
   
   I agree pre/post images may be necessary and I think there will be a way to 
build them (as I mentioned in the comment above). It will require more 
resources and will be tricky for equality deletes. That's why the proposal is 
not to include them by default. However, I am not sure I agree that pre/post 
images are required to apply changes to other systems.
   
   For example, consider the following CDC records (I ignored some columns).
   
   ```
    _record_type | _commit_snapshot_id | _commit_order | id  | value
   --------------+---------------------+---------------+-----+-------
    delete       | s1                  | 0             | 100 | null
    insert       | s1                  | 0             | 100 | a
    insert       | s1                  | 0             | 101 | a
   ```
   
   If the external system supports MERGE statements, we can issue the following 
command:
   
   ```
   MERGE INTO target t USING source s
   ON t.id = s.id AND s._record_type = 'delete'
   WHEN MATCHED THEN
     DELETE
   WHEN NOT MATCHED THEN
     INSERT …
   ```
   
   Even though we only had deletes/inserts, we still managed to represent an 
update in the external system.
   
   > But if the downstream requires aggregation operation like sum, the 
update_before can not be dropped.
   
   Can you elaborate a little bit more so that I have a good example to 
think through? When I explored such cases, I thought it would be sufficient to 
know which partitions/keys changed so that the aggregations affected by those 
changes can be recomputed. For instance, if you have an aggregation by day and 
you know a particular day has either deletes or new inserts, the aggregation 
for that day must be recomputed.
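   To make the idea concrete, here is a minimal Python sketch of that approach. The record layout, table state, and values below are all hypothetical, invented for illustration; the point is only that delete/insert records plus the changed keys are enough to refresh a downstream aggregation, without pre-images.

   ```python
   from collections import defaultdict

   # Hypothetical current state of the target, keyed by id: {id: (day, value)}
   table = {
       100: ("2022-03-01", 5),
       101: ("2022-03-01", 7),
       102: ("2022-03-02", 3),
   }

   # Hypothetical CDC records: (_record_type, _commit_order, id, day, value).
   # An update of id 100 is represented as a delete followed by an insert.
   changes = [
       ("delete", 0, 100, "2022-03-01", None),
       ("insert", 0, 100, "2022-03-01", 9),
   ]

   # 1. Apply changes: within the same _commit_order, deletes before inserts.
   for rtype, _, rid, day, value in sorted(changes, key=lambda c: (c[1], c[0] != "delete")):
       if rtype == "delete":
           table.pop(rid, None)
       else:
           table[rid] = (day, value)

   # 2. Recompute the daily sum only for the days touched by the changes.
   affected_days = {c[3] for c in changes}
   daily_sum = defaultdict(int)
   for day, value in table.values():
       if day in affected_days:
           daily_sum[day] += value
   ```

   Only the aggregate for `2022-03-01` is recomputed; `2022-03-02` was not touched by any change and keeps its previous result.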
   
   > Because in the current implementation, the records have no metadata like 
create_timestamp, we can't determine the time of deletion and insertion, so 
maybe we have to delete before inserting. But this is unacceptable, this will 
cause data to jitter, and users will see the data decrease, and then the data 
returns to normal.
   
   Can you give an example here too? In my examples, I assumed deletes with the 
same `_commit_order` are applied before inserts. Changes from different 
snapshots will have different `_commit_order`.
   
   ```
    _record_type | _commit_snapshot_id | _commit_order | id  | value
   --------------+---------------------+---------------+-----+-------
    insert       | s1                  | 0             | 100 | a
    delete       | s2                  | 1             | 100 | null
    insert       | s2                  | 1             | 100 | b
    insert       | s3                  | 2             | 101 | a
   ```
   
   The above example would be an append in `s1`, an update in `s2`, and another 
append in `s3`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


