aokolnychyi commented on issue #3941:
URL: https://github.com/apache/iceberg/issues/3941#issuecomment-1065427093
> Constructing pre/post images may be necessary. If the downstream is an
external system that can handle the insert as upsert, the update_before(delete)
can be dropped.
I agree pre/post images may be necessary and I think there will be a way to
build them (like I mention in the comment above). It will require more
resources and will be tricky for equality deletes. That's why the proposal is
not to include them by default. However, I am not sure I agree that pre/post
images are required to apply changes to other systems.
For example, consider the following CDC records (I ignored some columns).
```
_record_type | _commit_snapshot_id | _commit_order | id | value
-------------------------------------------------------------------
delete, s1, 0, 100, null
insert, s1, 0, 100, a
insert, s1, 0, 101, a
```
If the external system supports MERGE statements, we can issue the following
command:
```
MERGE INTO target t USING source s
ON t.id = s.id AND s._record_type = 'delete'
WHEN MATCHED
DELETE
WHEN NOT MATCHED
INSERT …
```
Even though we only had deletes/inserts, we still managed to represent an
update in the external system.
> But if the downstream requires aggregation operation like sum, the
update_before can not be dropped.
Can you elaborate a little bit more to make sure I have a good example to
think through? When I explored such cases, I though it would be sufficient to
know what partitions/keys changed so that aggregations affected by this can be
recomputed. For instance, if you have an aggregation by day and you know a
particular day has either deletes or new inserts, the aggregation must be
recomputed.
> Because in the current implementation, the records have no metadata like
create_timestamp, we can't determine the time of deletion and insertion, so
maybe we have to delete before inserting. But this is unacceptable, this will
cause data to jitter, and users will see the data decrease, and then the data
returns to normal.
Can you give an example here too? In my examples, I assumed deletes with the
same `_commit_order` are applied before inserts. Changes from different
snapshots will have different `_commit_order`.
```
_record_type | _commit_snapshot_id | _commit_order | id | value
-------------------------------------------------------------------
insert, s1, 0, 100, a
delete, s2, 1, 100, null
insert, s2, 1, 100, b
insert, s3, 2, 101, a
```
The above example would be append in `s1`, update in `s2`, another append in
`s3`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]