hameizi commented on a change in pull request #3834:
URL: https://github.com/apache/iceberg/pull/3834#discussion_r803328630
##########
File path: core/src/main/java/org/apache/iceberg/io/BaseTaskWriter.java
##########
@@ -120,12 +120,7 @@ public void write(T row) throws IOException {
// Create a copied key from this row.
StructLike copiedKey =
StructCopy.copy(structProjection.wrap(asStructLike(row)));
- // Adding a pos-delete to replace the old path-offset.
- PathOffset previous = insertedRowMap.put(copiedKey, pathOffset);
- if (previous != null) {
- // TODO attach the previous row if has a positional-delete row schema in appender factory.
- posDeleteWriter.delete(previous.path, previous.rowOffset, null);
- }
+ insertedRowMap.put(copiedKey, pathOffset);
Review comment:
@rdblue After testing, I think this change is necessary, and I think the old
logic is wrong: it writes the pos-delete only in the `write` function, but that
should happen in the `delete` function. The test case below is also puzzling:
(https://github.com/apache/iceberg/blob/9b6b5e0d2e760694e2abef73fa9036d1d8bbd014/data/src/test/java/org/apache/iceberg/io/TestTaskEqualityDeltaWriter.java#L324).
That test writes a duplicate key and reads back just one row, but users should
avoid writing duplicate keys. When there are duplicate inserts, they should
rely on the `write.upsert.enable` config
(https://github.com/apache/iceberg/pull/2863) to get delete semantics rather
than on write semantics, which is the wrong semantics here. So this PR moves
the pos-delete logic from the `write` function to the `delete` function.
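
To make the intended behavior concrete, here is a minimal sketch of the semantics described above. It is not the actual `BaseTaskWriter` code: `DeltaWriterSketch`, the string keys, the `data-file-1` path, and the list standing in for `posDeleteWriter` are all hypothetical simplifications. Only `insertedRowMap` and the path/offset pairing mirror the diff.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the proposed behavior; not Iceberg's real writer.
class DeltaWriterSketch {
  // Position of a row inside a data file (stands in for PathOffset in the diff).
  static final class PathOffset {
    final String path;
    final long rowOffset;

    PathOffset(String path, long rowOffset) {
      this.path = path;
      this.rowOffset = rowOffset;
    }

    @Override
    public String toString() {
      return path + "@" + rowOffset;
    }
  }

  private final Map<String, PathOffset> insertedRowMap = new HashMap<>();
  // Stand-in for posDeleteWriter: records which positions were deleted.
  final List<String> posDeletes = new ArrayList<>();
  private long nextOffset = 0;

  // write only remembers the latest position for the key; it never emits a pos-delete.
  void write(String key) {
    insertedRowMap.put(key, new PathOffset("data-file-1", nextOffset++));
  }

  // delete emits a pos-delete if the key was written earlier by this writer.
  // Returns false when there is no known position to delete.
  boolean delete(String key) {
    PathOffset previous = insertedRowMap.remove(key);
    if (previous == null) {
      return false;
    }
    posDeletes.add(previous.toString());
    return true;
  }

  public static void main(String[] args) {
    DeltaWriterSketch writer = new DeltaWriterSketch();
    writer.write("id=1");   // row at offset 0
    writer.delete("id=1");  // pos-delete for offset 0
    writer.write("id=1");   // row at offset 1
    writer.write("id=1");   // duplicate insert at offset 2, no pos-delete
    System.out.println(writer.posDeletes); // prints [data-file-1@0]
  }
}
```

In this sketch an upsert is an explicit `delete` followed by a `write`, so a plain duplicate `write` leaves both rows in place instead of silently replacing the earlier one.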