UrsSchoenenbergerNu commented on issue #15846:
URL: https://github.com/apache/iceberg/issues/15846#issuecomment-4168912482

   We believe we have a test case reproducing this issue in `TestIcebergFilesCommitter`:
   
   ```java
     @TestTemplate
     public void testLateWriteResultsForFailedCheckpointCausesDataLoss() throws Exception {
       long timestamp = 0;
       JobID jobId = new JobID();
       OperatorID operatorId;
       try (OneInputStreamOperatorTestHarness<FlinkWriteResult, Void> harness =
                    createStreamSink(jobId)) {
         harness.setup();
         harness.open();
         operatorId = harness.getOperator().getOperatorID();
   
         assertSnapshotSize(0);
         assertMaxCommittedCheckpointId(jobId, operatorId, -1L);
   
         RowData rowA = SimpleDataUtil.createRowData(1, "writer-2-early");
         DataFile dataFileA = writeDataFile("data-A", ImmutableList.of(rowA));
   
         RowData rowB = SimpleDataUtil.createRowData(2, "writer-1-late");
         DataFile dataFileB = writeDataFile("data-B", ImmutableList.of(rowB));
   
         RowData rowC = SimpleDataUtil.createRowData(3, "data-cp2");
         DataFile dataFileC = writeDataFile("data-C", ImmutableList.of(rowC));
   
         long cp1 = 1;
         long cp2 = 2;
   
         harness.processElement(of(cp1, dataFileA), ++timestamp);
         // checkpoint barrier for checkpoint 1 arrives from one upstream
         harness.snapshot(cp1, ++timestamp);
         harness.processElement(of(cp1, dataFileB), ++timestamp);
   
         // checkpoint 1 times out: notifyCheckpointComplete(1) is never called.
         harness.processElement(of(cp2, dataFileC), ++timestamp);
         harness.snapshot(cp2, ++timestamp);
         harness.notifyOfCompletedCheckpoint(cp2);
   
         // this assertion now fails: rowA is not present in the table
         SimpleDataUtil.assertTableRows(table, ImmutableList.of(rowA, rowB, rowC), branch);
         assertMaxCommittedCheckpointId(jobId, operatorId, cp2);
       }
     }
   ```
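
To make the sequence of events in the test easier to reason about, here is a minimal, self-contained sketch that replays the same steps against a toy committer. Everything in it is hypothetical: `LateResultModel`, its fields, and the `merge` flag are illustration only and are not Iceberg's `IcebergFilesCommitter`. It assumes, without confirmation from the comment above, that pending results are staged in a map keyed by checkpoint ID and that staging a second batch for the same ID replaces the first. Under that assumption the late result for checkpoint 1 displaces the early one, which would match the missing `rowA`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical toy model of a checkpoint-driven committer. NOT Iceberg code.
class LateResultModel {
  // Results received since the last snapshot, grouped by the checkpoint ID they carry.
  private final Map<Long, List<String>> sinceLastSnapshot = new TreeMap<>();
  // Files staged per checkpoint ID, waiting for a completed-checkpoint notification.
  private final NavigableMap<Long, List<String>> staged = new TreeMap<>();
  private final List<String> committed = new ArrayList<>();
  // Assumption for illustration: false = a later batch replaces an earlier one, true = append.
  private final boolean merge;

  LateResultModel(boolean merge) {
    this.merge = merge;
  }

  void processElement(long checkpointId, String file) {
    sinceLastSnapshot.computeIfAbsent(checkpointId, k -> new ArrayList<>()).add(file);
  }

  // The snapshot's own ID is unused in this toy model; entries are staged under the ID they carry.
  void snapshot(long checkpointId) {
    for (Map.Entry<Long, List<String>> e : sinceLastSnapshot.entrySet()) {
      if (merge) {
        staged.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
      } else {
        // Replaces anything already staged under the same checkpoint ID.
        staged.put(e.getKey(), new ArrayList<>(e.getValue()));
      }
    }
    sinceLastSnapshot.clear();
  }

  void notifyOfCompletedCheckpoint(long checkpointId) {
    // Commit everything staged up to and including the completed checkpoint.
    NavigableMap<Long, List<String>> ready = staged.headMap(checkpointId, true);
    for (List<String> files : ready.values()) {
      committed.addAll(files);
    }
    ready.clear();
  }

  List<String> committed() {
    return committed;
  }

  // Replays the event order used in the test above.
  static List<String> replay(boolean merge) {
    LateResultModel model = new LateResultModel(merge);
    model.processElement(1, "data-A");
    model.snapshot(1);                 // barrier for checkpoint 1 from one upstream
    model.processElement(1, "data-B"); // late result still tagged with checkpoint 1
    model.processElement(2, "data-C"); // checkpoint 1 is never notified as complete
    model.snapshot(2);
    model.notifyOfCompletedCheckpoint(2);
    return model.committed();
  }

  public static void main(String[] args) {
    System.out.println("overwrite: " + replay(false)); // [data-B, data-C]
    System.out.println("merge:     " + replay(true));  // [data-A, data-B, data-C]
  }
}
```

In the toy model, appending to the staged entry instead of replacing it keeps all three files. Whether the real committer behaves this way, and whether appending is the right fix there, would need to be checked against the actual implementation.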

