rionmonster commented on code in PR #2265:
URL: https://github.com/apache/fluss/pull/2265#discussion_r2659761272


##########
fluss-lake/fluss-lake-iceberg/src/test/java/org/apache/fluss/lake/iceberg/maintenance/IcebergRewriteITCase.java:
##########
@@ -186,6 +186,9 @@ void testLogTableCompaction() throws Exception {
                             t1, t1Bucket, ++i, true, Collections.singletonList(row(1, "v1"))));
             checkFileStatusInIcebergTable(t1, 3, false);
 
+            // Ensure tiering job has fully processed the previous writes
+            assertReplicaStatus(t1Bucket, i);

Review Comment:
   @luoyuxia 
   
   I added some additional checks around these assertions to better identify 
the underlying issue. Judging from the additional diagnostics, the files don't 
appear to be written or recognized by Iceberg yet, which I suspect is simply a 
race condition:
   
   ```
   [ASSERTION FAILURE] Expected offset 3 but got 2 for bucket TableBucket{tableId=21, bucket=0}
     Replica Lake Snapshot ID: 6090342046054561204
     Current State:
       Iceberg Files: 2
       Iceberg Snapshot ID: 6090342046054561204
       Lake Snapshot ID (from admin): 6090342046054561204
       Replica Lake Snapshot ID: 6090342046054561204
       Replica Lake Log End Offset: 2 (expected: 3, diff: 1)
   ```
   
   Said differently, the replica offset is off by one, which means one of the 
following: the tiering job hasn't processed all writes yet (check monitor 
output above), the offset update is lagging behind file commits (a race 
condition), or the tiering job itself is stuck or running slowly.
   
   I think we could help resolve this in two ways, which I'm testing now and 
will push to the PR:
   1. Introduce a `waitForTieringToProcess` helper function that verifies 
tiering has run before we go on to validate the replica statuses.
   2. Reduce the test-specific freshness and/or tiering intervals (e.g., 
`POLL_TIERING_TABLE_INTERVAL`) from their current value of 500ms to 100ms 
(optional; may not be required after introducing the proposed wait helper).
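   
   To make option 1 concrete, here is a rough sketch of what such a wait helper 
could look like. It is only an illustration under assumptions: the class name 
`TieringWait`, the `LongSupplier` parameter, and the timeout and poll-interval 
arguments are all hypothetical, and how the replica's lake log end offset is 
actually read is left to the caller rather than tied to any Fluss API.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Hypothetical test utility (not an existing Fluss class). */
public final class TieringWait {

    private TieringWait() {}

    /**
     * Polls {@code currentOffset} until it reaches {@code expectedOffset},
     * failing with an AssertionError once {@code timeout} has elapsed.
     *
     * @param currentOffset supplies the latest observed lake log end offset
     *     (assumption: the caller knows how to read it for the bucket)
     */
    public static void waitForTieringToProcess(
            LongSupplier currentOffset,
            long expectedOffset,
            Duration timeout,
            Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        long observed = currentOffset.getAsLong();
        while (observed < expectedOffset) {
            // Overflow-safe comparison against the nanoTime-based deadline.
            if (System.nanoTime() - deadline >= 0) {
                throw new AssertionError(
                        "Tiering did not reach offset " + expectedOffset
                                + " within " + timeout
                                + " (last observed: " + observed + ")");
            }
            Thread.sleep(pollInterval.toMillis());
            observed = currentOffset.getAsLong();
        }
    }
}
```

   The test would call this right before `assertReplicaStatus(t1Bucket, i)`, 
passing a supplier that returns the bucket's current lake log end offset. 
Failing with the last observed offset keeps the kind of diagnostics shown above 
when the wait times out.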
   
   Thoughts?


