rionmonster commented on code in PR #2265:
URL: https://github.com/apache/fluss/pull/2265#discussion_r2659761272
##########
fluss-lake/fluss-lake-iceberg/src/test/java/org/apache/fluss/lake/iceberg/maintenance/IcebergRewriteITCase.java:
##########
@@ -186,6 +186,9 @@ void testLogTableCompaction() throws Exception {
t1, t1Bucket, ++i, true,
Collections.singletonList(row(1, "v1"))));
checkFileStatusInIcebergTable(t1, 3, false);
+ // Ensure tiering job has fully processed the previous writes
+ assertReplicaStatus(t1Bucket, i);
Review Comment:
@luoyuxia
I added some additional checks around these assertions to better identify the
underlying issue. It doesn't look like the files are being written or
recognized by Iceberg, which, judging by the additional diagnostics, I suspect
is simply a race condition:
```
[ASSERTION FAILURE] Expected offset 3 but got 2 for bucket TableBucket{tableId=21, bucket=0}
Replica Lake Snapshot ID: 6090342046054561204
Current State:
Iceberg Files: 2
Iceberg Snapshot ID: 6090342046054561204
Lake Snapshot ID (from admin): 6090342046054561204
Replica Lake Snapshot ID: 6090342046054561204
Replica Lake Log End Offset: 2 (expected: 3, diff: 1)
```
Said differently, the replica offset being off by one means one of three
things: the tiering job hasn't processed all writes yet (check the monitor
output above), the offset update is lagging behind the file commits (a race
condition), or the tiering job itself is stuck or running slowly.
I think we could help resolve this in two ways, both of which I'm testing now
before updating the PR:
1. Introduce a `waitForTieringToProcess` helper function that verifies the
tiering job has run before we continue on to validate the replica statuses.
2. Reduce the test-specific freshness and/or tiering intervals (e.g.,
`POLL_TIERING_TABLE_INTERVAL`) from their current values of 500ms to 100ms.
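For the first option, here is a rough sketch of what I have in mind. Only the name `waitForTieringToProcess` comes from the proposal above; the signature, the `TieringWait` class, and the `LongSupplier` standing in for however the test reads the replica's lake log end offset are all assumptions for illustration, not existing Fluss APIs:

```java
import java.time.Duration;
import java.util.function.LongSupplier;

// Hypothetical holder class; in the real test this would likely live in the test base.
final class TieringWait {

    private TieringWait() {}

    /**
     * Polls the observed offset until it reaches the expected offset, or fails
     * with an AssertionError once the timeout has elapsed.
     *
     * @param currentOffset assumed accessor for the replica's lake log end offset
     */
    static void waitForTieringToProcess(
            LongSupplier currentOffset,
            long expectedOffset,
            Duration timeout,
            Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        long observed = currentOffset.getAsLong();
        while (observed < expectedOffset) {
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                throw new AssertionError(
                        "Tiering did not reach offset "
                                + expectedOffset
                                + " within "
                                + timeout
                                + " (last observed: "
                                + observed
                                + ")");
            }
            Thread.sleep(pollInterval.toMillis());
            observed = currentOffset.getAsLong();
        }
    }
}
```

The idea would be to call this right before `assertReplicaStatus(t1Bucket, i)`, so the assertion only runs once the offset has caught up, and a genuinely stuck tiering job fails with the last observed offset instead of an off-by-one mismatch.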
Thoughts? I’d probably err on the side of applying both since it’s a flaky
test and the more consistency, the better.