morningman commented on PR #63311:
URL: https://github.com/apache/doris/pull/63311#issuecomment-4610856871
## 1. Make the test fail on the unfixed code
This is the most important point. The new `testEraseUsesFreshCurrentTime`
**passes against both the buggy and the fixed code**, so it does not actually
verify this PR.
Why: the test calls `Thread.sleep(1500)` *before* invoking
`runAfterCatalogReady()`. The recycled table and database get `recycleTime ≈
now` at setup, and `catalog_trash_expire_second` is set to 1s. By the time
`runAfterCatalogReady()` runs and captures its timestamp — even the *stale* one
captured at method entry in the old code — roughly 1500ms have already elapsed:
```
latency = currentTimeMs(stale, at method entry) - recycleTime ≈ 1500ms >
expire(1000ms) → already expired
```
So the old shared-timestamp code also sees the table/database as expired and
erases them; every assertion still passes. This bug only manifests when an item
crosses the expiry threshold *during* the slow `erasePartition` call — a
condition this test never creates, because it crosses the threshold *before*
the daemon method even starts.
> Verification the author can run: revert the production change back to the
shared `currentTimeMs` and run only this test — it still passes. A test that
cannot fail when the business logic is broken does not prove the fix works.
**Recommended fix — abstract the time source so the test is deterministic**
(no wall-clock dependence, no need for 1000 partitions):
```java
// Production side: make the clock injectable (illustrative)
private LongSupplier clock = System::currentTimeMillis;
@VisibleForTesting
void setClock(LongSupplier clock) { this.clock = clock; }
protected void runAfterCatalogReady() {
int keepNum = Config.max_same_name_catalog_trash_num;
erasePartition(clock.getAsLong(), keepNum);
eraseTable(clock.getAsLong(), keepNum);
eraseDatabase(clock.getAsLong(), keepNum);
}
```
In the test, drive the clock so the second/third reads return a value
noticeably larger than the first, and set the table/database `recycleTime` so
they are **not expired at the first read but expired at the later reads**. That
deterministically distinguishes fixed from broken behavior.
If changing the production signature is undesirable, the alternative is to
construct the real "threshold crossed *during* `erasePartition`" condition
(remove the pre-sleep, and make `erasePartition` genuinely take longer than the
expiry window). This path is inherently flaky (see #2) and is not recommended.
## 2. [Should fix] Wall-clock–based timing makes the test flaky even once
the logic is corrected
The test assumes "erasing 1000 partitions is slow enough to cross the expiry
threshold." In a unit test, erasing 1000 **in-memory** partitions will very
likely complete well under the 1-second threshold, so the "slow operation"
premise may not hold — making the test both ineffective and dependent on
machine/load. Timing bugs tested against a real wall clock are fragile by
nature. Prefer abstracting the time source (a controllable fake clock, as in
#1) so the outcome is deterministic.
## 3. [Should fix] Test pollution — restore
`Config.max_same_name_catalog_trash_num`
The test sets `Config.max_same_name_catalog_trash_num = -1` *outside* the
`try` block and never restores it. The default is `3`, and it is a static
global, so leaving it at `-1` can cause order-dependent failures in other tests
sharing the JVM. The `finally` already restores the other two configs — add
this one too:
```java
int origMaxSameName = Config.max_same_name_catalog_trash_num;
Config.max_same_name_catalog_trash_num = -1;
...
} finally {
Config.max_same_name_catalog_trash_num = origMaxSameName;
Config.catalog_trash_expire_second = origExpireSecond;
Config.catalog_trash_ignore_min_erase_latency = origIgnoreMinErase;
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]