fightBoxing opened a new pull request, #4210:
URL: https://github.com/apache/amoro/pull/4210

   ## Why are the changes needed?
   
   Closes #4006.
   
   For keyed **Mixed-Iceberg** tables managed by AMS's internal REST catalog, 
the **base store** and the **change store** share the same server-side 
`TableMetadata` record (and therefore the same `meta_version`). In 
`MixedTableMaintainer.expireSnapshots()` the change maintainer runs **before** 
the base maintainer:
   
   ```java
   if (changeMaintainer != null) {
     changeMaintainer.expireSnapshots();   // bumps the shared meta_version
   }
   baseMaintainer.expireSnapshots();       // still holds a stale Table 
reference
   ```
   
   After the change maintainer commits, the shared `meta_version` has been 
bumped on the server side, but the `baseMaintainer.table` reference was 
captured at construction time and its cached `TableMetadata` still points to 
the old version. The subsequent base commit then trips the optimistic-lock 
check and fails with `CommitFailedException`, so **base-store snapshots are 
never actually expired** — only the change store is cleaned up, causing steady 
metadata growth over time as reported in the issue.
   
   ## Brief change log
   
   ### Fix — `MixedTableMaintainer.java`
   * After `changeMaintainer.expireSnapshots()` (and its `@VisibleForTesting` 
overload) finishes, refresh the base maintainer's cached Iceberg `Table` before 
running the base expiration, so the next commit starts from the latest 
`meta_version`.
   * The refresh is wrapped in a `try/catch` that logs a warning on failure — a 
transient refresh error must not mask the original error signal coming from 
`baseMaintainer.expireSnapshots()`.
   
   ### Regression tests — `TestSnapshotExpire.java`
   1. **`testRefreshBaseTableBeforeBaseExpiration`** — wraps 
`baseMaintainer.table` with a Mockito spy (injected via reflection) and 
verifies that `refresh()` is invoked during `expireSnapshots()`. The pre-fix 
code path does not call `refresh()` inside the expire-snapshots flow, so **this 
test fails without the fix and passes with it** (verified locally by reverting 
`MixedTableMaintainer.java` to `upstream/master`).
   2. **`testExpireSnapshotsShrinksBothStores`** — end-to-end smoke test that 
drives `expireSnapshots()` on a keyed table with multiple expirable snapshots 
on both stores and asserts that both snapshot lists shrink. Acts as a 
happy-path safety net.
   
   > Note: The default `BasicCatalogTestHelper` backend used by this test class 
keeps base and change stores in **independent** Iceberg tables (no shared 
`meta_version`), so it cannot physically reproduce the server-side conflict. 
The first test therefore asserts the fix at the **behavioural** level — that 
`refresh()` is called on the base maintainer's cached table exactly where the 
fix introduces it.
   
   ## How was this patch tested?
   
   Locally with JDK 11 + Maven against the reactor:
   
   ```
   mvn -pl amoro-ams -am surefire:test \
       
-Dtest='TestSnapshotExpire#testRefreshBaseTableBeforeBaseExpiration+testExpireSnapshotsShrinksBothStores'
   ```
   
   | Scenario | `testRefreshBaseTableBeforeBaseExpiration` | 
`testExpireSnapshotsShrinksBothStores` |
   |---|---|---|
   | With the fix applied | ✅ PASS (4/4 keyed parameterised instances) | ✅ PASS 
|
   | Fix reverted to `upstream/master` | ❌ FAIL — `refresh()` never called | ✅ 
PASS (expected — HadoopCatalog does not share `meta_version`) |
   
   Final result with the fix in place:
   
   ```
   Tests run: 8, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 8.937 s
   [INFO] BUILD SUCCESS
   ```
   
   (8 = 4 parameterised configs × 2 methods; 4 skipped are non-keyed configs 
gated by `Assume.assumeTrue(isKeyedTable())`.)
   
   Also verified:
   * `mvn spotless:apply` — no format violations.
   * `mvn -pl amoro-ams -am test-compile` — `BUILD SUCCESS` across the whole 
reactor.
   
   ## Documentation
   
   Does this PR introduce any user-facing change?
   
     - [ ] Yes.
     - [x] No. Internal bug fix; no API, config, or docs change.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to