Copilot commented on code in PR #9869:
URL: https://github.com/apache/ozone/pull/9869#discussion_r2925545775
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotCache.java:
##########
@@ -323,8 +329,14 @@ private UncheckedAutoCloseableSupplier<OMLockDetails>
lock(Supplier<OMLockDetail
AtomicReference<OMLockDetails> lockDetails = new
AtomicReference<>(emptyLockFunction.get());
if (lockDetails.get().isLockAcquired()) {
- if (!cleanupFunction.get()) {
+ try {
+ if (!cleanupFunction.get()) {
+ lockDetails.set(emptyUnlockFunction.get());
+ throw new IllegalStateException("Failed to acquire lock as cleanup
did not drain the cache.");
+ }
Review Comment:
`lock(...)` now throws an IllegalStateException when cleanup does not fully
drain the cache. Since this is a public API and some callers (e.g., checkpoint
streaming) may not be expecting a runtime exception here, it would be safer to
either (a) document this behavior explicitly in the `lock()`/`lock(UUID)`
javadocs, or (b) preserve the previous contract by returning non-acquired lock
details instead of throwing.
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotCache.java:
##########
@@ -377,26 +389,25 @@ private synchronized Void cleanup(UUID evictionKey,
boolean expectKeyToBePresent
}
dbMap.compute(evictionKey, (k, v) -> {
- pendingEvictionQueue.remove(k);
+ ReferenceCounted<OmSnapshot> result = null;
if (v == null) {
- throw new IllegalStateException("SnapshotId '" + k + "' does not exist
in cache. The RocksDB " +
- "instance of the Snapshot may not be closed properly.");
- }
-
- if (v.getTotalRefCount() > 0) {
+ LOG.warn("SnapshotId '{}' does not exist in cache. The RocksDB " +
+ "instance of the Snapshot may not be closed properly.", k);
Review Comment:
`cleanup()` now treats a missing cache entry as non-fatal (to handle stale
eviction keys), but still logs it at WARN with a message implying a RocksDB
resource leak. Since this case is now expected/benign (see the new
stale-eviction-key test), consider lowering this to DEBUG/INFO and/or updating
the message to reflect that the key may have been invalidated and already
closed.
```suggestion
LOG.debug("SnapshotId '{}' is not present in cache during cleanup; "
+ "it may have already been invalidated, closed, and removed.",
k);
```
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotCache.java:
##########
@@ -323,8 +329,14 @@ private UncheckedAutoCloseableSupplier<OMLockDetails>
lock(Supplier<OMLockDetail
AtomicReference<OMLockDetails> lockDetails = new
AtomicReference<>(emptyLockFunction.get());
if (lockDetails.get().isLockAcquired()) {
- if (!cleanupFunction.get()) {
+ try {
+ if (!cleanupFunction.get()) {
+ lockDetails.set(emptyUnlockFunction.get());
+ throw new IllegalStateException("Failed to acquire lock as cleanup
did not drain the cache.");
+ }
+ } catch (Throwable t) {
lockDetails.set(emptyUnlockFunction.get());
+ throw t;
Review Comment:
In the cleanup-not-drained path, the write lock is released twice: once
inside the `if (!cleanupFunction.get())` block and again in the surrounding
`catch (Throwable t)` (which also runs for the same thrown
IllegalStateException). This can lead to double-unlock behavior and potentially
corrupt lock state / metrics. Consider removing the unlock call inside the `if`
block and letting the catch/finally handle releasing exactly once (or track a
boolean to ensure unlock happens once).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]