adoroszlai opened a new pull request, #4866:
URL: https://github.com/apache/ozone/pull/4866

   ## What changes were proposed in this pull request?
   
   `TestScmHAFinalization` intermittently reports `Found 2 leaked objects` 
(`CodecBuffer` instances).  `FinalizationStateManagerImpl` is leaking a pair of 
`CodecBuffer`s allocated for writing layout version to SCM metadata store:
   
   ```
   2023-06-08 19:50:12,664 [Finalizer] WARN  db.CodecBuffer 
(CodecBuffer.java:finalize(129)) - LEAK 1: 
org.apache.hadoop.hdds.utils.db.CodecBuffer@292f4cc4, refCnt=1, capacity=3 
allocation:
   org.apache.hadoop.hdds.utils.db.CodecBuffer.allocate(CodecBuffer.java:74)
   ...
   org.apache.hadoop.hdds.utils.db.TypedTable.putWithBatch(TypedTable.java:172)
   
org.apache.hadoop.hdds.scm.ha.SCMHADBTransactionBufferImpl.addToBuffer(SCMHADBTransactionBufferImpl.java:70)
   
org.apache.hadoop.hdds.scm.server.upgrade.FinalizationStateManagerImpl.finalizeLayoutFeatureLocal(FinalizationStateManagerImpl.java:156)
   
org.apache.hadoop.hdds.scm.server.upgrade.FinalizationStateManagerImpl.reinitialize(FinalizationStateManagerImpl.java:257)
   ...
   
org.apache.hadoop.hdds.scm.server.upgrade.FinalizationManagerImpl.reinitialize(FinalizationManagerImpl.java:147)
   
org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl.startServices(SCMHAManagerImpl.java:445)
   
org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl.reloadSCMState(SCMHAManagerImpl.java:357)
   
org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl.installCheckpoint(SCMHAManagerImpl.java:308)
   
org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl.installCheckpoint(SCMHAManagerImpl.java:258)
   
org.apache.hadoop.hdds.scm.ha.SCMStateMachine.reinitialize(SCMStateMachine.java:391)
   
org.apache.ratis.server.impl.StateMachineUpdater.reload(StateMachineUpdater.java:218)
   ```
   
   `finalizeLayoutFeatureLocal` adds these to the `transactionBuffer`, 
collecting them in a batch operation:
   
   
https://github.com/apache/ozone/blob/b479a384e854b6eb62c429801ec3ce2acfa3c160/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/upgrade/FinalizationStateManagerImpl.java#L156-L157
   
   `CodecBuffer`s are released on commit, which happens when transaction buffer 
is flushed.
   
   
https://github.com/apache/ozone/blob/b479a384e854b6eb62c429801ec3ce2acfa3c160/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBBatchOperation.java#L333-L339
   
   The problem is that the buffer may be closed without being flushed.  In this 
case the contents of the current batch, if any, are not committed.  (The batch 
is closed, so RocksDB objects are not leaked.)
   
   We could fix it by explicitly flushing the buffer when finalization 
completes.  However, the problem may be more generic, any metadata stored by 
SCM via the transaction buffer may be lost.  So this PR proposes to commit the 
in-progress batch operation, if any, before closing it.
   
   Also:
    * Add a `closed` flag.  Allow closing only once.  Reject operations if 
closed.
    * Add precondition for `currentBatchOperation` being non-null.
    * Replace `AtomicLong txFlushPending` with a `boolean`, since the specific 
number is not important, only `> 0` or `== 0` cases are distinguished.
    * Change `SCMMetadataStore` to `final` both in 
`SCMHADBTransactionBufferImpl` and in `StorageContainerManager`
   
   https://issues.apache.org/jira/browse/HDDS-8740
   
   ## How was this patch tested?
   
   `TestScmHAFinalization#testSnapshotFinalization` passed in 30x10 runs:
   https://github.com/adoroszlai/hadoop-ozone/actions/runs/5221131625
   
   Regular CI:
   https://github.com/adoroszlai/hadoop-ozone/actions/runs/5221128846


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to