adoroszlai opened a new pull request, #4269:
URL: https://github.com/apache/ozone/pull/4269

   ## What changes were proposed in this pull request?
   
   Attempt to fix `ManagedChannelImpl` leaks related to EC.
   
   1. Revert parts of the previous attempt (#4215).  Rationale: Invalidating 
the client removes it from the cache.  Doing so upon `close()` defeats the 
purpose of the cache.
      * However, keep `cleanup` call in `BlockOutputStream#close` regardless of 
`bufferPool` is empty or not.
   2. Fixes in prod. code:
      * `ECReconstructionCoordinator` did not close output streams if error 
happened during write (`try-finally` only covered `putBlock`, not `write`)
      * `ECKeyOutputStream#markStreamClosed` (called on exceptions) cleared the 
list of output stream entries without closing them.  It also set the `closed` 
flag, which prevented the real `close()` from doing any cleanup.
      * `BlockOutputStreamEntry#cleanup` initializes stream if needed 
(`checkStream`), then cleans it up.  `ECBlockOutputStreamEntry` overrides 
`checkStream` to create multiple streams but cleanup in parent class still 
happens for only one stream.  Fix it by overriding `cleanup`.
      * `BlockInputStream#acquireClient` did not check if client is already 
created, hence may leak previous one if called multiple times.
      * `BlockInputStream#getChunkInfos` released client only on failure, 
delaying release until `close()` on happy path.  Since the client is no longer 
needed in any case, it can be released early.
      * Fix the mismatch in `BlockInputStream`, where the client was acquired 
for "read", but released for "write".  Since `invalidateClient` is false, this 
does not make any real difference.
   3. Some fixes in test code to close streams/clients (part of that is from 
the previous patch).  The warnings due to these unclosed streams/clients may 
hide real problems.  There are many-many more, filed HDDS-7958 to keep this 
patch manageable in size.
   
   https://issues.apache.org/jira/browse/HDDS-7931
   
   ## How was this patch tested?
   
   Ran several tests:
   
   ```
   mvn clean test -am -pl :ozone-integration-test 
-Dtest='Test*OutputStream*,Test*InputStream*,TestECContainerRecovery,TestOzoneRpcClient,TestSecureOzoneRpcClient,TestContainerCommandsEC,TestOzoneFileSystem,TestRootedOzoneFileSystem'
   ```
   
   Verified no shutdown warning about `ManagedChannelImpl` related to 
`XceiverClientGrpc` appears in the logs.
   
   Regular CI:
   https://github.com/adoroszlai/hadoop-ozone/actions/runs/4143279616


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to