slfan1989 commented on PR #1267:
URL: https://github.com/apache/ratis/pull/1267#issuecomment-2896262500

   @szetszwo @adoroszlai While upgrading the ratis-examples module, I 
found that the `workerGroup` might not be shut down properly, which 
causes unit tests to fail with the following exception.
   
   ```
   2025-05-21 05:09:14,262 WARN  util.LeakDetector (LeakDetector.java:assertNoLeaks(175)) - 29/30) numLeaks == 1 > 0, will wait and retry ...
   2025-05-21 05:09:15,265 WARN  util.ReferenceCountedLeakDetector (ReferenceCountedLeakDetector.java:logLeakMessage(168)) - LEAK: (class org.apache.ratis.thirdparty.io.netty.channel.nio.NioEventLoopGroup, count=3, value=org.apache.ratis.thirdparty.io.netty.channel.nio.NioEventLoopGroup@756cf158)
   
   java.lang.IllegalStateException: #leaks = 1 > 0, #leaks == set.size = 1
   
        at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:77)
        at org.apache.ratis.util.LeakDetector$LeakTrackerSet.assertNoLeaks(LeakDetector.java:100)
        at org.apache.ratis.util.LeakDetector$LeakTrackerSet.getNumLeaks(LeakDetector.java:94)
        at org.apache.ratis.util.LeakDetector.assertNoLeaks(LeakDetector.java:178)
        at org.apache.ratis.server.impl.MiniRaftCluster.shutdown(MiniRaftCluster.java:892)
        at org.apache.ratis.grpc.MiniRaftClusterWithGrpc.shutdown(MiniRaftClusterWithGrpc.java:97)
        at org.apache.ratis.examples.filestore.FileStoreStreamingBaseTest.testFileStoreStreamSingleFile(FileStoreStreamingBaseTest.java:83)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
   ```
   
   After debugging and analysis, I found that the issue was caused by the 
`workerGroup` in `NettyClientStreamRpc` not releasing its resources.
   
   
https://github.com/apache/ratis/blob/0557974fa2d7409e9f5089a780a4b6024c53ec99/ratis-netty/src/main/java/org/apache/ratis/netty/client/NettyClientStreamRpc.java#L234-L242
   
   This code seems reasonable to me: running the shutdown asynchronously 
can improve performance, but there is a real risk that resources are not 
fully released.
   
   I added a small piece of code that, after the connection is closed, checks 
whether the connection's `workerGroup` has been terminated. If it has not, I 
call the `shutdownGracefully` method.
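
For readers without the diff open, the check described above might look roughly like the sketch below. This is not the code from the PR. It uses a plain JDK `ExecutorService` as a stand-in so that it runs without Netty on the classpath (Netty's `EventLoopGroup` also implements `ScheduledExecutorService`). The class name `WorkerGroupShutdownSketch`, the helper `ensureTerminated`, and the timeout are hypothetical. With a real Netty group, the `shutdown()` call would be `shutdownGracefully()` instead.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: a JDK executor stands in for the Netty workerGroup.
public class WorkerGroupShutdownSketch {

  /**
   * Hypothetical helper: after a connection has been closed, check whether its
   * worker group has terminated; if not, request shutdown and wait (bounded).
   *
   * @return true if the group is terminated when this method returns
   */
  static boolean ensureTerminated(ExecutorService workerGroup, long timeoutMs)
      throws InterruptedException {
    if (!workerGroup.isTerminated()) {
      // With Netty this would be workerGroup.shutdownGracefully().
      workerGroup.shutdown();
    }
    // Returns immediately with true if the group has already terminated.
    return workerGroup.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
  }

  public static void main(String[] args) throws InterruptedException {
    // Stand-in for the workerGroup owned by a connection.
    ExecutorService workerGroup = Executors.newFixedThreadPool(2);
    workerGroup.submit(() -> { /* simulated I/O work */ });

    // Assume the connection was closed here without stopping the group.
    boolean terminated = ensureTerminated(workerGroup, 5_000);
    System.out.println("terminated = " + terminated);
  }
}
```

Waiting with a bound, rather than only requesting shutdown, is one way to make sure the group is gone before a leak check such as `assertNoLeaks` runs. Whether blocking is acceptable on the real close path is a separate design question.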


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.