CRZbulabula opened a new pull request, #1466:
URL: https://github.com/apache/ratis/pull/1466

   ## Summary
   
   Fixes [RATIS-2529](https://issues.apache.org/jira/browse/RATIS-2529): the gRPC 
worker thread count grows to ``availableProcessors * 2`` after a restarted 
follower catches up, and never shrinks back.
   
   - Add ``raft.grpc.server.worker.event-loop.threads`` and 
``raft.grpc.client.worker.event-loop.threads`` (default ``0``, which keeps the 
current gRPC behavior).
   - When ``> 0``, build a dedicated ``EpollEventLoopGroup`` (or 
``NioEventLoopGroup`` when Epoll is unavailable) of that size and wire it into 
both the server ``NettyServerBuilder``s and the client / server-to-server 
``NettyChannelBuilder``s.
   - One boss group and one worker group are shared across the admin / client / 
server ``NettyServerBuilder``s and the ``GrpcServerProtocolClient`` instances 
managed by one ``GrpcServicesImpl``; both groups are shut down in ``closeImpl()``.
   
   This lets operators cap the worker thread count (e.g. 4–8) so a follower 
catch-up burst can't permanently expand the shared gRPC default 
``EventLoopGroup`` (which never shrinks its threads once started).
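
The sizing rule described above can be sketched in plain Java. This is only an illustration, not code from this PR: the class ``WorkerThreadSizing`` and its methods are hypothetical, ``java.util.Properties`` stands in for the Ratis configuration object, and only the two key names, the ``0`` default, and the ``availableProcessors * 2`` ceiling come from the description.

```java
import java.util.Properties;

/** Hypothetical stand-in for the sizing rule in this description; not the PR's code. */
public final class WorkerThreadSizing {
  // Key names as given in the PR description.
  public static final String SERVER_KEY = "raft.grpc.server.worker.event-loop.threads";
  public static final String CLIENT_KEY = "raft.grpc.client.worker.event-loop.threads";

  private WorkerThreadSizing() {}

  /** Reads a key, treating a missing value as 0 (the documented default). */
  public static int configured(Properties p, String key) {
    return Integer.parseInt(p.getProperty(key, "0").trim());
  }

  /** A dedicated event loop group is built only when the configured value is > 0. */
  public static boolean useDedicatedGroup(int configured) {
    return configured > 0;
  }

  /**
   * Upper bound on worker threads: the configured cap when one is set,
   * otherwise the availableProcessors * 2 ceiling of the shared default group.
   */
  public static int maxWorkerThreads(int configured, int availableProcessors) {
    return useDedicatedGroup(configured) ? configured : availableProcessors * 2;
  }

  public static void main(String[] args) {
    Properties p = new Properties();
    p.setProperty(SERVER_KEY, "4"); // cap the server side; client key left unset
    int cores = Runtime.getRuntime().availableProcessors();
    System.out.println("server max workers: "
        + maxWorkerThreads(configured(p, SERVER_KEY), cores));
    System.out.println("client max workers: "
        + maxWorkerThreads(configured(p, CLIENT_KEY), cores));
  }
}
```

With the server key set to ``4`` on a 16-core host, the server side is bounded at 4 worker threads, while the unset client key leaves the client at the default ceiling of 32.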
   
   ## Test plan
   
   - [x] New unit test ``TestGrpcEventLoops`` covers the helper (thread count, 
channel-type detection, config-key roundtrip, null shutdown).
   - [x] New integration test ``TestGrpcWorkerEventLoopThreads`` brings up a 
3-node ``MiniRaftClusterWithGrpc`` with capped server (2) and client (1) worker 
threads and asserts a client request succeeds.
   - [x] Existing ``TestCustomGrpcServices``, ``TestGrpcFactory``, 
``TestLeaderInstallSnapshotWithGrpc``, ``TestLinearizableReadWithGrpc``, 
``TestGroupInfoWithGrpc`` all pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
