lhotari opened a new pull request, #21496:
URL: https://github.com/apache/pulsar/pull/21496

   ### Motivation
   
   PulsarService shutdown leaks threads in tests when execution is slow enough 
that the thread gets interrupted before the second phase of the shutdown is 
executed. It is hard to reproduce this issue on an ordinary developer machine 
because execution completes so quickly that the path where the thread has 
already been interrupted doesn't happen. In CI, there's 2 cores and 4 test 
processes running concurrently and the problem occurs fairly often. For 
example, the thread leak detector reported that ServerCnxTest leaked 144 
threads [in this 
build](https://github.com/apache/pulsar/actions/runs/6710630191/job/18242485629#step:16:201):
   ```
   Tests in class ServerCnxTest created thread id 10950 with name 
'Thread[broker-topic-workers-OrderedExecutor-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 10951 with name 
'Thread[broker-topic-workers-OrderedExecutor-1-0,5,main]'
   Tests in class ServerCnxTest created thread id 10952 with name 
'Thread[pulsar-stats-updater-OrderedScheduler-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 10953 with name 
'Thread[pulsar-inactivity-monitor-OrderedScheduler-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 10954 with name 
'Thread[pulsar-msg-expiry-monitor-OrderedScheduler-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 10955 with name 
'Thread[pulsar-compaction-monitor-OrderedScheduler-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 10956 with name 
'Thread[pulsar-consumed-ledgers-monitor-OrderedScheduler-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 10957 with name 
'Thread[pulsar-backlog-quota-checker-OrderedScheduler-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 11122 with name 
'Thread[broker-topic-workers-OrderedExecutor-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 11123 with name 
'Thread[broker-topic-workers-OrderedExecutor-1-0,5,main]'
   Tests in class ServerCnxTest created thread id 11124 with name 
'Thread[pulsar-stats-updater-OrderedScheduler-0-0,5,main]'
   Tests in class ServerCnxTest created thread id 11125 with name 
'Thread[pulsar-inactivity-monitor-OrderedScheduler-0-0,5,main]'
   ...
   ...
   ...
   Warning: Summary: Tests in class 
org.apache.pulsar.broker.service.ServerCnxTest created 144 new threads. There 
are now 154 threads in total.
   ```
   
   ### Modifications
   
   - Adding the future cancellation timeout logic doesn't make sense when 
`brokerShutdownTimeoutMs=0`. Skip adding the timeout logic and refactor the 
method. 
   - Running the second phase on the completion thread blocks a Netty thread. 
It's better to run the 2nd phase of the shutdown in a new separate thread.
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update 
later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to