lhotari opened a new pull request #14538:
URL: https://github.com/apache/pulsar/pull/14538


   Fixes #13964
   
   ### Motivation
   
   Closing a Pulsar client sometimes blocks or leaves unclosed resources behind in CI, which causes builds to time out. Similar problems could happen in production code: closing the client could block or leave resources behind if closing a producer or a consumer completes with an exception.
   
   See #13964 for the issue report about closing getting blocked.
   
   Examples:
   
   PulsarClientImpl.shutdownEventLoopGroup stuck:
   
   [thread dump from stalled build job](https://github.com/apache/pulsar/runs/5389016985?check_suite_focus=true#step:12:1341)
   ```
   "main" #1 prio=5 os_prio=0 cpu=49933.58ms elapsed=347.19s tid=0x00007f86b4028000 nid=0x1f5ee in Object.wait()  [0x00007f86bbfb2000]
      java.lang.Thread.State: WAITING (on object monitor)
           at java.lang.Object.wait([email protected]/Native Method)
           - waiting on <no object reference available>
           at java.lang.Object.wait([email protected]/Object.java:328)
           at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:253)
           - waiting to re-lock in wait() <0x00000000d4c60d18> (a io.netty.util.concurrent.DefaultPromise)
           at io.netty.util.concurrent.DefaultPromise.get(DefaultPromise.java:337)
           at org.apache.pulsar.client.impl.PulsarClientImpl.shutdownEventLoopGroup(PulsarClientImpl.java:821)
           at org.apache.pulsar.client.impl.PulsarClientImpl.shutdown(PulsarClientImpl.java:767)
           at org.apache.pulsar.broker.auth.MockedPulsarServiceBaseTest.internalCleanup(MockedPulsarServiceBaseTest.java:203)
           at org.apache.pulsar.broker.intercept.BrokerInterceptorTest.teardown(BrokerInterceptorTest.java:99)
   ```
   
   
   ### Modifications
   
   - Don't terminate the closing sequence when exceptions occur while closing producers and consumers
     - Log exceptions that occur while closing producers and consumers
   - Add timeout handling when waiting for the closing of producers and consumers to complete
   - Add a timeout for shutting down the `eventLoopGroup`
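
The modifications above could be sketched roughly as follows. This is not the code from this PR: `CloseSketch`, `closeAll` and `shutdownWithTimeout` are hypothetical names, the closers stand in for the `closeAsync()` calls of producers and consumers, and a plain JDK `ExecutorService` stands in for the Netty `eventLoopGroup`. It only illustrates the pattern of logging close failures, continuing with the remaining resources, and bounding every wait with a timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class CloseSketch {

    /**
     * Starts every close operation, logs failures instead of propagating them,
     * and waits at most the given timeout for all of them to finish.
     * Returns false if the wait timed out or was interrupted, true otherwise
     * (a close that failed but finished still counts as completed).
     */
    static boolean closeAll(List<Supplier<CompletableFuture<Void>>> closers,
                            long timeout, TimeUnit unit) {
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (Supplier<CompletableFuture<Void>> closer : closers) {
            CompletableFuture<Void> future;
            try {
                future = closer.get();
            } catch (RuntimeException e) {
                // The close call itself threw; turn that into a failed future
                // so the remaining resources still get closed.
                future = new CompletableFuture<>();
                future.completeExceptionally(e);
            }
            // Log and swallow the failure so one bad resource cannot
            // terminate the closing sequence.
            futures.add(future.exceptionally(t -> {
                System.err.println("Ignoring failure while closing: " + t);
                return null;
            }));
        }
        try {
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
                    .get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            System.err.println("Timed out waiting for resources to close");
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // Not expected, because failures are swallowed above.
            return false;
        }
    }

    /** Shuts the executor down and waits a bounded time for termination. */
    static boolean shutdownWithTimeout(ExecutorService executor,
                                       long timeout, TimeUnit unit) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, a caller such as a test teardown can decide what to do when either method returns `false` (for example, log a warning and continue) instead of blocking indefinitely.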

