Technoboy- opened a new pull request, #17689:
URL: https://github.com/apache/pulsar/pull/17689

   Cherry-pick #15755
   Master issue #15643, #15753
   
   ### Motivation
   
   
   Blocked at BrokerService#unloadNamespaceBundlesGracefully:
   ```
   2022-05-20T03:37:05.4960249Z "main" #1 prio=5 os_prio=0 cpu=32274.29ms elapsed=2566.54s tid=0x00007fd108024380 nid=0x1af8f waiting on condition  [0x00007fd10fcd0000]
   2022-05-20T03:37:05.4960659Z    java.lang.Thread.State: WAITING (parking)
   2022-05-20T03:37:05.4961114Z         at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
   2022-05-20T03:37:05.4961875Z         - parking to wait for  <0x00000000cdf00010> (a java.util.concurrent.CompletableFuture$Signaller)
   2022-05-20T03:37:05.4962343Z         at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
   2022-05-20T03:37:05.4963171Z         at java.util.concurrent.CompletableFuture$Signaller.block([email protected]/CompletableFuture.java:1864)
   2022-05-20T03:37:05.4963683Z         at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3463)
   2022-05-20T03:37:05.4964169Z         at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3434)
   2022-05-20T03:37:05.4964660Z         at java.util.concurrent.CompletableFuture.waitingGet([email protected]/CompletableFuture.java:1898)
   2022-05-20T03:37:05.4965158Z         at java.util.concurrent.CompletableFuture.get([email protected]/CompletableFuture.java:2072)
   2022-05-20T03:37:05.4965715Z         at org.apache.pulsar.broker.service.BrokerService.lambda$unloadNamespaceBundlesGracefully$21(BrokerService.java:919)
   2022-05-20T03:37:05.4966467Z         at org.apache.pulsar.broker.service.BrokerService$$Lambda$1164/0x0000000801527c70.accept(Unknown Source)
   2022-05-20T03:37:05.4966882Z         at java.lang.Iterable.forEach([email protected]/Iterable.java:75)
   2022-05-20T03:37:05.4967408Z         at org.apache.pulsar.broker.service.BrokerService.unloadNamespaceBundlesGracefully(BrokerService.java:911)
   2022-05-20T03:37:05.4968078Z         at org.apache.pulsar.broker.service.BrokerService.unloadNamespaceBundlesGracefully(BrokerService.java:887)
   2022-05-20T03:37:05.4968664Z         at org.apache.pulsar.broker.service.BrokerService.closeAsync(BrokerService.java:732)
   2022-05-20T03:37:05.4969579Z         at org.apache.pulsar.broker.PulsarService.closeAsync(PulsarService.java:450)
   2022-05-20T03:37:05.4970123Z         at org.apache.pulsar.broker.PulsarService.close(PulsarService.java:372)
   2022-05-20T03:37:05.4970720Z         at 
   ```
   
   Blocked at CoordinationServiceImpl#close:
   ```
   2022-05-20T01:17:56.3359346Z "main" #1 prio=5 os_prio=0 cpu=11209.07ms elapsed=3506.06s tid=0x00007f9484024380 nid=0xaba waiting on condition  [0x00007f9489edd000]
   2022-05-20T01:17:56.3361587Z    java.lang.Thread.State: WAITING (parking)
   2022-05-20T01:17:56.3363789Z         at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
   2022-05-20T01:17:56.3366545Z         - parking to wait for  <0x00000000cd180010> (a java.util.concurrent.CompletableFuture$Signaller)
   2022-05-20T01:17:56.3368917Z         at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
   2022-05-20T01:17:56.3371298Z         at java.util.concurrent.CompletableFuture$Signaller.block([email protected]/CompletableFuture.java:1864)
   2022-05-20T01:17:56.3373823Z         at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3463)
   2022-05-20T01:17:56.3376212Z         at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3434)
   2022-05-20T01:17:56.3378608Z         at java.util.concurrent.CompletableFuture.waitingGet([email protected]/CompletableFuture.java:1898)
   2022-05-20T01:17:56.3380999Z         at java.util.concurrent.CompletableFuture.join([email protected]/CompletableFuture.java:2117)
   2022-05-20T01:17:56.3383947Z         at org.apache.pulsar.metadata.coordination.impl.CoordinationServiceImpl.close(CoordinationServiceImpl.java:72)
   2022-05-20T01:17:56.3386574Z         at org.apache.pulsar.broker.PulsarService.closeAsync(PulsarService.java:526)
   2022-05-20T01:17:56.3388569Z         at org.apache.pulsar.broker.PulsarService.close(PulsarService.java:372)
   ```
   
   For BrokerService#unloadNamespaceBundlesGracefully, the request chain is:
   ```
   brokerService.closeAsync() -> OwnedBundle.handleUnloadRequest ->
   pulsar.getNamespaceService().getOwnershipCache().removeOwnership(bundle) ->
   OwnershipCache.removeOwnership -> ResourceLock.release
   ```
   
   For CoordinationServiceImpl#close, the request chain is:
   ```
   CoordinationServiceImpl.close -> LockManager.asyncClose -> ResourceLock.release
   ```
   Both chains end at ResourceLock#release, so both hangs are related to it.
   
   CI uses MockZooKeeper. I found that if a RuntimeException is thrown there,
   the response never completes, so I added catch blocks to ensure that every
   request gets a reply. I'm not sure whether the return code is right, though.
   
https://github.com/apache/pulsar/blob/3a8045851f7e9ea62da104dab2b7fe2b47a95ca9/testmocks/src/main/java/org/apache/zookeeper/MockZooKeeper.java#L332-L402
   
   
https://github.com/apache/pulsar/blob/3a8045851f7e9ea62da104dab2b7fe2b47a95ca9/testmocks/src/main/java/org/apache/zookeeper/MockZooKeeper.java#L916-L976
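
To illustrate the failure mode described above, here is a minimal sketch. It is not the actual MockZooKeeper code: the class `MockCallbackSketch`, the method names, and the result codes `OK` and `SYSTEM_ERROR` are all hypothetical stand-ins. It only shows why an uncaught RuntimeException inside an async handler leaves the caller's future incomplete, and how a catch block guarantees a reply.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.IntConsumer;

public class MockCallbackSketch {
    // Hypothetical result codes, standing in for real ZooKeeper result codes.
    static final int OK = 0;
    static final int SYSTEM_ERROR = -1;

    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // Hypothetical async operation: runs `work` on the executor, then reports a code.
    void asyncOp(Runnable work, IntConsumer callback) {
        executor.execute(() -> {
            int rc;
            try {
                work.run();
                rc = OK;
            } catch (RuntimeException e) {
                // Without this catch, the callback below would never run and
                // any future waiting on it would never complete.
                rc = SYSTEM_ERROR;
            }
            callback.accept(rc);
        });
    }

    // Adapts the callback-style operation to a CompletableFuture.
    CompletableFuture<Integer> opAsFuture(Runnable work) {
        CompletableFuture<Integer> future = new CompletableFuture<>();
        asyncOp(work, rc -> future.complete(rc));
        return future;
    }

    void shutdown() {
        executor.shutdown();
    }

    public static void main(String[] args) throws Exception {
        MockCallbackSketch mock = new MockCallbackSketch();
        try {
            int ok = mock.opAsFuture(() -> { }).get(5, TimeUnit.SECONDS);
            int failed = mock.opAsFuture(() -> {
                throw new IllegalStateException("simulated failure");
            }).get(5, TimeUnit.SECONDS);
            System.out.println("ok=" + ok + " failed=" + failed);
        } finally {
            mock.shutdown();
        }
    }
}
```

Which code to report in the catch branch is exactly the open question raised above; `SYSTEM_ERROR` here is only a placeholder.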
   
   
   In addition, the current close process has an ordering issue. LoadManager is
   closed before BrokerService, but closing BrokerService needs to invoke
   LoadManager. Even though LoadManager is stateless, this is a little confusing.
   
   
https://github.com/apache/pulsar/blob/3a8045851f7e9ea62da104dab2b7fe2b47a95ca9/pulsar-broker/src/main/java/org/apache/pulsar/broker/PulsarService.java#L443-L452
   
   
https://github.com/apache/pulsar/blob/3a8045851f7e9ea62da104dab2b7fe2b47a95ca9/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BrokerService.java#L891-L902
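
The ordering concern can be sketched as follows. This is an illustration only, not PulsarService code: `CloseOrderSketch` and its methods are hypothetical, and the two lambdas merely stand in for LoadManager and BrokerService. The idea is that components are closed in reverse registration order, so a component that depends on another is closed first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class CloseOrderSketch {
    // Components in start order; push/pop gives last-started, first-closed.
    private final Deque<AutoCloseable> started = new ArrayDeque<>();

    void register(AutoCloseable component) {
        started.push(component);
    }

    void closeAll() throws Exception {
        while (!started.isEmpty()) {
            started.pop().close();
        }
    }

    // Registers a stand-in load manager, then a stand-in broker service
    // whose close() requires the load manager to still be open.
    static List<String> demo() throws Exception {
        List<String> log = new ArrayList<>();
        boolean[] loadManagerOpen = {true};
        CloseOrderSketch lifecycle = new CloseOrderSketch();
        lifecycle.register(() -> {
            loadManagerOpen[0] = false;
            log.add("loadManager closed");
        });
        lifecycle.register(() -> {
            if (!loadManagerOpen[0]) {
                throw new IllegalStateException("load manager already closed");
            }
            log.add("brokerService closed");
        });
        lifecycle.closeAll();
        return log;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Whether the real close sequence should be reordered this way is a design decision for the broker; the sketch only makes the dependency explicit.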
    
   ### Documentation
   
   - [x] `no-need-doc` 
   (Please explain why)
   

