rdhabalia commented on PR #22841:
URL: https://github.com/apache/pulsar/pull/22841#issuecomment-2148514652
Hello @merlimat

> I don't think this is caused by having a single-thread processing the metadata-store callbacks (which should be typically solved by avoiding mixing sync/async calls).

That's correct, but such mixing is hard to rule out when it arrives as part of a regression or new development. Right now it happens because of a sync call in `BookieRackAffinityMapping`, a blocking call I tried to fix in #22842 and #22843. The issue can also be mitigated by having multiple metadata-callback threads, and I simulated this in a unit test as well: with a single metadata callback thread, the test deadlocks and fails.

> It would be useful to see which thread is holding that lock:
> - parking to wait for <0x00001000f838ac60> (a java.util.concurrent.CompletableFuture$Signaller)

No thread is holding `0x00001000f838ac60`; it is the `CompletableFuture` that [BookieRackAffinityMapping](https://github.com/apache/pulsar/blob/branch-3.3/pulsar-broker-common/src/main/java/org/apache/pulsar/bookie/rackawareness/BookieRackAffinityMapping.java#L122) is waiting on to complete. Please see the [full jstack for more info](https://gist.github.com/rdhabalia/f50c7b7b86a78427194603072d3b7666):

```
at jdk.internal.misc.Unsafe.park(java.base@.../Native Method)
- parking to wait for <0x00001000f838ac60> (a java.util.concurrent.CompletableFuture$Signaller)
at java.util.concurrent.locks.LockSupport.park(java.base@.../LockSupport.java:194)
at java.util.concurrent.CompletableFuture$Signaller.block(java.base@.../CompletableFuture.java:1796)
at java.util.concurrent.ForkJoinPool.managedBlock(java.base@.../ForkJoinPool.java:3128)
at java.util.concurrent.CompletableFuture.waitingGet(java.base@.../CompletableFuture.java:1823)
at java.util.concurrent.CompletableFuture.get(java.base@.../CompletableFuture.java:1998)
at org.apache.pulsar.bookie.rackawareness.BookieRackAffinityMapping.setConf(BookieRackAffinityMapping.java:121)
```

> In this case it looks just a deadlock in the BookieRackAffinityMapping code. The blocking of the metadata-callback thread is purely a side effect of the ManagedLedgerFactoryImpl map lock being held by the other thread.

Well, the overall issue is: the metadata-store callback thread is blocked on one topic creation, while another thread, blocked on a different topic creation, waits on the metadata result's `CompletableFuture`, which can only complete once the metadata-store callback thread is free again. That cycle is the deadlock.
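To make the cycle concrete, here is a minimal, self-contained sketch of the same pattern in plain Java (hypothetical names and stand-ins, not Pulsar code): one thread holds a `ConcurrentHashMap` reservation lock while blocking on a future that only a single-threaded executor can complete, and that executor is itself parked on the same reservation lock:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MetadataCallbackDeadlockDemo {

    public static void main(String[] args) throws Exception {
        // Stand-in for the single metadata-store callback thread.
        ExecutorService metadataThread = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "metadata-store-sim");
            t.setDaemon(true);
            return t;
        });
        // Stand-in for ManagedLedgerFactoryImpl's ledgers map.
        ConcurrentHashMap<String, String> ledgers = new ConcurrentHashMap<>();

        // "pulsar-io": enters computeIfAbsent and, while holding the
        // reservation lock for "topic-a", blocks on a future that only the
        // metadata thread can complete (the sync-over-async call in setConf).
        Thread pulsarIo = new Thread(() -> ledgers.computeIfAbsent("topic-a", k -> {
            CompletableFuture<String> rackInfo = new CompletableFuture<>();
            // "metadata-store": its first queued task needs the same map
            // entry, so it parks on the reservation lock we are holding...
            metadataThread.execute(() -> ledgers.computeIfAbsent("topic-a", key -> "ledger"));
            // ...and the task that would complete rackInfo is queued behind
            // it on the only metadata thread, so it never runs.
            metadataThread.execute(() -> rackInfo.complete("rack-0"));
            try {
                return rackInfo.get(); // parks forever -> deadlock
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }), "pulsar-io-sim");
        pulsarIo.setDaemon(true);
        pulsarIo.start();

        pulsarIo.join(5_000);
        System.out.println("still deadlocked after 5s: " + pulsarIo.isAlive()); // prints true
    }
}
```

With more than one thread behind the executor, the completion task can still run and the cycle never forms, which is exactly what this PR's mitigation relies on.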
The stack traces below show both sides of the cycle:

1. The `metadata-store` thread was trying to open a ledger and was blocked waiting for the `ManagedLedgerFactoryImpl` map entry lock (held by the `pulsar-io` thread in step 2):

```
"metadata-store-10-1" #25 prio=5 os_prio=0 cpu=7574.38ms elapsed=703.54s tid=0x00007fafdd884ed0 nid=0x37eb waiting for monitor entry [0x00007fae8f9d3000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(java.base@.../ConcurrentHashMap.java:1723)
    - waiting to lock <0x00001000f4a02bc0> (a java.util.concurrent.ConcurrentHashMap$ReservationNode)
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.asyncOpen(ManagedLedgerFactoryImpl.java:365)
```

2. At the same time, the `pulsar-io` thread was trying to create a new topic, which required `ManagedLedgerClientFactory` to create a new BK client. The BK client makes a blocking call to the metadata store to fetch rack info and waits on the metadata store's `CompletableFuture`, which is never completed because the metadata store has only one callback thread and that thread is stuck in step 1:

```
"pulsar-io-4-32" #166 prio=5 os_prio=0 cpu=429.42ms elapsed=701.77s tid=0x00007facc80314b0 nid=0x3881 waiting on condition [0x00007facc73f4000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@.../Native Method)
    - parking to wait for <0x00001000f838ac60> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.park(java.base@.../LockSupport.java:194)
    at java.util.concurrent.CompletableFuture$Signaller.block(java.base@.../CompletableFuture.java:1796)
    at java.util.concurrent.ForkJoinPool.managedBlock(java.base@.../ForkJoinPool.java:3128)
    at java.util.concurrent.CompletableFuture.waitingGet(java.base@.../CompletableFuture.java:1823)
    at java.util.concurrent.CompletableFuture.get(java.base@.../CompletableFuture.java:1998)
    at org.apache.pulsar.bookie.rackawareness.BookieRackAffinityMapping.setConf(BookieRackAffinityMapping.java:121)
    - locked <0x00001000f8389698> (a org.apache.pulsar.bookie.rackawareness.BookieRackAffinityMapping)
    at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.initialize(RackawareEnsemblePlacementPolicyImpl.java:265)
    at org.apache.pulsar.bookie.rackawareness.IsolatedBookieEnsemblePlacementPolicy.initialize(IsolatedBookieEnsemblePlacementPolicy.java:105)
    at org.apache.pulsar.bookie.rackawareness.IsolatedBookieEnsemblePlacementPolicy.initialize(IsolatedBookieEnsemblePlacementPolicy.java:51)
    at org.apache.bookkeeper.client.BookKeeper.initializeEnsemblePlacementPolicy(BookKeeper.java:581)
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:505)
    at org.apache.bookkeeper.client.BookKeeper$Builder.build(BookKeeper.java:306)
    at org.apache.pulsar.broker.BookKeeperClientFactoryImpl.create(BookKeeperClientFactoryImpl.java:87)
    at org.apache.pulsar.broker.ManagedLedgerClientFactory.lambda$initialize$0(ManagedLedgerClientFactory.java:95)
    at org.apache.pulsar.broker.ManagedLedgerClientFactory$$Lambda$1006/0x00007facb41b1440.apply(Unknown Source)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(java.base@.../ConcurrentHashMap.java:1705)
    - locked <0x00001000f3c03a28> (a java.util.concurrent.ConcurrentHashMap$ReservationNode)
    at org.apache.pulsar.broker.ManagedLedgerClientFactory.lambda$initialize$1(ManagedLedgerClientFactory.java:93)
    at org.apache.pulsar.broker.ManagedLedgerClientFactory$$Lambda$202/0x00007fae1e73e9b0.get(Unknown Source)
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.lambda$asyncOpen$6(ManagedLedgerFactoryImpl.java:369)
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl$$Lambda$660/0x00007facb91ecd60.apply(Unknown Source)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(java.base@.../ConcurrentHashMap.java:1705)
    - locked <0x00001000f4a02bc0> (a java.util.concurrent.ConcurrentHashMap$ReservationNode)
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.asyncOpen(ManagedLedgerFactoryImpl.java:365)
```

I have added a unit test that simulates this behavior. Our brokers kept running into this deadlock and becoming unresponsive; we deployed this quick fix and the issue has not occurred since. That is why I created this fix, to mitigate any future regression of this kind, and I have also fixed the sync/async mixing in the other PRs (the general shape of that remedy is sketched below). The attached stack traces are from a 2.10 broker, but the same happens on 3.3 as well.
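For completeness, this is the general shape of the sync/async remedy (a hedged sketch only, using a hypothetical `fetchRackInfo` stand-in, not the actual diff from #22842 / #22843): compose on the `CompletableFuture` instead of calling `get()` on a thread that may be needed to complete it, or at least bound the wait so a starved callback thread surfaces as a timeout instead of a permanent hang:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AsyncRackLookupSketch {

    // Hypothetical stand-in for the metadata-store read that
    // BookieRackAffinityMapping.setConf() currently blocks on.
    static CompletableFuture<String> fetchRackInfo(String bookie) {
        return CompletableFuture.completedFuture("rack-0");
    }

    // Anti-pattern: parks the calling thread; deadlocks if that thread is
    // the one the metadata store needs in order to complete the future.
    static String rackBlocking(String bookie) throws Exception {
        return fetchRackInfo(bookie).get();
    }

    // Remedy 1: stay async end-to-end and compose on the future.
    static CompletableFuture<Void> applyRackAsync(String bookie) {
        return fetchRackInfo(bookie)
                .thenAccept(rack -> System.out.println(bookie + " -> " + rack));
    }

    // Remedy 2: if a sync boundary is unavoidable (e.g. a legacy interface
    // like setConf), bound the wait so starvation becomes a visible timeout.
    static String rackWithTimeout(String bookie) throws Exception {
        try {
            return fetchRackInfo(bookie).get(30, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException("metadata lookup starved for " + bookie, e);
        }
    }

    public static void main(String[] args) throws Exception {
        applyRackAsync("bookie-1:3181").join();
        System.out.println(rackWithTimeout("bookie-2:3181"));
    }
}
```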
