rdhabalia commented on PR #22841:
URL: https://github.com/apache/pulsar/pull/22841#issuecomment-2148514652
Hello @merlimat

> I don't think this is caused by having a single-thread processing the metadata-store callbacks (which should be typically solved by avoiding mixing sync/async calls).

That's correct, but such mixing is hard to rule out when it arrives as part of a regression or new development. Right now it happens because of a sync call in `BookieRackAffinityMapping`, a blocking call I tried to fix in #22842 and #22843. The issue can also be mitigated by having multiple metadata-callback threads, and I simulated this in a unit test as well: with a single metadata callback thread, the test deadlocks and fails.

> It would be useful to see which thread is holding that lock:
> - parking to wait for <0x00001000f838ac60> (a java.util.concurrent.CompletableFuture$Signaller)

No thread is holding `0x00001000f838ac60`; it is the `CompletableFuture` that [BookieRackAffinityMapping](https://github.com/apache/pulsar/blob/branch-3.3/pulsar-broker-common/src/main/java/org/apache/pulsar/bookie/rackawareness/BookieRackAffinityMapping.java#L122) is waiting on to complete. Please see the [full jstack for more info](https://gist.github.com/rdhabalia/f50c7b7b86a78427194603072d3b7666):

```
at jdk.internal.misc.Unsafe.park(java.base@.../Native Method)
- parking to wait for <0x00001000f838ac60> (a java.util.concurrent.CompletableFuture$Signaller)
at java.util.concurrent.locks.LockSupport.park(java.base@.../LockSupport.java:194)
at java.util.concurrent.CompletableFuture$Signaller.block(java.base@.../CompletableFuture.java:1796)
at java.util.concurrent.ForkJoinPool.managedBlock(java.base@.../ForkJoinPool.java:3128)
at java.util.concurrent.CompletableFuture.waitingGet(java.base@.../CompletableFuture.java:1823)
at java.util.concurrent.CompletableFuture.get(java.base@.../CompletableFuture.java:1998)
at org.apache.pulsar.bookie.rackawareness.BookieRackAffinityMapping.setConf(BookieRackAffinityMapping.java:121)
```

> In this case it looks just a deadlock in the BookieRackAffinityMapping code. The blocking of the metadata-callback thread is purely a side effect of the ManagedLedgerFactoryImpl map lock being held by the other thread.

Well, the overall issue is: the metadata-store callback thread is blocked on one topic creation, while another thread, blocked on a different topic creation, waits on the metadata result's `CompletableFuture`, which can only complete once the metadata-store callback thread is free again. That cycle is the deadlock.
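To make the cycle concrete, here is a minimal, self-contained sketch of the same pattern in plain Java (hypothetical names and stand-ins, not Pulsar code): one thread holds a `ConcurrentHashMap` reservation lock while blocking on a future that only a single-threaded executor can complete, and that executor is itself parked on the same reservation lock:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MetadataCallbackDeadlockDemo {

    public static void main(String[] args) throws Exception {
        // Stand-in for the single metadata-store callback thread.
        ExecutorService metadataThread = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "metadata-store-sim");
            t.setDaemon(true);
            return t;
        });
        // Stand-in for ManagedLedgerFactoryImpl's ledgers map.
        ConcurrentHashMap<String, String> ledgers = new ConcurrentHashMap<>();

        // "pulsar-io": enters computeIfAbsent and, while holding the
        // reservation lock for "topic-a", blocks on a future that only the
        // metadata thread can complete (the sync-over-async call in setConf).
        Thread pulsarIo = new Thread(() -> ledgers.computeIfAbsent("topic-a", k -> {
            CompletableFuture<String> rackInfo = new CompletableFuture<>();
            // "metadata-store": its first queued task needs the same map
            // entry, so it parks on the reservation lock we are holding...
            metadataThread.execute(() -> ledgers.computeIfAbsent("topic-a", key -> "ledger"));
            // ...and the task that would complete rackInfo is queued behind
            // it on the only metadata thread, so it never runs.
            metadataThread.execute(() -> rackInfo.complete("rack-0"));
            try {
                return rackInfo.get(); // parks forever -> deadlock
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }), "pulsar-io-sim");
        pulsarIo.setDaemon(true);
        pulsarIo.start();

        pulsarIo.join(5_000);
        System.out.println("still deadlocked after 5s: " + pulsarIo.isAlive()); // prints true
    }
}
```

With more than one thread behind the executor, the completion task can still run and the cycle never forms, which is exactly what this PR's mitigation relies on.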
The stack traces below show both sides of the cycle:

1. The `metadata-store` thread was trying to open a ledger and was blocked waiting for the `ManagedLedgerFactoryImpl` map entry lock (held by the `pulsar-io` thread in step 2):

```
"metadata-store-10-1" #25 prio=5 os_prio=0 cpu=7574.38ms elapsed=703.54s tid=0x00007fafdd884ed0 nid=0x37eb waiting for monitor entry [0x00007fae8f9d3000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(java.base@.../ConcurrentHashMap.java:1723)
    - waiting to lock <0x00001000f4a02bc0> (a java.util.concurrent.ConcurrentHashMap$ReservationNode)
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.asyncOpen(ManagedLedgerFactoryImpl.java:365)
```

2. At the same time, the `pulsar-io` thread was trying to create a new topic, which required `ManagedLedgerClientFactory` to create a new BK client. The BK client makes a blocking call to the metadata store to fetch rack info and waits on the metadata store's `CompletableFuture`, which is never completed because the metadata store has only one callback thread and that thread is stuck in step 1:

```
"pulsar-io-4-32" #166 prio=5 os_prio=0 cpu=429.42ms elapsed=701.77s tid=0x00007facc80314b0 nid=0x3881 waiting on condition [0x00007facc73f4000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@.../Native Method)
    - parking to wait for <0x00001000f838ac60> (a java.util.concurrent.CompletableFuture$Signaller)
    at java.util.concurrent.locks.LockSupport.park(java.base@.../LockSupport.java:194)
    at java.util.concurrent.CompletableFuture$Signaller.block(java.base@.../CompletableFuture.java:1796)
    at java.util.concurrent.ForkJoinPool.managedBlock(java.base@.../ForkJoinPool.java:3128)
    at java.util.concurrent.CompletableFuture.waitingGet(java.base@.../CompletableFuture.java:1823)
    at java.util.concurrent.CompletableFuture.get(java.base@.../CompletableFuture.java:1998)
    at org.apache.pulsar.bookie.rackawareness.BookieRackAffinityMapping.setConf(BookieRackAffinityMapping.java:121)
    - locked <0x00001000f8389698> (a org.apache.pulsar.bookie.rackawareness.BookieRackAffinityMapping)
    at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.initialize(RackawareEnsemblePlacementPolicyImpl.java:265)
    at org.apache.pulsar.bookie.rackawareness.IsolatedBookieEnsemblePlacementPolicy.initialize(IsolatedBookieEnsemblePlacementPolicy.java:105)
    at org.apache.pulsar.bookie.rackawareness.IsolatedBookieEnsemblePlacementPolicy.initialize(IsolatedBookieEnsemblePlacementPolicy.java:51)
    at org.apache.bookkeeper.client.BookKeeper.initializeEnsemblePlacementPolicy(BookKeeper.java:581)
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:505)
    at org.apache.bookkeeper.client.BookKeeper$Builder.build(BookKeeper.java:306)
    at org.apache.pulsar.broker.BookKeeperClientFactoryImpl.create(BookKeeperClientFactoryImpl.java:87)
    at org.apache.pulsar.broker.ManagedLedgerClientFactory.lambda$initialize$0(ManagedLedgerClientFactory.java:95)
    at org.apache.pulsar.broker.ManagedLedgerClientFactory$$Lambda$1006/0x00007facb41b1440.apply(Unknown Source)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(java.base@.../ConcurrentHashMap.java:1705)
    - locked <0x00001000f3c03a28> (a java.util.concurrent.ConcurrentHashMap$ReservationNode)
    at org.apache.pulsar.broker.ManagedLedgerClientFactory.lambda$initialize$1(ManagedLedgerClientFactory.java:93)
    at org.apache.pulsar.broker.ManagedLedgerClientFactory$$Lambda$202/0x00007fae1e73e9b0.get(Unknown Source)
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.lambda$asyncOpen$6(ManagedLedgerFactoryImpl.java:369)
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl$$Lambda$660/0x00007facb91ecd60.apply(Unknown Source)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(java.base@.../ConcurrentHashMap.java:1705)
    - locked <0x00001000f4a02bc0> (a java.util.concurrent.ConcurrentHashMap$ReservationNode)
    at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.asyncOpen(ManagedLedgerFactoryImpl.java:365)
```

I have added a unit test that simulates this behavior. Our brokers kept running into this deadlock and becoming unresponsive; we deployed this quick fix and the issue has not occurred since. That is why I created this fix, to mitigate any future regression of this kind, and I have also fixed the sync/async mixing in the other PRs (the general shape of that remedy is sketched below). The attached stack traces are from a 2.10 broker, but the same happens on 3.3 as well.
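For completeness, this is the general shape of the sync/async remedy (a hedged sketch only, using a hypothetical `fetchRackInfo` stand-in, not the actual diff from #22842 / #22843): compose on the `CompletableFuture` instead of calling `get()` on a thread that may be needed to complete it, or at least bound the wait so a starved callback thread surfaces as a timeout instead of a permanent hang:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AsyncRackLookupSketch {

    // Hypothetical stand-in for the metadata-store read that
    // BookieRackAffinityMapping.setConf() currently blocks on.
    static CompletableFuture<String> fetchRackInfo(String bookie) {
        return CompletableFuture.completedFuture("rack-0");
    }

    // Anti-pattern: parks the calling thread; deadlocks if that thread is
    // the one the metadata store needs in order to complete the future.
    static String rackBlocking(String bookie) throws Exception {
        return fetchRackInfo(bookie).get();
    }

    // Remedy 1: stay async end-to-end and compose on the future.
    static CompletableFuture<Void> applyRackAsync(String bookie) {
        return fetchRackInfo(bookie)
                .thenAccept(rack -> System.out.println(bookie + " -> " + rack));
    }

    // Remedy 2: if a sync boundary is unavoidable (e.g. a legacy interface
    // like setConf), bound the wait so starvation becomes a visible timeout.
    static String rackWithTimeout(String bookie) throws Exception {
        try {
            return fetchRackInfo(bookie).get(30, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException("metadata lookup starved for " + bookie, e);
        }
    }

    public static void main(String[] args) throws Exception {
        applyRackAsync("bookie-1:3181").join();
        System.out.println(rackWithTimeout("bookie-2:3181"));
    }
}
```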
