codelipenghui opened a new pull request #13744:
URL: https://github.com/apache/pulsar/pull/13744
### Motivation
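Fixes #13736: a deadlock that occurs when a ZooKeeper client thread is used to create a ledger.
In the thread dump below, the ZooKeeper `SendThread` holds the `LinkedBlockingQueue` monitor in `EventThread.queuePacket` and runs the callback inline. The callback reaches `ManagedLedgerImpl.ledgerClosed`, which calls `asyncCreateLedger` and then blocks in `ZooKeeper.create` waiting for the `ZooKeeper$States` monitor. At the same time, `PullMessageThread_167` holds the `ZooKeeper$States` monitor in `ClientCnxn.queuePacket` and waits for the same `LinkedBlockingQueue` monitor.
The details of the deadlock: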
```
Found one Java-level deadlock:
=============================
"ZKC-connect-executor-0-SendThread(9.142.172.233:2181)":
waiting to lock monitor 0x00007fa3fc033bd8 (object 0x00007fad5e804158, a org.apache.zookeeper.ZooKeeper$States),
which is held by "PullMessageThread_167"
"PullMessageThread_167":
waiting to lock monitor 0x00007fbe40026668 (object 0x00007fb596023c08, a java.util.concurrent.LinkedBlockingQueue),
which is held by "ZKC-connect-executor-0-SendThread(9.142.172.233:2181)"
Java stack information for the threads listed above:
===================================================
"ZKC-connect-executor-0-SendThread(9.142.172.233:2181)":
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1678)
- waiting to lock <0x00007fad5e804158> (a org.apache.zookeeper.ZooKeeper$States)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1649)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1905)
at org.apache.bookkeeper.zookeeper.ZooKeeperClient$10.zkRun(ZooKeeperClient.java:752)
at org.apache.bookkeeper.zookeeper.ZooKeeperClient$ZkRetryRunnable.run(ZooKeeperClient.java:392)
at org.apache.bookkeeper.zookeeper.ZooKeeperClient.create(ZooKeeperClient.java:762)
at org.apache.bookkeeper.util.ZkUtils.asyncCreateFullPathOptimistic(ZkUtils.java:75)
at org.apache.bookkeeper.meta.ZkLedgerIdGenerator.generateLedgerIdImpl(ZkLedgerIdGenerator.java:78)
at org.apache.bookkeeper.meta.ZkLedgerIdGenerator.generateLedgerId(ZkLedgerIdGenerator.java:73)
at org.apache.bookkeeper.meta.LongZkLedgerIdGenerator.generateLedgerId(LongZkLedgerIdGenerator.java:301)
at org.apache.bookkeeper.client.LedgerCreateOp.generateLedgerIdAndCreateLedger(LedgerCreateOp.java:194)
at org.apache.bookkeeper.client.LedgerCreateOp.initiate(LedgerCreateOp.java:182)
at org.apache.bookkeeper.client.BookKeeper.asyncCreateLedger(BookKeeper.java:860)
at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncCreateLedger(ManagedLedgerImpl.java:3645)
at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.createLedgerAfterClosed(ManagedLedgerImpl.java:1596)
- locked <0x00007fadbdfa7b90> (a org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl)
at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.ledgerClosed(ManagedLedgerImpl.java:1587)
- locked <0x00007fadbdfa7b90> (a org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl)
at org.apache.bookkeeper.mledger.impl.OpAddEntry.closeComplete(OpAddEntry.java:236)
at org.apache.bookkeeper.client.LedgerHandle$5.lambda$safeRun$0(LedgerHandle.java:552)
at org.apache.bookkeeper.client.LedgerHandle$5$$Lambda$1557/731657311.accept(Unknown Source)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.bookkeeper.client.LedgerHandle$5.lambda$safeRun$3(LedgerHandle.java:614)
at org.apache.bookkeeper.client.LedgerHandle$5$$Lambda$1563/295431727.accept(Unknown Source)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.bookkeeper.client.MetadataUpdateLoop.lambda$writeLoop$1(MetadataUpdateLoop.java:161)
at org.apache.bookkeeper.client.MetadataUpdateLoop$$Lambda$1562/765900306.accept(Unknown Source)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.bookkeeper.meta.AbstractZkLedgerManager$4.processResult(AbstractZkLedgerManager.java:508)
at org.apache.bookkeeper.zookeeper.ZooKeeperClient$22$1.processResult(ZooKeeperClient.java:1094)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:638)
at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:541)
- locked <0x00007fb596023c08> (a java.util.concurrent.LinkedBlockingQueue)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:781)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:818)
at org.apache.zookeeper.ClientCnxn.access$2600(ClientCnxn.java:106)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1403)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanAndNotifyState(ClientCnxn.java:1331)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1309)
"PullMessageThread_167":
at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:537)
- waiting to lock <0x00007fb596023c08> (a java.util.concurrent.LinkedBlockingQueue)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:781)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:818)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1680)
- locked <0x00007fad5e804158> (a org.apache.zookeeper.ZooKeeper$States)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1649)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:2411)
at org.apache.bookkeeper.zookeeper.ZooKeeperClient$19.zkRun(ZooKeeperClient.java:1009)
at org.apache.bookkeeper.zookeeper.ZooKeeperClient$ZkRetryRunnable.run(ZooKeeperClient.java:392)
at org.apache.bookkeeper.zookeeper.ZooKeeperClient.getData(ZooKeeperClient.java:1019)
at org.apache.bookkeeper.meta.AbstractZkLedgerManager.readLedgerMetadata(AbstractZkLedgerManager.java:435)
at org.apache.bookkeeper.meta.AbstractZkLedgerManager.readLedgerMetadata(AbstractZkLedgerManager.java:430)
at org.apache.bookkeeper.meta.CleanupLedgerManager.readLedgerMetadata(CleanupLedgerManager.java:157)
at org.apache.bookkeeper.client.LedgerOpenOp.initiate(LedgerOpenOp.java:114)
at org.apache.bookkeeper.client.LedgerOpenOp$OpenBuilderImpl.open(LedgerOpenOp.java:269)
at org.apache.bookkeeper.client.LedgerOpenOp$OpenBuilderImpl.execute(LedgerOpenOp.java:247)
at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.lambda$getLedgerHandle$20(ManagedLedgerImpl.java:1773)
at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl$$Lambda$1195/876838597.apply(Unknown Source)
at org.apache.pulsar.common.util.collections.ConcurrentLongHashMap$Section.put(ConcurrentLongHashMap.java:287)
at org.apache.pulsar.common.util.collections.ConcurrentLongHashMap.computeIfAbsent(ConcurrentLongHashMap.java:135)
at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.getLedgerHandle(ManagedLedgerImpl.java:1742)
at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncReadEntry(ManagedLedgerImpl.java:1839)
at org.apache.bookkeeper.mledger.impl.OpFindNewest.find(OpFindNewest.java:147)
at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncFindPosition(ManagedLedgerImpl.java:1670)
at org.streamnative.pulsar.handlers.rocketmq.utils.MessageIdUtils.getPositionForOffset(MessageIdUtils.java:183)
at org.streamnative.pulsar.handlers.rocketmq.inner.RopServerCnx.getMessage(RopServerCnx.java:692)
at org.streamnative.pulsar.handlers.rocketmq.inner.processor.PullMessageProcessor.processRequest(PullMessageProcessor.java:359)
at org.streamnative.pulsar.handlers.rocketmq.inner.processor.PullMessageProcessor.processRequest(PullMessageProcessor.java:170)
at org.streamnative.pulsar.handlers.rocketmq.inner.proxy.RopBrokerProxy$PullMessageProcessorProxy.processRequest(RopBrokerProxy.java:1098)
at org.streamnative.pulsar.handlers.rocketmq.inner.NettyRemotingAbstract$1.run(NettyRemotingAbstract.java:202)
at org.apache.rocketmq.remoting.netty.RequestTask.run(RequestTask.java:80)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Found 1 deadlock.
```
### Modification
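Create the ledger on the managed ledger's executor instead of on the ZooKeeper thread that delivers the callback. The ZooKeeper `create` request is then issued from a thread that does not hold the ZooKeeper client's event queue lock, which avoids the deadlock.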
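The hand-off can be illustrated with a minimal, self-contained sketch. This is not the code changed in this PR: `ExecutorHandoffSketch`, `eventQueueLock`, `ledgerExecutor`, and `createLedger` are hypothetical stand-ins, and a plain Java monitor plays the role of the `LinkedBlockingQueue` lock held by the ZooKeeper thread in the dump above. The sketch only shows that work submitted to another executor does not run under the lock held by the submitting callback.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ExecutorHandoffSketch {
    // Hypothetical stand-in for the monitor the callback thread holds
    // (the LinkedBlockingQueue in the thread dump).
    private final Object eventQueueLock = new Object();

    // Hypothetical stand-in for the managed ledger's executor.
    private final ExecutorService ledgerExecutor;

    ExecutorHandoffSketch(ExecutorService ledgerExecutor) {
        this.ledgerExecutor = ledgerExecutor;
    }

    // Stand-in for the ledger creation step. It reports whether the thread
    // running it currently holds the callback's lock.
    private boolean createLedger() {
        return Thread.holdsLock(eventQueueLock);
    }

    // Before: the callback creates the ledger inline, while holding the lock.
    boolean createInline() {
        synchronized (eventQueueLock) {
            return createLedger();
        }
    }

    // After: the callback only schedules the creation on the executor.
    // The executor thread does not own eventQueueLock.
    CompletableFuture<Boolean> createOnExecutor() {
        synchronized (eventQueueLock) {
            return CompletableFuture.supplyAsync(this::createLedger, ledgerExecutor);
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            ExecutorHandoffSketch sketch = new ExecutorHandoffSketch(executor);
            System.out.println("inline, lock held: " + sketch.createInline());
            System.out.println("executor, lock held: " + sketch.createOnExecutor().join());
        } finally {
            executor.shutdown();
        }
    }
}
```

In the inline variant the creation step runs with the lock held, which is the precondition for the lock-order inversion seen in the dump. In the executor variant the creation step runs on a thread that does not own that lock, so it cannot contribute to the same cycle.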
### Documentation
Check the box below or label this PR directly (if you have committer
privilege).
Need to update docs?
- [ ] `doc-required`
(If you need help on updating docs, create a doc issue)
- [x] `no-need-doc`
(Please explain why)
- [ ] `doc`
(If this PR contains doc changes)