[
https://issues.apache.org/jira/browse/ARTEMIS-3037?focusedWorklogId=525491&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-525491
]
ASF GitHub Bot logged work on ARTEMIS-3037:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 17/Dec/20 11:03
Start Date: 17/Dec/20 11:03
Worklog Time Spent: 10m
Work Description: TomasHofman opened a new pull request #3385:
URL: https://github.com/apache/activemq-artemis/pull/3385
…e a thread hanging in WAITING state
Issue: https://issues.apache.org/jira/browse/ARTEMIS-3037
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 525491)
Remaining Estimate: 0h
Time Spent: 10m
> JournalImpl#checkKnownRecordID() implementation can leave a thread hanging in
> WAITING state
> -------------------------------------------------------------------------------------------
>
> Key: ARTEMIS-3037
> URL: https://issues.apache.org/jira/browse/ARTEMIS-3037
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.9.0, 2.16.0
> Reporter: Tomas Hofman
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The {{JournalImpl#checkKnownRecordID()}} implementation contains the following
> code:
> {code}
> final SimpleFuture<Boolean> known = new SimpleFutureImpl<>();
> // retry on the append thread. maybe the appender thread is not keeping up.
> appendExecutor.execute(new Runnable() {
>    @Override
>    public void run() {
>       journalLock.readLock().lock();
>       try {
>          known.set(records.containsKey(id)
>             || pendingRecords.contains(id)
>             || (compactor != null && compactor.containsRecord(id)));
>       } finally {
>          journalLock.readLock().unlock();
>       }
>    }
> });
> if (!known.get()) {
>    ...
> }
> {code}
> If the code in the Runnable throws an exception before the value of the
> {{known}} future is set, the calling thread is left blocked in {{known.get()}}
> in the WAITING state forever. Exception handling should be added that cancels
> the future when an exception occurs.
> We've observed cases where the following threads were left hanging while no
> other threads were operating inside JournalImpl. I believe that the
> {{JournalImpl#checkKnownRecordID()}} implementation may be responsible for
> that:
> {code}
> "Thread-16 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@423fe5c3)" #1078 prio=5 os_prio=64 tid=0x000000011c34a000 nid=0x4eb waiting on condition [0xfffffffabe9ad000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for <0xfffffffbe73c29e8> (a java.util.concurrent.CountDownLatch$Sync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>       at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>       at org.apache.activemq.artemis.utils.SimpleFutureImpl.get(SimpleFutureImpl.java:62)
>       at org.apache.activemq.artemis.core.journal.impl.JournalImpl.checkKnownRecordID(JournalImpl.java:1080)
>       at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendDeleteRecord(JournalImpl.java:950)
>       at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.confirmPendingLargeMessage(AbstractJournalStorageManager.java:361)
>       at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.confirmLargeMessageSend(PostOfficeImpl.java:1390)
>       - locked <0xfffffffbe73aa1b0> (a org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl)
>       at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.processRoute(PostOfficeImpl.java:1336)
>       at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.route(PostOfficeImpl.java:980)
>       at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.route(PostOfficeImpl.java:871)
>       at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.doSend(ServerSessionImpl.java:2045)
>       - locked <0xfffffffb19447fb8> (a org.apache.activemq.artemis.core.server.impl.ServerSessionImpl)
>       at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.doSend(ServerSessionImpl.java:1989)
>       - locked <0xfffffffb19447fb8> (a org.apache.activemq.artemis.core.server.impl.ServerSessionImpl)
>       at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.sendContinuations(ServerSessionPacketHandler.java:1034)
>       - locked <0xfffffffb1962b900> (a java.lang.Object)
>       at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.slowPacketHandler(ServerSessionPacketHandler.java:312)
>       at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.onMessagePacket(ServerSessionPacketHandler.java:285)
>       at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler$$Lambda$651/2097400985.onMessage(Unknown Source)
>       at org.apache.activemq.artemis.utils.actors.Actor.doTask(Actor.java:33)
>       at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
>       at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$413/494003142.run(Unknown Source)
>       at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
>       at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
>       at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
>       at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$413/494003142.run(Unknown Source)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java)
>       at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
>    Locked ownable synchronizers:
>       - <0xfffffffba1800ca0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {code}
> {code}
> "Thread-82 (ActiveMQ-IO-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$7@3bde9e44)" #2130 prio=5 os_prio=64 tid=0x000000017b6df800 nid=0x907 waiting for monitor entry [0xffffffff045de000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>       at org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl.getEncodeSize(LargeServerMessageImpl.java:178)
>       - waiting to lock <0xfffffffbe73aa1b0> (a org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl)
>       at org.apache.activemq.artemis.core.persistence.impl.journal.codec.LargeMessagePersister.getEncodeSize(LargeMessagePersister.java:59)
>       at org.apache.activemq.artemis.core.persistence.impl.journal.codec.LargeMessagePersister.getEncodeSize(LargeMessagePersister.java:25)
>       at org.apache.activemq.artemis.core.journal.impl.dataformat.JournalAddRecord.getEncodeSize(JournalAddRecord.java:79)
>       at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendRecord(JournalImpl.java:2792)
>       at org.apache.activemq.artemis.core.journal.impl.JournalImpl.access$100(JournalImpl.java:91)
>       at org.apache.activemq.artemis.core.journal.impl.JournalImpl$1.run(JournalImpl.java:850)
> {code}
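The exception handling suggested in the description could look roughly like the sketch below. It is not the actual patch from the pull request: it uses the JDK's CompletableFuture as a stand-in for Artemis's SimpleFuture so that it compiles on its own, and the class name, the checkOnExecutor helper and the BooleanSupplier lookup are hypothetical names introduced only for illustration. The point it shows is that the task always completes the future, with a value or with the failure, so a caller blocked in get() is released either way.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class KnownRecordCheckSketch {

    // Hypothetical helper: runs the lookup on the given executor and always
    // completes the returned future, either with the result or with the error.
    static CompletableFuture<Boolean> checkOnExecutor(ExecutorService appendExecutor,
                                                      BooleanSupplier lookup) {
        final CompletableFuture<Boolean> known = new CompletableFuture<>();
        appendExecutor.execute(() -> {
            try {
                known.complete(lookup.getAsBoolean());
            } catch (Throwable t) {
                // Without this branch the caller would wait on the future forever.
                known.completeExceptionally(t);
            }
        });
        return known;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService appendExecutor = Executors.newSingleThreadExecutor();
        try {
            // Normal case: the lookup returns a value.
            CompletableFuture<Boolean> ok = checkOnExecutor(appendExecutor, () -> true);
            System.out.println("known = " + ok.get(5, TimeUnit.SECONDS));

            // Failure case: the lookup throws, and the waiter sees the error
            // instead of hanging.
            CompletableFuture<Boolean> failed = checkOnExecutor(appendExecutor, () -> {
                throw new IllegalStateException("simulated lookup failure");
            });
            try {
                failed.get(5, TimeUnit.SECONDS);
            } catch (ExecutionException e) {
                System.out.println("lookup failed: " + e.getCause().getMessage());
            }
        } finally {
            appendExecutor.shutdown();
        }
    }
}
```

The bounded get(5, TimeUnit.SECONDS) calls are a choice made for this sketch only. They guard the waiting thread against a task that never gets to run at all, which handling exceptions inside the task cannot cover.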
--
This message was sent by Atlassian Jira
(v8.3.4#803005)