Tomas Hofman created ARTEMIS-3037:
-------------------------------------
Summary: JournalImpl#checkKnownRecordID() implementation can leave a thread hanging in WAITING state
Key: ARTEMIS-3037
URL: https://issues.apache.org/jira/browse/ARTEMIS-3037
Project: ActiveMQ Artemis
Issue Type: Bug
Components: Broker
Affects Versions: 2.16.0, 2.9.0
Reporter: Tomas Hofman
The {{JournalImpl#checkKnownRecordID()}} implementation contains the following code:
{code}
final SimpleFuture<Boolean> known = new SimpleFutureImpl<>();
// retry on the append thread. maybe the appender thread is not keeping up.
appendExecutor.execute(new Runnable() {
   @Override
   public void run() {
      journalLock.readLock().lock();
      try {
         known.set(records.containsKey(id)
            || pendingRecords.contains(id)
            || (compactor != null && compactor.containsRecord(id)));
      } finally {
         journalLock.readLock().unlock();
      }
   }
});
if (!known.get()) {
   ...
}
{code}
If the code in the Runnable fails with an exception before the {{known}} future
value is set, the calling thread blocked in {{known.get()}} is left in the
WAITING state forever. Exception handling should be added so that the future is
cancelled or failed when an exception occurs.
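
A minimal sketch of the proposed handling is shown below. It is an illustration only, not the actual Artemis patch: it uses the JDK's {{CompletableFuture}} as a stand-in for {{SimpleFuture}}, and the class name {{KnownRecordCheck}}, the method {{submitLookup}} and the {{lookup}} supplier are hypothetical names introduced for this example. The point is that the Runnable completes the future on every path, so the caller is released with an error instead of parking indefinitely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class KnownRecordCheck {

    // Hypothetical helper: runs the lookup on the executor and always
    // completes the future, either with the result or with the failure.
    static CompletableFuture<Boolean> submitLookup(ExecutorService appendExecutor,
                                                   Supplier<Boolean> lookup) {
        final CompletableFuture<Boolean> known = new CompletableFuture<>();
        appendExecutor.execute(() -> {
            try {
                known.complete(lookup.get());
            } catch (Throwable t) {
                // Without this branch the caller's get() would never return.
                known.completeExceptionally(t);
            }
        });
        return known;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService appendExecutor = Executors.newSingleThreadExecutor();
        try {
            // Normal path: the lookup result reaches the caller.
            System.out.println(submitLookup(appendExecutor, () -> true).get());

            // Failure path: the caller gets an ExecutionException instead of hanging.
            try {
                submitLookup(appendExecutor, () -> {
                    throw new IllegalStateException("lookup failed");
                }).get();
            } catch (ExecutionException e) {
                System.out.println("caller released: " + e.getCause().getMessage());
            }
        } finally {
            appendExecutor.shutdown();
        }
    }
}
```

In {{JournalImpl}} the equivalent change would presumably go inside the existing {{try}}/{{finally}} around the read lock; whether the caller should then treat the record as unknown or propagate the error is a separate design decision.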
We've observed cases where the following threads were left hanging while no other
threads were operating inside JournalImpl. I believe the
{{JournalImpl#checkKnownRecordID()}} implementation may be responsible for that:
{code}
"Thread-16 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@423fe5c3)" #1078 prio=5 os_prio=64 tid=0x000000011c34a000 nid=0x4eb waiting on condition [0xfffffffabe9ad000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0xfffffffbe73c29e8> (a java.util.concurrent.CountDownLatch$Sync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
    at org.apache.activemq.artemis.utils.SimpleFutureImpl.get(SimpleFutureImpl.java:62)
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl.checkKnownRecordID(JournalImpl.java:1080)
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendDeleteRecord(JournalImpl.java:950)
    at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.confirmPendingLargeMessage(AbstractJournalStorageManager.java:361)
    at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.confirmLargeMessageSend(PostOfficeImpl.java:1390)
    - locked <0xfffffffbe73aa1b0> (a org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl)
    at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.processRoute(PostOfficeImpl.java:1336)
    at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.route(PostOfficeImpl.java:980)
    at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.route(PostOfficeImpl.java:871)
    at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.doSend(ServerSessionImpl.java:2045)
    - locked <0xfffffffb19447fb8> (a org.apache.activemq.artemis.core.server.impl.ServerSessionImpl)
    at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.doSend(ServerSessionImpl.java:1989)
    - locked <0xfffffffb19447fb8> (a org.apache.activemq.artemis.core.server.impl.ServerSessionImpl)
    at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.sendContinuations(ServerSessionPacketHandler.java:1034)
    - locked <0xfffffffb1962b900> (a java.lang.Object)
    at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.slowPacketHandler(ServerSessionPacketHandler.java:312)
    at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler.onMessagePacket(ServerSessionPacketHandler.java:285)
    at org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler$$Lambda$651/2097400985.onMessage(Unknown Source)
    at org.apache.activemq.artemis.utils.actors.Actor.doTask(Actor.java:33)
    at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
    at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$413/494003142.run(Unknown Source)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
    at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
    at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
    at org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$413/494003142.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java)
    at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)

   Locked ownable synchronizers:
    - <0xfffffffba1800ca0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
{code}
{code}
"Thread-82 (ActiveMQ-IO-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$7@3bde9e44)" #2130 prio=5 os_prio=64 tid=0x000000017b6df800 nid=0x907 waiting for monitor entry [0xffffffff045de000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl.getEncodeSize(LargeServerMessageImpl.java:178)
    - waiting to lock <0xfffffffbe73aa1b0> (a org.apache.activemq.artemis.core.persistence.impl.journal.LargeServerMessageImpl)
    at org.apache.activemq.artemis.core.persistence.impl.journal.codec.LargeMessagePersister.getEncodeSize(LargeMessagePersister.java:59)
    at org.apache.activemq.artemis.core.persistence.impl.journal.codec.LargeMessagePersister.getEncodeSize(LargeMessagePersister.java:25)
    at org.apache.activemq.artemis.core.journal.impl.dataformat.JournalAddRecord.getEncodeSize(JournalAddRecord.java:79)
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendRecord(JournalImpl.java:2792)
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl.access$100(JournalImpl.java:91)
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl$1.run(JournalImpl.java:850)
{code}