[ https://issues.apache.org/jira/browse/ARTEMIS-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008035#comment-17008035 ]

Justin Bertram commented on ARTEMIS-2586:
-----------------------------------------

Are you acknowledging the message before you attempt to send it to the DLQ? If 
not, that may be what is triggering this situation. For what it's worth, the 
broker itself can handle sending messages to a DLQ, so you may be able to 
simplify your application here and eliminate this problem as well. You might 
also try eliminating producer flow control on the client by specifying 
{{producerWindowSize=-1}} on the client's URL.
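
For reference, broker-side dead-lettering is configured per address in 
{{broker.xml}}; a minimal sketch (the {{match}} pattern and the {{DLQ}} address 
name here are illustrative assumptions, not taken from your attached config):
{noformat}
<address-settings>
   <address-setting match="#">
      <!-- the broker routes messages that exhaust their delivery attempts to DLQ -->
      <dead-letter-address>DLQ</dead-letter-address>
      <max-delivery-attempts>3</max-delivery-attempts>
   </address-setting>
</address-settings>
{noformat}
Disabling producer flow control would just mean appending the parameter to the 
connection URL, e.g. {{tcp://localhost:61616?producerWindowSize=-1}} (host and 
port are placeholders).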

Aside from that, I think I'd need a reproducible test case to investigate 
further.

bq. Just used "AMQP" as component since the logged error has it in its name. 
You are right: we are using the core protocol.

From what I can tell there's no mention of AMQP in any of the attachments. 
Perhaps you're referring to message code "AMQ212054", which has the "AMQ" 
prefix (a 3-letter code referencing ActiveMQ)?

> Infinite Block in AMQ212054 after transient DB-error
> ----------------------------------------------------
>
>                 Key: ARTEMIS-2586
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2586
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: AMQP
>    Affects Versions: 2.10.1
>         Environment: This is Ubuntu 18.04 and Oracle DB, but I don't think 
> it's that relevant for the issue.
>            Reporter: Rico Neubauer
>            Priority: Major
>         Attachments: 2019-11-28_threaddump_01.txt, 
> 2019-12-04_threaddump_01.txt, Message-Counts.png, artemis.xml, 
> initial-error.txt, log-extract.txt, writerIndex-Credits.PNG
>
>
> Hi,
> I would like to describe a quite severe situation which was experienced in a 
> long-running test with 2 out of 3 instances/machines.
> We are running Karaf with Artemis 2.10.1.
> After some time (see screenshot), first one, then after a while a 2nd 
> instance came to a complete stop.
> Looking into the logs and thread-dumps revealed the following (same for both 
> instances):
>  # There was a temporary problem connecting to the DB ({{connection reset by 
> peer}} and {{Closed Connection}})
>  # This resulted (due to handling on our side) in an 
> {{IllegalStateException}}/{{Error during two phase commit}} being thrown back 
> to Artemis.
>  # After this, no messaging is possible at all anymore, and the following 
> log repeats:
> {noformat}
> AMQ212054: Destination address=DLQ is blocked. If the system is configured to 
> block make sure you consume messages on this configuration.{noformat}
> (system is not configured to block, see attached config)
>  which comes from threads like these, trying to obtain credits for sending:
>  
> {noformat}
> "Thread-93 (ActiveMQ-client-global-threads)" Id=2001 in TIMED_WAITING on 
> lock=java.util.concurrent.Semaphore$NonfairSync@1f9a57e0
>  at sun.misc.Unsafe.park(Native Method)
>  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1332)
>  at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:582)
>  at 
> org.apache.activemq.artemis.core.client.impl.ClientProducerCreditsImpl.actualAcquire(ClientProducerCreditsImpl.java:73)
>  at 
> org.apache.activemq.artemis.core.client.impl.AbstractProducerCreditsImpl.acquireCredits(AbstractProducerCreditsImpl.java:77)
>  at 
> org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.sendRegularMessage(ClientProducerImpl.java:301)
>  at 
> org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.doSend(ClientProducerImpl.java:275)
>  at 
> org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:128)
>  at 
> org.apache.activemq.artemis.jms.client.ActiveMQMessageProducer.doSendx(ActiveMQMessageProducer.java:485)
>  at 
> org.apache.activemq.artemis.jms.client.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:195)
>  at 
> com.seeburger.engine.jms.MessageReceiverBase.sendToDLQ(MessageReceiverBase.java:571)
>  at 
> com.seeburger.engine.jms.MessageReceiverBase.handleException(MessageReceiverBase.java:493)
>  at 
> com.seeburger.engine.jms.MessageReceiverBase.onMessage(MessageReceiverBase.java:387)
>  at 
> org.apache.activemq.artemis.jms.client.JMSMessageListenerWrapper.onMessage(JMSMessageListenerWrapper.java:110)
>  at 
> org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.callOnMessage(ClientConsumerImpl.java:1031)
>  at 
> org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.access$400(ClientConsumerImpl.java:50)
>  at 
> org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl$Runner.run(ClientConsumerImpl.java:1154)
>  at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42)
>  at 
> org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
>  at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:66)
>  at 
> org.apache.activemq.artemis.utils.actors.ProcessorBase$$Lambda$431/1769898766.run(Unknown
>  Source)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at 
> org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
> Locked synchronizers: count = 1
>  - java.util.concurrent.ThreadPoolExecutor$Worker@bc49fcf
> {noformat}
> which will never succeed, since the credits seem not to suffice (see 
> heap-dump screenshot).
> From my point of view, the thrown IllegalStateException should not lead to 
> the system going into this non-recoverable state. What do you think? Is 
> there something that can be enhanced?
>  
> [Fastthread-Link|https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjAvMDEvMy8tLTIwMTktMTItMDRfdGhyZWFkZHVtcF8wMS50eHQtLTEzLTM4LTE1OzstLTIwMTktMTEtMjhfdGhyZWFkZHVtcF8wMS50eHQtLTEzLTM4LTE1]
> In case it helps: The 2 instances are still in this state (since September) 
> and I can fetch additional information or debug them on request.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
