Christian Danner created ARTEMIS-3264:
-----------------------------------------

             Summary: Core to AMQP conversion error causes client disconnect
                 Key: ARTEMIS-3264
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3264
             Project: ActiveMQ Artemis
          Issue Type: Bug
          Components: AMQP, Broker
    Affects Versions: 2.17.0
         Environment: Embedded Apache Artemis 2.17.0
Windows Server 2016 Standard (10.0.14393)
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
            Reporter: Christian Danner
         Attachments: activemq_artemis.log

We are deploying a mesh of embedded brokers and by default use core bridges to 
replicate data between different broker instances / topics.

The clients that actually consume messages are connected using AMQP (QPID, 
AMQP.Net Lite).

Recently we encountered a situation where the broker could not deliver a 
message to a (Java QPID) client because the internal conversion from Core to 
AMQP failed (see attached log file).

As a result, the client was disconnected and no longer received any messages. 
It was stuck in a JMS receive call and was apparently not informed of the 
disconnect (not sure if this is a QPID/Proton issue). Even after a restart, the 
client was not able to connect to the server anymore! We had to restart the 
server to be able to connect again!

We are currently working around this issue by using AMQP (i.e. JMS) as the only 
client side protocol to avoid that Core-AMQP conversion happens in the first 
place.

However, I'm wondering if the way the broker deals with such errors is a good 
idea - it disconnects the client and keeps the message in the queue, so even 
after reconnect the delivery fails again with the same Exception!

Looking at the call stack (ending up in QueueImpl:3800), this kind of error is 
handled in a very generic way: the handler method does not distinguish between 
different types of Exceptions and knows nothing about the reason why delivery 
failed, yet it still defaults to disconnecting the corresponding client.

I think in the situation described above it would be necessary to forward the 
erroneous message to a DLQ instead and continue with the next message. 
Currently the message clogs the queue and needs to be deleted / moved manually 
in order for processing to continue.
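
To illustrate the behaviour I have in mind, here is a minimal, self-contained 
sketch. It is not Artemis code: DeliverySketch, ConversionException, Consumer 
and drain are hypothetical names made up for this example, and the queue and 
DLQ are plain Java collections. It only shows the idea of treating a 
message-specific conversion failure differently from other delivery errors.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch, not the broker's actual implementation.
final class DeliverySketch {

    /** Hypothetical marker for a failure caused by the message itself. */
    static class ConversionException extends RuntimeException {
        ConversionException(String reason) {
            super(reason);
        }
    }

    /** Hypothetical consumer abstraction; deliver() may throw. */
    interface Consumer {
        void deliver(String message);
    }

    enum Outcome { DRAINED, CONSUMER_DISCONNECTED }

    /**
     * Delivers messages in order. A message that fails conversion is moved
     * to the DLQ and delivery continues with the next message. Any other
     * failure keeps the message at the head of the queue and stops delivery,
     * which stands in for disconnecting the consumer.
     */
    static Outcome drain(Deque<String> queue, List<String> dlq, Consumer consumer) {
        while (!queue.isEmpty()) {
            String message = queue.peek();
            try {
                consumer.deliver(message);
                queue.poll(); // delivered, remove from the queue
            } catch (ConversionException e) {
                dlq.add(queue.poll()); // poison message, do not retry forever
            } catch (RuntimeException e) {
                return Outcome.CONSUMER_DISCONNECTED; // unknown cause, message stays queued
            }
        }
        return Outcome.DRAINED;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(Arrays.asList("a", "bad", "c"));
        List<String> dlq = new ArrayList<>();
        Outcome outcome = drain(queue, dlq, message -> {
            if (message.equals("bad")) {
                throw new ConversionException("cannot convert " + message);
            }
        });
        System.out.println(outcome + " dlq=" + dlq); // prints "DRAINED dlq=[bad]"
    }
}
```

With this kind of distinction the messages after the broken one would still 
reach the client, and only the broken message would need manual inspection.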



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
