[ 
https://issues.apache.org/jira/browse/ARTEMIS-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17331015#comment-17331015
 ] 

Christian Danner commented on ARTEMIS-3260:
-------------------------------------------

Thanks for looking into this issue so quickly! Attached is a small example 
client program that I distilled from our currently running configuration. 
Basically, various components initialize the JMS connection, session and 
consumers / producers based on proprietary configuration files - all 
resources are long-lived and only recreated in case non-recoverable 
exceptions occur.
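
For illustration, a minimal sketch of that initialization pattern follows (the 
class name, queue name and broker URI are hypothetical placeholders, not taken 
from the attached example):

{code:java}
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;

import org.apache.qpid.jms.JmsConnectionFactory;

// Hypothetical component: creates its long-lived JMS resources once and keeps
// them until a non-recoverable exception forces re-creation.
public class ExampleComponent {

    private Connection connection;
    private Session session;
    private MessageConsumer consumer;
    private MessageProducer producer;

    void init(String brokerUri, String queueName) throws JMSException {
        ConnectionFactory factory = new JmsConnectionFactory(brokerUri);
        connection = factory.createConnection();
        // Transacted session; the acknowledge mode is ignored in this case.
        session = connection.createSession(true, Session.SESSION_TRANSACTED);
        consumer = session.createConsumer(session.createQueue(queueName));
        producer = session.createProducer(null); // anonymous, destination per send
        connection.start();
    }
}
{code}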

The attached sample program represents the resulting client configuration that 
we are currently using and shows our overall message processing scheme (the 
real setup is a little more complex, but the example covers all the basic 
steps involved).

It may also be important to note that each broker connection is shared by 
multiple threads; however, each thread is guaranteed to use a dedicated 
session, and each queue is consumed by at most one thread / session. Consumer 
threads may publish messages as well, i.e. we forward messages to 
success / error addresses based on the outcome of our message handling logic.

In case an exception occurs for which we wish to process the message again, we 
roll back the transaction; otherwise it is committed, which means that the 
message was processed and can still be viewed / reprocessed via the 
corresponding success / error queue (see the sketch below).
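
A rough sketch of that per-thread receive / forward / commit cycle (the 
RecoverableException type, handle() and the destination names are hypothetical 
placeholders):

{code:java}
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

// Runs on a dedicated thread with its own session; the consumed queue is
// read by at most this one thread.
public class ConsumerLoop implements Runnable {

    static class RecoverableException extends Exception { }

    private final Session session;          // dedicated to this thread
    private final MessageConsumer consumer; // sole consumer of its queue
    private final MessageProducer producer; // anonymous producer for forwarding

    ConsumerLoop(Session session, MessageConsumer consumer,
                 MessageProducer producer) {
        this.session = session;
        this.consumer = consumer;
        this.producer = producer;
    }

    @Override
    public void run() {
        try {
            Queue success = session.createQueue("success.message"); // hypothetical
            Queue error = session.createQueue("error.message");     // hypothetical
            while (!Thread.currentThread().isInterrupted()) {
                Message message = consumer.receive();
                try {
                    handle(message);                 // application logic
                    producer.send(success, message); // forward, then commit
                    session.commit();
                } catch (RecoverableException e) {
                    session.rollback();              // redeliver and retry
                } catch (Exception e) {
                    producer.send(error, message);   // non-recoverable: error queue
                    session.commit();
                }
            }
        } catch (JMSException e) {
            // non-recoverable: caller recreates connection / session / consumers
        }
    }

    private void handle(Message message) throws RecoverableException {
        // message handling logic goes here
    }
}
{code}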

The success and error queues have an expiration time set, so messages published 
to those addresses are never consumed by our application; they simply expire.
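
For completeness, a sketch of how such an expiry delay could be configured 
programmatically on the embedded broker (the address matches and the delay 
value are made up, not our actual settings):

{code:java}
import org.apache.activemq.artemis.core.config.Configuration;
import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl;
import org.apache.activemq.artemis.core.settings.impl.AddressSettings;

public class ExpiryConfig {

    // Hypothetical matches / delay: messages on the success / error addresses
    // expire after 60 seconds and are never delivered to a consumer.
    static Configuration expiryConfig() {
        Configuration config = new ConfigurationImpl();
        config.addAddressesSetting("success.#",
                new AddressSettings().setExpiryDelay(60_000L));
        config.addAddressesSetting("error.#",
                new AddressSettings().setExpiryDelay(60_000L));
        return config;
    }
}
{code}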

 

Concerning the restarts: we use a custom Spring-based application container for 
which we always perform a graceful shutdown. All components are stopped in an 
orderly manner, including the embedded broker instance, which is stopped via 
ActiveMQServer.stop(true).
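
In code, the broker lifecycle boils down to something like the following 
sketch (the surrounding Spring lifecycle and the real configuration, which is 
built from the broker.xml template, are omitted):

{code:java}
import org.apache.activemq.artemis.core.config.Configuration;
import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl;
import org.apache.activemq.artemis.core.server.ActiveMQServer;
import org.apache.activemq.artemis.core.server.ActiveMQServers;

public class EmbeddedBroker {

    public static void main(String[] args) throws Exception {
        Configuration config = new ConfigurationImpl(); // placeholder config
        ActiveMQServer server = ActiveMQServers.newActiveMQServer(config);
        server.start();
        // ... container runs; on orderly shutdown of all components:
        server.stop(true); // graceful stop, as described above
    }
}
{code}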

> Already consumed messages are redelivered after server restart (possible 
> Journal corruption)
> --------------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-3260
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3260
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: AMQP, Broker
>    Affects Versions: 2.17.0
>         Environment: Embedded Apache Artemis 2.17.0
> Windows Server 2016 Standard (10.0.14393)
> Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
> Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
>            Reporter: Christian Danner
>            Priority: Blocker
>         Attachments: ExampleClient.java, PagingStoreImpl.java, 
> QueueImpl.java, activemq_artemis.log, broker.xml, broker.zip, 
> restart_log_1_no_recovery-2021-04-22_08.46.52.log, 
> restart_log_2_recovery-2021-04-22_08.48.49.log
>
>
> After upgrading from Artemis 2.15.0 to 2.17.0 we are experiencing situations 
> (non-deterministic but quite regular) where the embedded Apache Artemis 
> instance redelivers messages to a client after a server restart.
> Those messages were already processed successfully before the restart, and 
> the web console shows a message count of 0 prior to restarting the process.
> It is also important to note that once those stuck messages (which seemingly 
> appear out of nowhere) have been reprocessed, newly added messages are 
> processed just fine. So what we are seeing is the following:
>  # At some point in time messages A,B,C were routed to Queue Q and 
> successfully consumed
>  # Q is empty (web console)
>  # perform broker restart
>  # client consumes A,B,C from Q again
>  # Q is empty (web console)
>  # another client sends X,Y,Z to Q
>  # client consumes X,Y,Z
>  # Q is empty (web console)
>  # perform broker restart
>  # client consumes A,B,C from Q again!
> This happens again and again on each broker restart, up to the point where 
> the broker finally manages to recover from this situation by detecting an 
> invalid (negative) address size, as indicated by the following log output:
> {quote}2021-04-22 21:04:51.897 WARN 
> org.apache.activemq.artemis.core.paging.impl.PagingStoreImpl.addSize(PagingStoreImpl.java:753)
>  [Thread-1 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@26bb92e2)]
>  [ARTEMIS] AMQ222214: Destination incoming.message has an inconsistent and 
> negative address size=-3,379.
> {quote}
> {quote}2021-04-22 21:04:51.897 WARN 
> org.apache.activemq.artemis.core.paging.impl.PagingStoreImpl.addSize(PagingStoreImpl.java:753)
>  [Thread-1 
> (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@26bb92e2)]
>  [ARTEMIS] AMQ222214: Destination incoming.message has an inconsistent and 
> negative address size=-3,451.
> {quote}
>  
> The full log file of such a situation (where the broker managed to recover) 
> is attached, together with the broker.xml file that we use as a template to 
> configure the embedded instance programmatically.
> The broker runs embedded with the client, which consumes messages via AMQP 
> using the Apache Qpid JMS library (JMS 2.0, v0.57.0). There is only a single 
> thread ever consuming from a queue, and we use transactions to explicitly 
> commit or roll back received messages, with prefetch disabled 
> (jms.prefetchPolicy.all=0).
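>
> For reference, a sketch of the factory setup with that option (host and port 
> are placeholders, not our actual configuration):
>
> {code:java}
> import javax.jms.ConnectionFactory;
>
> import org.apache.qpid.jms.JmsConnectionFactory;
>
> // Prefetch disabled for all destination types via the Qpid JMS URI option.
> ConnectionFactory factory =
>     new JmsConnectionFactory("amqp://localhost:5672?jms.prefetchPolicy.all=0");
> {code}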
> Further investigation / debugging has shown that when messages are 
> redelivered, the above log outputs concerning the negative address size are 
> absent. The reason is that the value returned by 
> messageReference.getMessageMemoryEstimate() is different for the exact same 
> message in line 1022 of class 
> org.apache.activemq.artemis.core.server.impl.QueueImpl.
> This difference stems from a different value being calculated in the 
> AMQPStandardMessage class (getMemoryEstimate()), and the difference is equal 
> to the value returned by 
> unmarshalledApplicationPropertiesMemoryEstimateFromData(), so I assume that 
> the applicationProperties are sometimes not being considered (I still have 
> to verify this).
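>
> To illustrate the suspected effect with a simplified model (this is not the 
> actual Artemis code): if the memory estimate used when a message is added 
> differs from the estimate used when it is removed, the address size counter 
> drifts and can go negative:
>
> {code:java}
> // Simplified model of the suspected accounting drift (hypothetical numbers):
> long addressSize = 0;
> int estimateAtRoute = 310; // applicationProperties not (yet) considered
> int estimateAtAck = 350;   // applicationProperties included in the estimate
> addressSize += estimateAtRoute; // message added to the queue
> addressSize -= estimateAtAck;   // message acknowledged and removed
> // addressSize is now -40 instead of 0; accumulated over many messages this
> // yields the negative address size reported by AMQ222214.
> {code}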
> I can also provide the complete broker journal for such a situation (which 
> we currently use for debugging) if that helps to analyze the issue (approx. 
> 25 MB of files, ~100 kB compressed).
>  
>  


