[
https://issues.apache.org/jira/browse/AMQ-9653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ephemeris Lappis updated AMQ-9653:
----------------------------------
Attachment: BP3.xml
> ActiveMQ doesn't process and deliver messages into/from queues in right order
> -----------------------------------------------------------------------------
>
> Key: AMQ-9653
> URL: https://issues.apache.org/jira/browse/AMQ-9653
> Project: ActiveMQ Classic
> Issue Type: Bug
> Components: Broker
> Affects Versions: 5.18.3
> Environment: ActiveMQ is executed as a pod in Kubernetes clusters,
> with a OpenJDK 64-Bit Server VM version 17.0.9+9 (Eclipse Adoptium).
> The persistance DB is on a VM (not a container)
> The Karaf has the same configuration in the same cluster namespace.
> All in Linux images 5.15.0-116-generic #126-Ubuntu
> Reporter: Ephemeris Lappis
> Priority: Major
> Attachments: BP3.xml
>
>
> After some load tests we can see random cases of messages that are delivered
> in an unexpected order. I our case this breaks the application for which the
> processing order is mandatory.
> FYI, we use a JDBC (PostgreSQL) persistance, although it may not be directly
> concerned.
> We've set some hacks to try to analyze where and when this occurs. First of
> them is a trigger on the "activemq_msgs" table to insert all the messages in
> a mirror table to keep them. Then we've substituted the application's queue
> consumer by a small Camel route that consumes messages and checks their
> logical orders. When a message arrives in a wrong sequence, a log is written
> in Karaf, and the message is copied into another backup queue. We also have
> the logs of the Camel routes that produce the messages that gives the origin
> chronology.
> Here an example where 5 messages constitute the "conversation". The table
> below (I put it at the end of the text) is built from the message mirror
> table. Its columns are :
> - the "id" (primary key) generated by ActiveMQ : this gives a first idea
> about message ordering.
> - the "ie_id" is our business identifier for a given batch of messages.
> - the "correlation_id" has its name says, a ID that may correlate different
> conversations, and is shared all over the application system.
> - the "etape", is the business step. I'll give more details about that later.
> - "date_traitement" is the current date that is set by the producer just
> before the message is sent.
> - "jms_timestamp_ms" and "jms_timestamp" are the JMSTimestamp of the message.
> - the "commit_time" is the pgsql transaction time of the inserted row.
> The logical order of our steps in almost all cases of conversation is :
> - 1 "initier_echange"
> - 1 "initier_utr" for each sub-batch (Logical transactional unit – TU)
> - 0 or more "diffuser_donnees_utr" for each TU (one for every generated
> data).
> - 1 "diffuser_utr" that closes the sub-batch.
> Our most common use cases manage only one UT with 1 or 2 data.
> According to the table, we can see that the "diffuser_utr" has been processed
> by ActiveMQ before the last "diffuser_donnees_utr". This last message seems
> to have been temporarly lost for a while : it's been persisted, and then
> delivered a very long time after it's been sent. See that the
> "date_traitement" is right, but both the "id" of the row and its commit time
> do not match. The JMS timestamps also seem inconsistent, first comparing with
> the "date_traitement", and then with a same value for the 4 last messages,
> even if the last one has a quite delay before it's been inserted and
> dispatched. Finally, the last message has been received by the consumer at
> 2025-01-23T15:45:39,236, thus the thrown error.
> This is just an example. In some other cases the very first message is
> delivered after all the others.
> On the broker side, no log about any strange behavior. The system and JVM
> monitoring do not show any resource lack issue.
> FYI, our load tests are done with a JMeter injector that set a 20 threads
> group, each one sending messages to 16 queues with loops from 10 to 100. The
> 16 queues are processed by 16 Camel routes that indirectly produce 4 or 5
> messages into the "buggy queue" for every received input. In the application,
> a single application with a single consumer processes them. The same in our
> testing Camel consumer.
> As the broker is some kind of black box for us, we can't give more
> information about how it manages this kind of load test that is more or less
> what could happen in a real rush situation.
> Of course I can provide us with any detail you could ask for. I've not found
> any interesting case googling for similar issues, but perhaps someone already
> had some.
> {{Thanks in advance.}}
> {noformat}
> |id |ie_id |correlation_id
> |etape |date_traitement
> |jms_timestamp_ms|jms_timestamp |commit_time |
> |-----|----------------------------------|----------------------------------|------------------|----------------------|--------------|---------------------------|----------------------------|
> |8 124
> |5009d4f3-dfac-4de2-8360-32999ee6aac7|6b371cdc-7071-4dfa-8c58-231770df4f8e|initier_echange
> |2025-01-23T14:45:15.963Z|1737643516108 |2025-01-23 15:45:16.108
> +0100|2025-01-23 15:45:15.964 +0100|
> |8 142
> |5009d4f3-dfac-4de2-8360-32999ee6aac7|6b371cdc-7071-4dfa-8c58-231770df4f8e|initier_utr
> |2025-01-23T14:45:16.003Z|1737643515969 |2025-01-23 15:45:15.969
> +0100|2025-01-23 15:45:16.005 +0100|
> |8 165
> |5009d4f3-dfac-4de2-8360-32999ee6aac7|6b371cdc-7071-4dfa-8c58-231770df4f8e|diffuser_donnees_utr|2025-01-23T14:45:16.062Z|1737643515969
> |2025-01-23 15:45:15.969 +0100|2025-01-23 15:45:16.064 +0100|
> |8 166
> |5009d4f3-dfac-4de2-8360-32999ee6aac7|6b371cdc-7071-4dfa-8c58-231770df4f8e|diffuser_utr
> |2025-01-23T14:45:16.067Z|1737643515969 |2025-01-23 15:45:15.969
> +0100|2025-01-23 15:45:16.069 +0100|
> |18
> 982|5009d4f3-dfac-4de2-8360-32999ee6aac7|6b371cdc-7071-4dfa-8c58-231770df4f8e|diffuser_donnees_utr|2025-01-23T14:45:16.062Z|1737643515969
> |2025-01-23 15:45:15.969 +0100|2025-01-23 15:45:39.265 +0100|{noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
For further information, visit: https://activemq.apache.org/contact