Hi all,
Could someone point me to documentation on options for handling a problem
I've encountered while testing Artemis versions later than 2.30?

I have a set of ~50 Java microservices processing 1.5 million messages,
which communicate among themselves via Artemis using OpenWire.
For testing I have a dataset containing many small messages and a realistic
number of large messages (> 10 KB). It simulates a full production day but
can be processed in under 1 hour.

Using Artemis 2.30 I can get my data processed; with later versions,
however, the system hangs due to a large backlog on an important queue
which all services use to write trace data.
All microservices process data in batches of at most 100 messages while
reading from and writing to the queues.

The main issue is that when this happens, both the reading and the writing
clients of this queue simply hang, waiting for an ACK from Artemis which
never comes.
During this time Artemis does not log anything suspicious. However, more
than 10 minutes later I do see paging start and I get this warning:

2024-05-10 09:26:12,605 INFO [io.hawt.web.auth.LoginServlet] Hawtio login
is using 1800 sec. HttpSession timeout
2024-05-10 09:26:12,614 INFO [io.hawt.web.auth.LoginServlet] Logging in
user: webadmin
2024-05-10 09:26:12,906 INFO [io.hawt.web.auth.keycloak.KeycloakServlet]
Keycloak integration is disabled
2024-05-10 09:26:12,955 INFO [io.hawt.web.proxy.ProxyServlet] Proxy servlet
is disabled
2024-05-10 09:48:56,955 INFO [org.apache.activemq.artemis.core.server]
AMQ222038: Starting paging on address 'IMS.PRINTS.V2'; size=10280995859
bytes (1820178 messages); maxSize=-1 bytes (-1 messages);
globalSize=10309600627 bytes (1825169 messages); globalMaxSize=10309599232
bytes (-1 messages);
2024-05-10 09:48:56,962 WARN [org.apache.activemq.artemis.core.server.Queue]
AMQ224127: Message dispatch from paging is blocked. Address
IMS.PRINTS.V2/Queue IMS.PRINTS.V2 will not read any more messages from
paging until pending messages are acknowledged. There are currently 14500
messages pending (51829364 bytes) with max reads at maxPageReadMessages(-1)
and maxPageReadBytes(20971520). Either increase reading attributes at the
address-settings or change your consumers to acknowledge more often.
2024-05-10 09:49:24,458 INFO [org.apache.activemq.artemis.core.server]
AMQ222038: Starting paging on address 'PRINTS'; size=28608213 bytes (4992
messages); maxSize=-1 bytes (-1 messages); globalSize=10309604144 bytes
(1825170 messages); globalMaxSize=10309599232 bytes (-1 messages);
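
To make the numbers in the log above easier to read, here is a minimal Java
sketch that only redoes the arithmetic. All values are copied from the
AMQ222038 and AMQ224127 lines; the class and method names are made up for
illustration, and the comparison of pending bytes against maxPageReadBytes
is my assumption about what the warning text means, not something taken
from the Artemis source.

```java
// Sanity check of the figures reported in the broker log above.
// PagingLogCheck and its methods are hypothetical helpers, not Artemis API.
final class PagingLogCheck {
    // From the first AMQ222038 line (address IMS.PRINTS.V2).
    static final long GLOBAL_SIZE_BYTES = 10_309_600_627L;
    static final long GLOBAL_MAX_SIZE_BYTES = 10_309_599_232L;

    // From the AMQ224127 warning.
    static final long PENDING_MESSAGES = 14_500L;
    static final long PENDING_BYTES = 51_829_364L;
    static final long MAX_PAGE_READ_BYTES = 20_971_520L; // 20 * 1024 * 1024

    // Paging starts once the global size passes the global maximum.
    static long globalOvershootBytes() {
        return GLOBAL_SIZE_BYTES - GLOBAL_MAX_SIZE_BYTES;
    }

    // Assumption: dispatch from paging is blocked while the bytes pending
    // acknowledgement exceed the page read limit.
    static boolean pendingExceedsReadLimit() {
        return PENDING_BYTES > MAX_PAGE_READ_BYTES;
    }

    // Average size of a delivered-but-unacknowledged message (integer division).
    static long averagePendingMessageBytes() {
        return PENDING_BYTES / PENDING_MESSAGES;
    }

    public static void main(String[] args) {
        System.out.println("global overshoot (bytes): " + globalOvershootBytes());
        System.out.println("pending exceeds read limit: " + pendingExceedsReadLimit());
        System.out.println("avg pending message (bytes): " + averagePendingMessageBytes());
    }
}
```

So the bytes pending acknowledgement are roughly 2.5 times the reported read
limit, while the global size is only just over the global maximum.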

From the commit that added this warning, it looks like maxPageReadBytes is
not something I can actually change, so I assume any solution needs to
happen earlier in the chain:
https://www.mail-archive.com/commits@activemq.apache.org/msg61667.html

However, I can't find anything sensible to configure that would help here.
I am also concerned that this suddenly appeared with a new Artemis version.
I can reproduce the hang with my dataset and verify that it still works
fine with 2.30.
The fact that there is nothing in the logs also worries me.
Reconnecting the clients allows messages to be processed again, but the
system hangs again very quickly because all the pending transactions
restart and lead to the same issue.

Running on OpenShift using the AMQ Cloud operator.
Clients connect via OpenWire using activemq-client-5.17.2.jar and Java 17.

All input is appreciated and I'm happy to run as many tests as required.

Regards Chris
