> I see in the commit that added this warning that the maxPageReadBytes is not something I can actually change so I assume any solution needs to happen earlier in the chain...
I believe you can set the max-read-page-bytes address setting (alluded to in the error message and referenced in the documentation [1]). Use -1 to remove the limit completely; a minimal broker.xml sketch is included after the quoted message below. However, be careful with -1 as this setting was designed to help the broker avoid out-of-memory conditions. You might also investigate your consumers to ensure they're acking messages within a reasonable time frame. Acking messages in batches is a viable way to improve performance, but if the batches get too big then memory consumption starts to become a problem.

> I can reproduce the crash with my dataset as well as verify that with 2.30 it still works fine.

Can you confirm that you are experiencing a real "crash" of the broker (i.e. the broker shuts down completely, the JVM exits)? Based on the logs it appears you're simply seeing a WARN message and consumption is stalling.


Justin

[1] https://activemq.apache.org/components/artemis/documentation/latest/paging.html#configuration-2


On Fri, May 10, 2024 at 6:51 AM Christian Kurmann <c...@tere.tech> wrote:

> Hi all,
> Would someone be able to guide me to documentation for options to handle
> a problem I've encountered testing Artemis > 2.30.
>
> I have a set of ~50 Java microservices processing 1.5 million messages
> which communicate among themselves via Artemis using OpenWire.
> To test I have a dataset containing lots of small and a realistic amount
> of large messages (> 10KB) which simulates a full prod day but can be
> processed in under 1h.
>
> Using Artemis 2.30 I can get my data processed, however with later
> versions the system hangs due to a large backlog on an important queue
> which is used by all services to write trace data.
> All microservices process data in batches of max 100 msgs while reading
> and writing to the queues.
>
> The main issue is that when this happens, both the reading and writing
> clients of this queue simply hang waiting for Artemis to return an ACK
> which never comes.
> During this time Artemis does not log anything suspicious. However > 10
> minutes later I do see paging starting and I get the warning:
>
> 2024-05-10 09:26:12,605 INFO [io.hawt.web.auth.LoginServlet] Hawtio login is using 1800 sec. HttpSession timeout
> 2024-05-10 09:26:12,614 INFO [io.hawt.web.auth.LoginServlet] Logging in user: webadmin
> 2024-05-10 09:26:12,906 INFO [io.hawt.web.auth.keycloak.KeycloakServlet] Keycloak integration is disabled
> 2024-05-10 09:26:12,955 INFO [io.hawt.web.proxy.ProxyServlet] Proxy servlet is disabled
> 2024-05-10 09:48:56,955 INFO [org.apache.activemq.artemis.core.server] AMQ222038: Starting paging on address 'IMS.PRINTS.V2'; size=10280995859 bytes (1820178 messages); maxSize=-1 bytes (-1 messages); globalSize=10309600627 bytes (1825169 messages); globalMaxSize=10309599232 bytes (-1 messages);
> 2024-05-10 09:48:56,962 WARN [org.apache.activemq.artemis.core.server.Queue] AMQ224127: Message dispatch from paging is blocked. Address IMS.PRINTS.V2/Queue IMS.PRINTS.V2 will not read any more messages from paging until pending messages are acknowledged. There are currently 14500 messages pending (51829364 bytes) with max reads at maxPageReadMessages(-1) and maxPageReadBytes(20971520). Either increase reading attributes at the address-settings or change your consumers to acknowledge more often.
> 2024-05-10 09:49:24,458 INFO [org.apache.activemq.artemis.core.server] AMQ222038: Starting paging on address 'PRINTS'; size=28608213 bytes (4992 messages); maxSize=-1 bytes (-1 messages); globalSize=10309604144 bytes (1825170 messages); globalMaxSize=10309599232 bytes (-1 messages);
>
> I see in the commit that added this warning that the maxPageReadBytes is
> not something I can actually change so I assume any solution needs to
> happen earlier in the chain
> https://www.mail-archive.com/commits@activemq.apache.org/msg61667.html
>
> But I can't find anything sensible to configure to help me here.
> I also am concerned about this suddenly appearing with a new Artemis
> version and I can reproduce the crash with my dataset as well as verify
> that with 2.30 it still works fine.
> The fact that there is nothing in the logs also worries me.
> Reconnecting the clients allows messages to be processed, however the
> system crashes again very quickly due to all the pending transactions
> starting again and leading to the same issue.
>
> Running on OpenShift using the AMQ Cloud operator.
> Clients connect via OpenWire using activemq-client-5.17.2.jar and Java 17.
>
> All input is appreciated and I'm happy to run as many tests as required.
>
> Regards
> Chris
>
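For reference, a minimal sketch of the address-settings change suggested above,
as it would look in broker.xml. This is an illustration rather than a drop-in
config: the match value IMS.PRINTS.V2 is taken from the log, the element names
are the ones described in the paging chapter linked at [1], and the comments
paraphrase the AMQ224127 warning; adjust the match (or use a wildcard) to suit
your own addresses.

    <address-settings>
       <!-- apply to the stalling address from the log; "#" would match all addresses -->
       <address-setting match="IMS.PRINTS.V2">
          <!-- limits on how much paged data may sit in memory awaiting acks
               before the broker stops reading further pages for the queue;
               -1 removes the limit (at the cost of the out-of-memory protection) -->
          <max-read-page-bytes>-1</max-read-page-bytes>
          <max-read-page-messages>-1</max-read-page-messages>
       </address-setting>
    </address-settings>

Since the broker runs under the AMQ operator on OpenShift, the same values
would need to be applied through whatever mechanism the operator exposes for
address settings rather than by editing broker.xml directly.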