Can you at least reproduce it in your test? Use journal retention and look for the message you lost in the ./artemis data print output.
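For reference, here is a minimal sketch of what enabling journal retention in broker.xml could look like. The 7-day period, the 10G storage limit and the data/retention directory are illustrative assumptions, not values taken from this thread; adjust them to your test.

```xml
<core xmlns="urn:activemq:core">
  <!-- Assumed values: keep copies of journal files for 7 days,
       capped at 10G, under data/retention (relative to the broker instance). -->
  <journal-retention-directory period="7" unit="DAYS" storage-limit="10G">data/retention</journal-retention-directory>
</core>
```

With retention enabled, rerun the test, then run ./artemis data print from the broker instance and search its output for the message that went missing.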
It could be a misconfiguration in your cluster. There are no changes between 2.22 and 2.26 that could lead to that. Run your test with retention enabled and I can help you figure out what happened in your test.

On Thu, Oct 20, 2022 at 7:57 AM Gašper Čefarin <gasper.cefa...@actual-it.si> wrote:

> I hope you don't mind if I reply to this thread. I'd also like to report
> messages getting lost.
>
> I've had 2 occurrences of losing messages when using simple replication (1
> live and 1 backup server).
> I was using Artemis v2.22.0.
> I was not able to replicate the issue, and I think it happened when I
> rebooted the live server.
>
> The messages lost were stored persistently, in a durable queue, with no
> consumers online. Not sure about producers.
>
> All I see in the logs are warnings like these two:
>
> - 2022-07-01 14:52:16,282 WARN [org.apache.activemq.artemis.core.server]
> AMQ222092: Connection to the backup node failed, removing replication now:
> ActiveMQRemoteDisconnectException[errorType=REMOTE_DISCONNECT message=null]
>
> - 2022-07-01 14:52:16,295 WARN [org.apache.activemq.artemis.core.client]
> AMQ212037: Connection failure to 10.108.28.52/10.108.28.52:9000 has been
> detected: AMQ219015: The connection was disconnected because of server
> shutdown [code=DISCONNECTED]
>
> The only thing that comes to mind that could be the problem is changing
> the port for cluster communication from the default 61616 to 9000 (I've
> experienced some problems unrelated to message loss when changing the port).
>
> Any advice on reproducing the issue or where to look for more data would be
> appreciated.
>
> -----Original Message-----
> From: Clebert Suconic <clebert.suco...@gmail.com>
> Sent: Wednesday, October 19, 2022 4:58 PM
> To: users@activemq.apache.org
> Subject: Re: Messages getting lost on Artemis 2.25
>
> This message originates from outside our organization. Be careful with its
> content and when opening links or attachments.
>
> Basically I'm telling you how to investigate it..
> and if you find an issue on the broker, we will need a way to reproduce it.
>
> I have no other report about a message loss situation...
>
> (we do have situations with page-counters going wrong while paging, which
> I'm working to fix now... but no message loss).
>
> On Wed, Oct 19, 2022 at 10:55 AM Clebert Suconic <clebert.suco...@gmail.com> wrote:
> >
> > I am not aware of any issues that would lead to message loss...
> >
> > Garbage Collection itself has no effect on anything regarding paging or
> > journal.
> >
> > Are you able to chase which message is lost on a test?
> >
> > You could use the retention feature, replay the message.. and you
> > could also look at the ./artemis data print output to see what happened
> > to the message.
> >
> > One other suggestion I could make is to use Federation instead of
> > clustering. Perhaps messages are stranded on the Store and forward
> > queue?
> >
> > Also.. you have consumers in all the nodes.. you should use clustering
> > with OFF_WITH_REDISTRIBUTION, or use Federation. You should always
> > favor the local consumers.
> >
> > On Wed, Oct 19, 2022 at 8:16 AM Walter de Boer <walterdeb...@dbso.nl> wrote:
> > >
> > > All,
> > >
> > > This week we lost 23,000 messages in a few days' time on our
> > > production Cluster running Artemis 2.26.0, see our settings below.
> > > We've reverted back to Artemis 2.20.0 just in case.
> > >
> > > A few observations:
> > >
> > > * In version 2.24.0, 2.25.0 and 2.26.0 running on ZGC we noticed
> > > messages being produced to a queue without errors, that we didn't
> > > find in that queue. At the same time we saw incorrect counters. We
> > > did restart nodes to resolve, but on one occasion the error
> > > continued for some time after that, and we never found the messages
> > > again. Not even when exporting the journal files. The errors showed
> > > after running a few days.
> > > * In version 2.20.0 running on G1GC and on ZGC we did not lose any
> > > messages.
> > > We did experience memory issues resulting in (too) long
> > > garbage collection times every other week, maybe due to lack of JVM
> > > tuning on our side. We were running 2.20 on G1GC for several
> > > months.
> > >
> > > We're running a symmetric Cluster of 3 live/backup pairs in Docker JRE
> > > (temurin) containers on VMWare CentOS7 hosts. Each live node has
> > > around 1,000 producers & consumers continuously.
> > >
> > > I hope the Artemis community can advise us on this?
> > >
> > > Best Regards,
> > >
> > > Walter
> > >
> > >
> > > Our setup:
> > >
> > > *docker-compose.yaml:*
> > >
> > > version: "3.8"
> > >
> > > services:
> > >   artemis:
> > >     container_name: 'artemis'
> > >     network_mode: "host"
> > >     image: "cdplatform/activemq-artemis:2.26.0"
> > >     restart: 'always'
> > >     hostname: cjiblx8408.ato.cjib.minjus.nl
> > >     volumes:
> > >       - "/data/artemis/data:/var/lib/artemis/data"
> > >       - "/data/artemis/plugins:/var/lib/artemis/lib"
> > >       - "/data/artemis/etc:/var/lib/artemis/etc"
> > >       - "/data/artemis/etc-override:/var/lib/artemis/etc-override"
> > >       - "/logging/artemis:/var/lib/artemis/log"
> > >     environment:
> > >       ARTEMIS_MIN_MEMORY: "14051615047"
> > >       ARTEMIS_MAX_MEMORY: "14051615047"
> > >       JAVA_XTRA_ARGS: "-XX:ActiveProcessorCount=4 -XX:+UseZGC -XX:+UseDynamicNumberOfGCThreads -XX:+UseStringDeduplication "
> > >       BROKER_SETTINGS_FILE: "broker-settings.xml"
> > >       ENABLE_JMX: "true"
> > >       JMX_PORT: "3333"
> > >       ENABLE_JMX_EXPORTER: "true"
> > >       JMX_RMI_PORT: "1098"
> > >     mem_swappiness: 0
> > >     memswap_limit: 20073735782
> > >     deploy:
> > >       resources:
> > >         limits:
> > >           memory: "20073735782"
> > >         reservations:
> > >           memory: "20073735782"
> > >
> > > *Command line options:*
> > >
> > > /opt/java/openjdk/bin/java
> > > -javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent.jar=9404:/opt/jmx-exporter/etc/jmx-exporter-config.yaml
> > > -Xmx17564518809
> > > -Xms17564518809
> > > -Dcom.sun.management.jmxremote.authenticate=true
> > > -Dcom.sun.management.jmxremote.password.file=/var/lib/artemis/etc/jmxremote.password
> > > -Dcom.sun.management.jmxremote.access.file=/var/lib/artemis/etc/jmxremote.access
> > > -Dcom.sun.management.jmxremote.port=3333
> > > -Dcom.sun.management.jmxremote.rmi.port=1098
> > > -Dcom.sun.management.jmxremote.ssl=false
> > > -Djava.net.preferIPv4Addresses=true
> > > -Djava.net.preferIPv4Stack=true
> > > -XX:ActiveProcessorCount=4
> > > -XX:+UseZGC
> > > -XX:+UseDynamicNumberOfGCThreads
> > > -XX:+UseStringDeduplication
> > > -Dhawtio.realm=activemq
> > > -Dhawtio.offline=true
> > > -Dhawtio.role=gs-auth-Artemis_Admin,gs-auth-Artemis_User
> > > -DPrincipalClasses=org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal
> > > -Djolokia.policyLocation=file:/var/lib/artemis/etc/jolokia-access.xml
> > > -Dcom.sun.management.jmxremote.ssl=false
> > > -Xbootclasspath/a:/var/lib/artemis/lib/javax.json-1.1.4.jar
> > > -Dhawtio.role=gs-auth-Artemis_Admin,gs-auth-Artemis_User
> > > -Xbootclasspath/a:/opt/apache-artemis/lib/jboss-logmanager-2.1.18.Final.jar:/opt/apache-artemis/lib/wildfly-common-1.5.2.Final.jar:/opt/apache-artemis/lib/javax.json-1.1.4.jar
> > > -Djava.security.auth.login.config=/var/lib/artemis/etc/login.config
> > > -classpath /opt/apache-artemis/lib/artemis-boot.jar
> > > -Dartemis.home=/opt/apache-artemis
> > > -Dartemis.instance=/var/lib/artemis
> > > -Djava.library.path=/opt/apache-artemis/bin/lib/linux-x86_64
> > > -Djava.io.tmpdir=/var/lib/artemis/tmp
> > > -Ddata.dir=/var/lib/artemis/data
> > > -Dartemis.instance.etc=/var/lib/artemis/etc
> > > -Djava.util.logging.manager=org.jboss.logmanager.LogManager
> > > -Dlogging.configuration=file:/var/lib/artemis/etc//logging.properties
> > > -Dartemis.default.sensitive.string.codec.key=
> > > org.apache.activemq.artemis.boot.Artemis
> > > run
> > >
> > > *broker-settings.xml:*
> > >
> > > <core xmlns="urn:activemq:core">
> > >   <global-max-size>2810323009</global-max-size>
> > >   <name>xxxxxxx.xxxxxx.xx</name>
> > >   <graceful-shutdown-enabled xmlns="urn:activemq:core">true</graceful-shutdown-enabled>
> > >   <graceful-shutdown-timeout xmlns="urn:activemq:core">10000</graceful-shutdown-timeout>
> > >   <management-address xmlns="urn:activemq:core">activemq.management</management-address>
> > >   <persistence-enabled xmlns="urn:activemq:core">true</persistence-enabled>
> > >   <id-cache-size xmlns="urn:activemq:core">20000</id-cache-size>
> > >   <persist-id-cache xmlns="urn:activemq:core">true</persist-id-cache>
> > >   <paging-directory xmlns="urn:activemq:core">data/paging</paging-directory>
> > >   <bindings-directory xmlns="urn:activemq:core">data/bindings</bindings-directory>
> > >   <large-messages-directory xmlns="urn:activemq:core">data/large-messages</large-messages-directory>
> > >   <journal-directory xmlns="urn:activemq:core">data/journal</journal-directory>
> > >   <journal-type xmlns="urn:activemq:core">ASYNCIO</journal-type>
> > >   <journal-datasync xmlns="urn:activemq:core">true</journal-datasync>
> > >   <journal-min-files xmlns="urn:activemq:core">2</journal-min-files>
> > >   <journal-pool-files xmlns="urn:activemq:core">10</journal-pool-files>
> > >   <journal-device-block-size xmlns="urn:activemq:core">4096</journal-device-block-size>
> > >   <journal-file-size xmlns="urn:activemq:core">10MB</journal-file-size>
> > >   <journal-buffer-size xmlns="urn:activemq:core">490KB</journal-buffer-size>
> > >   <journal-compact-min-files xmlns="urn:activemq:core">10</journal-compact-min-files>
> > >   <journal-compact-percentage xmlns="urn:activemq:core">30</journal-compact-percentage>
> > >   <journal-lock-acquisition-timeout xmlns="urn:activemq:core">-1</journal-lock-acquisition-timeout>
> > >   <journal-file-open-timeout xmlns="urn:activemq:core">5</journal-file-open-timeout>
> > >   <journal-sync-non-transactional
> > >     xmlns="urn:activemq:core">true</journal-sync-non-transactional>
> > >   <journal-sync-transactional xmlns="urn:activemq:core">true</journal-sync-transactional>
> > >   <disk-scan-period xmlns="urn:activemq:core">5000</disk-scan-period>
> > >   <max-disk-usage xmlns="urn:activemq:core">90</max-disk-usage>
> > >   <critical-analyzer xmlns="urn:activemq:core">true</critical-analyzer>
> > >   <critical-analyzer-timeout xmlns="urn:activemq:core">120000</critical-analyzer-timeout>
> > >   <critical-analyzer-check-period xmlns="urn:activemq:core">60000</critical-analyzer-check-period>
> > >   <critical-analyzer-policy xmlns="urn:activemq:core">LOG</critical-analyzer-policy>
> > >   <page-sync-timeout xmlns="urn:activemq:core">548000</page-sync-timeout>
> > >   <acceptors xmlns="urn:activemq:core">
> > >     <acceptor name="artemis">tcp://0.0.0.0:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;amqpMinLargeMessageSize=102400;connectionsAllowed=1536;directDeliver=false;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpDuplicateDetection=true;protocols=CORE,AMQP,STOMP,HORNETQ,OPENWIRE;</acceptor>
> > >   </acceptors>
> > >   <connectors xmlns="urn:activemq:core">
> > >     <connector name="artemis">tcp://xxxxxxx.xxxxxx.xx:61616</connector>
> > >   </connectors>
> > >   <cluster-user xmlns="urn:activemq:core">artemis</cluster-user>
> > >   <cluster-password xmlns="urn:activemq:core">xxxxxxxx</cluster-password>
> > >   <broadcast-groups xmlns="urn:activemq:core">
> > >     <broadcast-group name="bg-group1">
> > >       <group-address>231.7.7.10</group-address>
> > >       <group-port>9876</group-port>
> > >       <broadcast-period>5000</broadcast-period>
> > >       <connector-ref>artemis</connector-ref>
> > >     </broadcast-group>
> > >   </broadcast-groups>
> > >   <discovery-groups xmlns="urn:activemq:core">
> > >     <discovery-group name="dg-group1">
> > >       <group-address>231.7.7.10</group-address>
> > >       <group-port>9876</group-port>
> > >       <refresh-timeout>10000</refresh-timeout>
> > >     </discovery-group>
> > >   </discovery-groups>
> > >   <cluster-connections xmlns="urn:activemq:core">
> > >     <cluster-connection name="artemis-ato">
> > >       <connector-ref>artemis</connector-ref>
> > >       <retry-interval>2000</retry-interval>
> > >       <initial-connect-attempts>1000</initial-connect-attempts>
> > >       <reconnect-attempts>1000</reconnect-attempts>
> > >       <message-load-balancing>ON_DEMAND</message-load-balancing>
> > >       <max-hops>1</max-hops>
> > >       <discovery-group-ref discovery-group-name="dg-group1"/>
> > >     </cluster-connection>
> > >   </cluster-connections>
> > >   <ha-policy xmlns="urn:activemq:core">
> > >     <replication>
> > >       <master>
> > >         <check-for-live-server>true</check-for-live-server>
> > >         <vote-on-replication-failure>true</vote-on-replication-failure>
> > >         <group-name>ato-hapair-1</group-name>
> > >       </master>
> > >     </replication>
> > >   </ha-policy>
> > >   <metrics xmlns="urn:activemq:core">
> > >     <jvm-memory>true</jvm-memory>
> > >     <jvm-gc>true</jvm-gc>
> > >     <jvm-threads>true</jvm-threads>
> > >     <plugin class-name="org.apache.activemq.artemis.core.server.metrics.plugins.ArtemisPrometheusMetricsPlugin"/>
> > >   </metrics>
> > >   <security-settings xmlns="urn:activemq:core">
> > >     <security-setting match="activemq.management">
> > >       <permission type="manage" roles="amq,service"/>
> > >     </security-setting>
> > >     <security-setting match="#">
> > >       <permission type="manage" roles="amq,service"/>
> > >       <permission type="send" roles="amq,service,b2bi"/>
> > >       <permission type="consume" roles="amq,service,b2bi"/>
> > >       <permission type="browse" roles="amq,service"/>
> > >       <permission type="createAddress" roles="amq,service"/>
> > >       <permission type="deleteAddress" roles="amq,service"/>
> > >       <permission type="createDurableQueue" roles="amq,service"/>
> > >       <permission type="deleteDurableQueue" roles="amq,service"/>
> > >       <permission type="createNonDurableQueue" roles="amq,service"/>
> > >       <permission type="deleteNonDurableQueue"
> > >         roles="amq,service"/>
> > >     </security-setting>
> > >     <role-mapping from="gs-auth-Artemis_Admin" to="amq"/>
> > >     <role-mapping from="gs-auth-Artemis_User" to="service"/>
> > >   </security-settings>
> > >   <address-settings xmlns="urn:activemq:core">
> > >     <address-setting match="activemq.management#">
> > >       <dead-letter-address>DLQ</dead-letter-address>
> > >       <expiry-address>ExpiryQueue</expiry-address>
> > >       <redelivery-delay>0</redelivery-delay>
> > >       <message-counter-history-day-limit>10</message-counter-history-day-limit>
> > >       <max-size-bytes>-1</max-size-bytes>
> > >       <max-size-messages>-1</max-size-messages>
> > >       <address-full-policy>PAGE</address-full-policy>
> > >       <auto-create-queues>true</auto-create-queues>
> > >       <auto-create-addresses>true</auto-create-addresses>
> > >       <auto-create-jms-queues>true</auto-create-jms-queues>
> > >       <auto-create-jms-topics>true</auto-create-jms-topics>
> > >     </address-setting>
> > >     <address-setting match="#">
> > >       <dead-letter-address>DLQ</dead-letter-address>
> > >       <expiry-address>ExpiryQueue</expiry-address>
> > >       <redelivery-delay>0</redelivery-delay>
> > >       <message-counter-history-day-limit>10</message-counter-history-day-limit>
> > >       <max-size-bytes>-1</max-size-bytes>
> > >       <max-size-messages>-1</max-size-messages>
> > >       <address-full-policy>PAGE</address-full-policy>
> > >       <auto-create-queues>true</auto-create-queues>
> > >       <auto-create-addresses>true</auto-create-addresses>
> > >       <auto-create-jms-queues>true</auto-create-jms-queues>
> > >       <auto-create-jms-topics>true</auto-create-jms-topics>
> > >     </address-setting>
> > >     <address-setting match="jms.#">
> > >       <dead-letter-address>DLQ</dead-letter-address>
> > >       <expiry-address>ExpiryQueue</expiry-address>
> > >       <max-delivery-attempts>5</max-delivery-attempts>
> > >       <redelivery-delay>500</redelivery-delay>
> > >       <redelivery-delay-multiplier>1.5</redelivery-delay-multiplier>
> > >       <redelivery-collision-avoidance-factor>0.5</redelivery-collision-avoidance-factor>
> > >       <redistribution-delay>30000</redistribution-delay>
> > >       <send-to-dla-on-no-route>true</send-to-dla-on-no-route>
> > >       <max-size-bytes>-1</max-size-bytes>
> > >       <max-size-messages>-1</max-size-messages>
> > >       <address-full-policy>PAGE</address-full-policy>
> > >       <message-counter-history-day-limit>10</message-counter-history-day-limit>
> > >       <auto-create-queues>false</auto-create-queues>
> > >       <auto-delete-queues>false</auto-delete-queues>
> > >       <auto-delete-created-queues>false</auto-delete-created-queues>
> > >       <auto-delete-queues-delay>30000</auto-delete-queues-delay>
> > >       <config-delete-queues>OFF</config-delete-queues>
> > >       <auto-create-addresses>false</auto-create-addresses>
> > >       <auto-delete-addresses>false</auto-delete-addresses>
> > >       <auto-delete-addresses-delay>30000</auto-delete-addresses-delay>
> > >       <config-delete-addresses>OFF</config-delete-addresses>
> > >     </address-setting>
> > >     <address-setting match="activemq.notifications">
> > >       <max-size-bytes>-1</max-size-bytes>
> > >       <max-size-messages>-1</max-size-messages>
> > >       <address-full-policy>PAGE</address-full-policy>
> > >     </address-setting>
> > >     <address-setting match="jms.queue.#">
> > >       <default-address-routing-type>ANYCAST</default-address-routing-type>
> > >       <default-queue-routing-type>ANYCAST</default-queue-routing-type>
> > >     </address-setting>
> > >     <address-setting match="jms.topic.#">
> > >       <default-address-routing-type>MULTICAST</default-address-routing-type>
> > >       <default-queue-routing-type>MULTICAST</default-queue-routing-type>
> > >     </address-setting>
> > >   </address-settings>
> > >   <addresses xmlns="urn:activemq:core">
> > >     <address name="DLQ">
> > >       <anycast>
> > >         <queue name="DLQ"/>
> > >       </anycast>
> > >     </address>
> > >     <address name="ExpiryQueue">
> > >       <anycast>
> > >         <queue name="ExpiryQueue"/>
> > >       </anycast>
> > >     </address>
> > >   </addresses>
> > >   <broker-plugins
> > >     xmlns="urn:activemq:core">
> > >     <broker-plugin class-name="org.apache.activemq.artemis.core.server.plugin.impl.LoggingActiveMQServerPlugin">
> > >       <property key="LOG_ALL_EVENTS" value="false"/>
> > >       <property key="LOG_CONNECTION_EVENTS" value="false"/>
> > >       <property key="LOG_SESSION_EVENTS" value="false"/>
> > >       <property key="LOG_CONSUMER_EVENTS" value="false"/>
> > >       <property key="LOG_DELIVERING_EVENTS" value="false"/>
> > >       <property key="LOG_SENDING_EVENTS" value="false"/>
> > >       <property key="LOG_INTERNAL_EVENTS" value="false"/>
> > >     </broker-plugin>
> > >   </broker-plugins>
> > > </core>
> > >
> >
> > --
> > Clebert Suconic
>
> --
> Clebert Suconic
>
> NOTICE - NOT TO BE REMOVED.
> This e-mail and any attachments are confidential and may contain legally
> privileged information and/or copyright material of Actual I.T. or third
> parties. If you are not an authorised recipient of this e-mail, please
> contact Actual I.T. immediately by return email or by telephone or
> facsimile on the above numbers.
> You should not read, print, re-transmit, store or act in reliance on this
> email or any attachments and you should destroy all copies of them.

--
Clebert Suconic