[ https://issues.apache.org/jira/browse/AMQ-6894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Shakirin updated AMQ-6894:
---------------------------------
    Comment: was deleted

(was: The following client code, configured with the failover transport, will kill the server after some time:
{code}
public void onMessage(Message message) {
    ...
    StringBuffer bigBuffer = new StringBuffer(Short.MAX_VALUE);
    for (int i = 1; i < Short.MAX_VALUE; i++) {
        bigBuffer.append("1");
    }
    throw new RuntimeException(bigBuffer.toString());
}
{code}
I see two AMQ problems here:
1) The exception message has to be checked and limited before it is set in dlqDeliveryFailureCause: some exceptions come from third-party code and are not under the control of the client handler. The following code in ActiveMQMessageConsumer.rollback() has to be fixed:
{code}
ack.setPoisonCause(new Throwable("Exceeded redelivery policy limit:" + redeliveryPolicy
        + ", cause:" + lastMd.getRollbackCause(), lastMd.getRollbackCause()));
{code}
It is necessary to check the length of lastMd.getRollbackCause().toString() and cut it to a reasonable length.
2) Failover reconnection on EVERY IOException is IMO very dangerous.)

> Excessive number of connections by failover transport with priorityBackup
> -------------------------------------------------------------------------
>
>                 Key: AMQ-6894
>                 URL: https://issues.apache.org/jira/browse/AMQ-6894
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.14.5
>            Reporter: Andrei Shakirin
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>         Attachments: activemq-part.zip
>
>
> My clients connect to AMQ with this connection string:
> (tcp://amq1:61616,tcp://amq2:61616)?randomize=false&priorityBackup=true
> It works - for some time. But sooner or later my AMQ server becomes
> unresponsive because the host it runs on runs out of resources (threads).
> Suddenly the AMQ server log explodes with messages like:
> {code}
> 2018-01-26 09:26:16,909 | WARN | Failed to register MBean org.apache.activemq:type=Broker,brokerName=activemq-vm-primary,connector=clientConnectors,connectorName=default,connectionViewType=clientId,connectionName=ID_ca8f70e115d0-37087-1516883370639-0_22 | org.apache.activemq.broker.jmx.ManagedTransportConnection | ActiveMQ Transport: tcp:///172.10.7.56:55548@61616
> 2018-01-26 09:26:21,375 | WARN | Ignoring ack received before dispatch; result of failover with an outstanding ack. Acked messages will be replayed if present on this broker. Ignored ack: MessageAck {commandId = 157, responseRequired = false, ackType = 2, consumerId = ID:ca8f70e115d0-37087-1516883370639-1:22:10:1, firstMessageId = ID:a95345a9c0df-33771-1516883685728-1:17:5:1:23, lastMessageId = ID:a95345a9c0df-33771-1516883685728-1:17:5:1:23, destination = queue://MY_QUEUE_OUT, transactionId = null, messageCount = 1, poisonCause = null} | org.apache.activemq.broker.region.PrefetchSubscription | ActiveMQ Transport: tcp:///172.16.6.56:55464@61616
> 2018-01-26 09:26:39,211 | WARN | Transport Connection to: tcp://172.10.6.56:55860 failed: java.net.SocketException: Connection reset | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ InactivityMonitor Worker
> 2018-01-26 09:26:47,175 | WARN | Transport Connection to: tcp://172.10.6.56:57012 failed: java.net.SocketException: Broken pipe (Write failed) | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ InactivityMonitor Worker
> {code}
> After a short period of time the AMQ server runs out of resources with a
> "java.lang.OutOfMemoryError: unable to create new native thread" error.
> The AMQ service process in this case has a huge number of threads
> (several thousand).
>
> The client-side log contains many reconnection-attempt messages like:
> {code}
> 2018-01-26 00:10:31,387 WARN [{{bundle.name,org.apache.activemq.activemq-osgi}{bundle.version,5.14.1}{bundle.id,181}}] [null] org.apache.activemq.transport.failover.FailoverTransport Failed to connect to [tcp://activemq-vm-primary:61616, tcp://activemq-vm-secondary:61616] after: 810 attempt(s) continuing to retry.
> {code}
> It seems that the client creates a huge number of connections through
> failover retries and after some time kills the server.
> The issue looks very similar to the one described in
> https://issues.apache.org/jira/browse/AMQ-6603; however, the server isn't
> configured with access control settings.
> I found a description of a similar problem in
> [http://activemq.2283324.n4.nabble.com/ActiveMQ-5-2-OutOfMemoryError-unable-to-create-new-native-thread-td2366585.html],
> but without a concrete suggestion.
>
> Part of the server log is attached.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
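
Regarding point 1 of the deleted comment (cutting lastMd.getRollbackCause().toString() to a reasonable length), the idea could be sketched roughly as below. This is only an illustration, not ActiveMQ code: the class RollbackCauseLimiter, the method describe() and the 1024-character limit are hypothetical names and values chosen for the example, since the issue does not say what a "reasonable length" is.

```java
// Hypothetical helper, not part of ActiveMQ: limits the text of a rollback
// cause before it is concatenated into the poison-cause message.
final class RollbackCauseLimiter {

    // Assumed default limit; the issue does not define a concrete value.
    static final int MAX_CAUSE_LENGTH = 1024;

    private RollbackCauseLimiter() {
    }

    // Returns String.valueOf(cause) unchanged if it fits into maxLength
    // characters; otherwise returns its first maxLength characters followed
    // by a marker stating how many characters were dropped.
    static String describe(Throwable cause, int maxLength) {
        if (maxLength < 0) {
            throw new IllegalArgumentException("maxLength must not be negative");
        }
        String text = String.valueOf(cause); // yields "null" for a null cause
        if (text.length() <= maxLength) {
            return text;
        }
        int dropped = text.length() - maxLength;
        return text.substring(0, maxLength) + "...[truncated " + dropped + " chars]";
    }
}
```

In rollback(), the string concatenation could then use RollbackCauseLimiter.describe(lastMd.getRollbackCause(), RollbackCauseLimiter.MAX_CAUSE_LENGTH) instead of the raw cause. Note that the Throwable passed as the second constructor argument would still carry the full original message, so limiting the string alone may not be enough to bound the size of the ack.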