Keith Wall created QPIDJMS-365: ---------------------------------- Summary: [Failover] failover.reconnectDelay not applied between connection attempts if peer Closes gracefully Key: QPIDJMS-365 URL: https://issues.apache.org/jira/browse/QPIDJMS-365 Project: Qpid JMS Issue Type: Bug Components: qpid-jms-client Reporter: Keith Wall
When using Broker-J's High Availability feature, the client's failover abilities are used to allow the client to discover which Broker in the HA group has the master role. The client tries each Broker on the failover list until until one successfully responds to the {{Open}} indicating that it is current master. When a node is not in the master role, it gracefully closes the AMQP connection by sending a {{Close}} performative. During election periods, it is normal for all nodes in the HA group to respond with the {{Close}} until the election concludes. {noformat} Close{error=Error{condition=amqp:not-found, description='Virtual host 'localhost' is not active', info=null}} {noformat} The QpidJMS Client failover options includes a {{failover.reconnectDelay}} which "Controls the delay between successive reconnection attempts". However it appears that the reconnection delay is only applied between attempts when a connection fails owing to a 'transport' level failure (connection refused etc). If the connection fails at the AMQP level, the delay is not applied. This is impactful to the HA use-case for Broker-J. It means that during periods of reelection, the client, tightly spins in the reconnection loop, excessively consuming system resources. It is also necessary to ensure that {{failover.maxReconnectAttempts}} is set sufficiently large to allow for an election period to conclude successfully. Whilst the user could use unlimited reconnection attempts, this is unpleasant as it means the system won't fail in the case where the election does not conclude within a reasonable time period. Extract of TRACE level logging from {{org.apache.qpid.jms.provider.failover.FailoverProvider}} for the case when a AMQP connection is closed gracefully ({{Close}} performative): {noformat} 2018-03-13 11:04:54,951 [lization thread] - DEBUG FailoverProvider - Failover: the provider reports failure: Connection closed by external action [condition = amqp:connection:forced] 2018-03-13 11:04:54,951 [lization thread] - DEBUG FailoverProvider - handling Provider failure: Connection closed by external action [condition = amqp:connection:forced] 2018-03-13 11:04:54,951 [lization thread] - TRACE FailoverProvider - stack java.io.IOException: Connection closed by external action [condition = amqp:connection:forced] at org.apache.qpid.jms.util.IOExceptionSupport.create(IOExceptionSupport.java:45) at org.apache.qpid.jms.provider.amqp.AmqpProvider.fireProviderException(AmqpProvider.java:1086) at org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.closeResource(AmqpAbstractResource.java:182) at org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.processRemoteClose(AmqpAbstractResource.java:262) at org.apache.qpid.jms.provider.amqp.AmqpProvider.processUpdates(AmqpProvider.java:949) at org.apache.qpid.jms.provider.amqp.AmqpProvider.access$1900(AmqpProvider.java:104) at org.apache.qpid.jms.provider.amqp.AmqpProvider$17.run(AmqpProvider.java:831) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: javax.jms.JMSException: Connection closed by external action [condition = amqp:connection:forced] at org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToException(AmqpSupport.java:164) at org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToException(AmqpSupport.java:117) ... 11 more 2018-03-13 11:04:54,954 [ connect thread] - DEBUG FailoverProvider - Connection attempt:[1] to: amqp://localhost:5672 in-progress 2018-03-13 11:04:55,003 [lization thread] - DEBUG FailoverProvider - Signalling connection recovery: AmqpProvider: localhost:5672 2018-03-13 11:04:55,007 [lization thread] - DEBUG FailoverProvider - handling Provider failure: Virtual host 'localhost' is not active [condition = amqp:not-found] {noformat} Extract of TRACE level logging from {{org.apache.qpid.jms.provider.failover.FailoverProvider}} for the case when a AMQP connection is closed at the transport level (Connection Refused): {noformat} 2018-03-13 11:03:47,069 [lization thread] - TRACE FailoverProvider - stack java.io.IOException: Transport connection remotely closed. at org.apache.qpid.jms.provider.amqp.AmqpProvider$20.run(AmqpProvider.java:901) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 2018-03-13 11:03:47,089 [ connect thread] - DEBUG FailoverProvider - Connection attempt:[1] to: amqp://localhost:5672 in-progress 2018-03-13 11:03:47,095 [ connect thread] - INFO FailoverProvider - Connection attempt:[1] to: amqp://localhost:5672 failed 2018-03-13 11:03:47,095 [ connect thread] - TRACE FailoverProvider - Next reconnect attempt will be in 10000 milliseconds {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org For additional commands, e-mail: dev-h...@qpid.apache.org