[
https://issues.apache.org/jira/browse/QPIDJMS-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16422650#comment-16422650
]
ASF subversion and git services commented on QPIDJMS-365:
---------------------------------------------------------
Commit b22a1e7ad898992a60ffb8ef2c66f58735d8ba1e in qpid-broker-j's branch
refs/heads/master from [~alex.rufous]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-broker-j.git;h=b22a1e7 ]
NO-JIRA: [Broker-J] [System Tests] Remove workaround for qpid jms client not
respecting reconnectDelay (QPIDJMS-365)
> failover.reconnectDelay not applied between attempts if peer remote-closes
> during connection
> --------------------------------------------------------------------------------------------
>
> Key: QPIDJMS-365
> URL: https://issues.apache.org/jira/browse/QPIDJMS-365
> Project: Qpid JMS
> Issue Type: Bug
> Components: qpid-jms-client
> Reporter: Keith Wall
> Assignee: Timothy Bish
> Priority: Major
> Fix For: 0.31.0
>
>
> When using Broker-J's High Availability feature, the client's failover
> abilities are used to allow the client to discover which Broker in the HA
> group has the master role. The client tries each Broker on the failover list
> until until one successfully responds to the {{Open}} indicating that it is
> current master.
>
> When a node is not in the master role, it gracefully closes the AMQP
> connection by sending a {{Close}} performative. During election periods, it
> is normal for all nodes in the HA group to respond with the {{Close}} until
> the election concludes.
> {noformat}
> Close{error=Error{condition=amqp:not-found, description='Virtual host
> 'localhost' is not active', info=null}}
> {noformat}
> The QpidJMS Client failover options includes a {{failover.reconnectDelay}}
> which "Controls the delay between successive reconnection attempts".
> However it appears that the reconnection delay is only applied between
> attempts when a connection fails owing to a 'transport' level failure
> (connection refused etc). If the connection fails at the AMQP level, the
> delay is not applied.
> This is impactful to the HA use-case for Broker-J. It means that during
> periods of reelection, the client, tightly spins in the reconnection loop,
> excessively consuming system resources. It is also necessary to ensure that
> {{failover.maxReconnectAttempts}} is set sufficiently large to allow for an
> election period to conclude successfully. Whilst the user could use
> unlimited reconnection attempts, this is unpleasant as it means the system
> won't fail in the case where the election does not conclude within a
> reasonable time period.
> Extract of TRACE level logging from
> {{org.apache.qpid.jms.provider.failover.FailoverProvider}} for the case when
> a AMQP connection is closed gracefully ({{Close}} performative):
> {noformat}
> 2018-03-13 11:04:54,951 [lization thread] - DEBUG FailoverProvider
> - Failover: the provider reports failure: Connection closed by external
> action [condition = amqp:connection:forced]
> 2018-03-13 11:04:54,951 [lization thread] - DEBUG FailoverProvider
> - handling Provider failure: Connection closed by external action
> [condition = amqp:connection:forced]
> 2018-03-13 11:04:54,951 [lization thread] - TRACE FailoverProvider
> - stack
> java.io.IOException: Connection closed by external action [condition =
> amqp:connection:forced]
> at
> org.apache.qpid.jms.util.IOExceptionSupport.create(IOExceptionSupport.java:45)
> at
> org.apache.qpid.jms.provider.amqp.AmqpProvider.fireProviderException(AmqpProvider.java:1086)
> at
> org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.closeResource(AmqpAbstractResource.java:182)
> at
> org.apache.qpid.jms.provider.amqp.AmqpAbstractResource.processRemoteClose(AmqpAbstractResource.java:262)
> at
> org.apache.qpid.jms.provider.amqp.AmqpProvider.processUpdates(AmqpProvider.java:949)
> at
> org.apache.qpid.jms.provider.amqp.AmqpProvider.access$1900(AmqpProvider.java:104)
> at
> org.apache.qpid.jms.provider.amqp.AmqpProvider$17.run(AmqpProvider.java:831)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: javax.jms.JMSException: Connection closed by external action
> [condition = amqp:connection:forced]
> at
> org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToException(AmqpSupport.java:164)
> at
> org.apache.qpid.jms.provider.amqp.AmqpSupport.convertToException(AmqpSupport.java:117)
> ... 11 more
> 2018-03-13 11:04:54,954 [ connect thread] - DEBUG FailoverProvider
> - Connection attempt:[1] to: amqp://localhost:5672 in-progress
> 2018-03-13 11:04:55,003 [lization thread] - DEBUG FailoverProvider
> - Signalling connection recovery: AmqpProvider: localhost:5672
> 2018-03-13 11:04:55,007 [lization thread] - DEBUG FailoverProvider
> - handling Provider failure: Virtual host 'localhost' is not active
> [condition = amqp:not-found]
> {noformat}
> Extract of TRACE level logging from
> {{org.apache.qpid.jms.provider.failover.FailoverProvider}} for the case when
> a AMQP connection is closed at the transport level (Connection Refused):
> {noformat}
> 2018-03-13 11:03:47,069 [lization thread] - TRACE FailoverProvider
> - stack
> java.io.IOException: Transport connection remotely closed.
> at
> org.apache.qpid.jms.provider.amqp.AmqpProvider$20.run(AmqpProvider.java:901)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2018-03-13 11:03:47,089 [ connect thread] - DEBUG FailoverProvider
> - Connection attempt:[1] to: amqp://localhost:5672 in-progress
> 2018-03-13 11:03:47,095 [ connect thread] - INFO FailoverProvider
> - Connection attempt:[1] to: amqp://localhost:5672 failed
> 2018-03-13 11:03:47,095 [ connect thread] - TRACE FailoverProvider
> - Next reconnect attempt will be in 10000 milliseconds
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]