[
https://issues.apache.org/jira/browse/ARTEMIS-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662795#comment-16662795
]
Justin Bertram commented on ARTEMIS-2147:
-----------------------------------------
This looks similar to (but not exactly the same as) ARTEMIS-1818.
> Fail over and Fail back race condition with dynamic queues
> ----------------------------------------------------------
>
> Key: ARTEMIS-2147
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2147
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.4.0, 2.5.0, 2.6.3
> Reporter: Derek Wilhelm
> Priority: Major
>
> There appears to be a race condition when dynamically created queues are used
> with replication-based fail over and fail back and the Core JMS client.
> When a fail over and/or fail back occurs, the server will log an exception:
> `ERROR [org.apache.activemq.artemis.core.server] AMQ224016: Caught exception:
> ActiveMQNonExistentQueueException[errorType=QUEUE_DOES_NOT_EXIST
> message=AMQ119017: Queue test.queue does not exist]`
> The client never sees an exception (after the initial connection failure) and
> appears to believe that the reconnection was a success. However, the client
> will no longer receive messages that are sent to the queue. If you step
> through the code in a debugger during a fail over, at the part where the
> consumer is being created, you will not see the problem occur unless you set
> the breakpoint after the address lookup, at which point it will occasionally
> fail. Hence the belief that this is a race condition.
>
> Steps to reproduce:
> 1. Create master server with replication, check-for-live-server=true
> 2. Create backup server with replication, allow-failback=true,
> failback-delay=5000
> 3. Start master server
> 4. Start backup server
> 5. Create a consumer on a dynamically defined, named queue (e.g. test.queue)
> using the Artemis Core JMS client
> 6. Create a producer from another connection on the same queue and start
> sending periodic messages
> 7. Stop the master server
>    - Fail over to the backup will take place. The client will log the
>      connection failure
>    - The error may occur at this point where the backup server will log the
>      aforementioned exception
>    - If the error does occur, the consumer will stop receiving new messages
> 8. Start the master server
>    - Fail back to the master server will take place once it has started
>    - The client will log the connection failure once the master takes over
>    - The error may occur at this point where the master server will log the
>      aforementioned exception
>    - If the error does occur, the consumer will stop receiving new messages
> 9. If the ActiveMQNonExistentQueueException does not occur, repeat steps 7
> and 8.
>
> The exception occurs most often during the fail back to the master server,
> often within only 1 or 2 fail back attempts. This has been seen on 2.4.0,
> 2.5.0, 2.6.3, and 2.7.0-SNAPSHOT.
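
Because the quoted description says the client never sees an exception when the consumer stops receiving, running the reproduction steps may be easier with a client-side check for a consumer that has gone quiet. The following is a minimal sketch of such a check. `StallDetector` and its methods are hypothetical names, not part of Artemis or JMS, and the timeout value is an assumption to be tuned to the producer's send interval.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not an Artemis or JMS class): reports a consumer as
 * stalled when no message has been recorded within a configured timeout.
 */
final class StallDetector {
    private final long timeoutMillis;
    private final AtomicLong lastMessageMillis;

    /** startMillis is treated as the time of the "last" message until one arrives. */
    StallDetector(long timeoutMillis, long startMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastMessageMillis = new AtomicLong(startMillis);
    }

    /** Call from the consumer's message callback each time a message arrives. */
    void recordMessage(long nowMillis) {
        lastMessageMillis.set(nowMillis);
    }

    /** True if more than timeoutMillis has passed since the last recorded message. */
    boolean isStalled(long nowMillis) {
        return nowMillis - lastMessageMillis.get() > timeoutMillis;
    }
}
```

In the reproduction, `recordMessage(System.currentTimeMillis())` could be called from the consumer's `MessageListener`, and `isStalled(...)` polled after each fail over or fail back. With the producer from step 6 sending periodically, a timeout of several send intervals (plus the 5000 ms failback delay) would be a reasonable starting assumption.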