[jira] [Commented] (ARTEMIS-2808) Artemis HA with shared storage strategy does not reconnect with shared storage if reconnection happens at shared storage

Karan Aggarwal (Jira) Wed, 24 Jun 2020 02:21:02 -0700


    [ 
https://issues.apache.org/jira/browse/ARTEMIS-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143691#comment-17143691
 ]


Karan Aggarwal commented on ARTEMIS-2808:
-----------------------------------------

Thanks [~jbertram] for the reply.

 
Tested 2 scenarios in the latest version (i.e. 2.13.0) and we faced issues in both.
The scenarios and observations are listed below:
 # Master holds the live lock and a shared storage outage happens.
_Observation_: Master went down, while Slave could not get the live lock 
even after shared storage became accessible.
 # Slave holds the live lock and a shared storage outage happens.
_Observation_: Slave went down, while Master could not get the live lock 
even after shared storage became accessible.

Please find attached the zip files for the respective scenarios.

Each zip file contains the following:
 * Broker XML files used for the master and slave servers.
 * Debug logs
 * Multiple thread dumps for the server which hangs after shared storage is 
accessible again.

[^Scenario_1.zip][^Scenario_2.zip]

 

Let me know if you require more information.

> Artemis HA with shared storage strategy does not reconnect with shared 
> storage if reconnection happens at shared storage
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-2808
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2808
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.11.0
>         Environment: Windows 10
>            Reporter: Karan Aggarwal
>            Priority: Blocker
>         Attachments: Scenario_1.zip, Scenario_2.zip
>
>
> We verified the behavior of Artemis HA by bringing down the shared storage 
> (VM) while a run was in progress. Here is what we observed: 
> *Scenario:*
>  * While the Artemis services were up and running and a run was in progress, 
> we restarted the machine hosting the shared storage
>  * Shared storage was back up in 5 mins
>  * Both Artemis master and slave did not connect back to the shared storage
>  * We tried stopping the Artemis brokers. The slave stopped, but the master 
> did not stop. We had to kill the process.
>  * We tried to start the Artemis brokers. The master did not start up at all. 
> The slave started successfully.
>  * We restarted the master Artemis server. Server started successfully and 
> acquired back up.
> Shared Storage type: NFS
> Impact: The run is stopped and the Artemis servers need to be started again 
> every time the shared storage connection goes down momentarily.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
