[jira] [Commented] (ARTEMIS-2808) Artemis HA with shared storage strategy does not reconnect with shared storage if reconnection happens at shared storage

Francesco Nigro (Jira) Tue, 06 Oct 2020 05:57:16 -0700


    [ 
https://issues.apache.org/jira/browse/ARTEMIS-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208716#comment-17208716
 ]


Francesco Nigro commented on ARTEMIS-2808:
------------------------------------------

[~Karanbvp] Hi, I'm working on a feature that may help.

First, you need to configure NFS to either fail fast or not, depending on 
your needs (possibly using the NIO journal, fail-fast). Then you can try the 
branch from the PR mentioned on 
https://issues.apache.org/jira/browse/ARTEMIS-2918, setting 
<restart-allowed>true</restart-allowed> in broker.xml for every broker instance 
involved.

Right now I haven't yet exposed any "lingering" time before restart, which 
could be helpful when remote HA file systems with a known failover time are 
used, so the broker will retry the restart immediately. It's still a WIP, so 
use it just to check whether it helps your case ;)
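
To illustrate what such a "lingering" time would mean, here is a minimal 
sketch. It is not Artemis code and not what the PR does: the function name 
restart_with_linger, its parameters, and the use of OSError to signal 
unavailable storage are all assumptions made for the example.

```python
import time
from typing import Callable


def restart_with_linger(
    start_broker: Callable[[], None],
    linger_seconds: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call start_broker until it succeeds, pausing between failed attempts.

    Hypothetical helper: start_broker is assumed to raise OSError while the
    shared storage is unreachable. Returns the number of attempts used and
    re-raises the last OSError if every attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            start_broker()
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise
            # Linger so the remote file system can finish its failover
            # before the next restart attempt.
            sleep(linger_seconds)
    raise AssertionError("unreachable")
```

With linger_seconds set to 0 this degenerates to the immediate retry 
described above; a value close to the known failover time of the storage 
would avoid restart attempts that are bound to fail.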

> Artemis HA with shared storage strategy does not reconnect with shared 
> storage if reconnection happens at shared storage
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-2808
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2808
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.11.0
>         Environment: Windows 10
>            Reporter: Karan Aggarwal
>            Priority: Blocker
>         Attachments: Scenario_1.zip, Scenario_2.zip
>
>
> We verified the behavior of Artemis HA by bringing down the shared storage 
> (VM) while a run was in progress. Here is what we observed: 
> *Scenario:*
>  * While the Artemis services were up and a run was in progress, we 
> restarted the machine hosting the shared storage
>  * Shared storage was back up in 5 mins
>  * Both Artemis master and slave did not connect back to the shared storage
>  * We tried stopping the Artemis brokers. The slave stopped, but the master 
> did not stop. We had to kill the process.
>  * We tried to start the Artemis brokers. The master did not start up at all. 
> The slave started successfully.
>  * We restarted the master Artemis server. Server started successfully and 
> acquired back up.
> Shared Storage type: NFS
> Impact: The run is stopped and the Artemis servers need to be started again 
> every time the shared storage connection goes down momentarily.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
