[
https://issues.apache.org/jira/browse/ARTEMIS-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208716#comment-17208716
]
Francesco Nigro commented on ARTEMIS-2808:
------------------------------------------
[~Karanbvp] Hi, I'm working on a feature that will probably help.
First, you need to configure NFS correctly to either fail fast or not, depending
on your needs (possibly NIO journal with fail-fast). Then you can try the
branch from the PR mentioned in
https://issues.apache.org/jira/browse/ARTEMIS-2918, setting
<restart-allowed>true</restart-allowed> in broker.xml for every broker instance
involved.
Right now I haven't yet exposed any "lingering" time before restart, which could
be helpful when remote HA file systems with a known failover time are used, so
the broker will immediately retry to restart. It's still a WIP, so use it
just to check whether it helps in your case ;)
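
To illustrate what a "lingering" time before restart would change, here is a
minimal sketch in Python. It is not Artemis code and does not reflect the PR's
implementation: restart_with_linger, try_restart, linger_seconds and
max_attempts are all hypothetical names. With linger_seconds=0 it models the
immediate retry described above; a positive value waits between attempts, which
may suit a file system with a known failover time.

```python
import time
from typing import Callable


def restart_with_linger(
    try_restart: Callable[[], bool],
    linger_seconds: float,
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call try_restart until it succeeds; return the number of attempts used.

    try_restart is a hypothetical callback that returns True once the broker
    has restarted (for example, once the shared storage is reachable again).
    Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if try_restart():
            return attempt
        # Wait before the next attempt, but not after the final failure.
        if attempt < max_attempts:
            sleep(linger_seconds)
    raise RuntimeError(f"restart failed after {max_attempts} attempts")
```

Passing sleep as a parameter keeps the sketch testable without real waiting.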
> Artemis HA with shared storage strategy does not reconnect with shared
> storage if reconnection happens at shared storage
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: ARTEMIS-2808
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2808
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Affects Versions: 2.11.0
> Environment: Windows 10
> Reporter: Karan Aggarwal
> Priority: Blocker
> Attachments: Scenario_1.zip, Scenario_2.zip
>
>
> We verified the behavior of Artemis HA by bringing down the shared storage
> (VM) while a run was in progress. Here is what we observed:
> *Scenario:*
> * While the Artemis services were up and running and a run was in progress, we
> restarted the machine hosting the shared storage
> * Shared storage was back up in 5 mins
> * Both Artemis master and slave did not connect back to the shared storage
> * We tried stopping the Artemis brokers. The slave stopped, but the master
> did not stop. We had to kill the process.
> * We tried to start the Artemis brokers. The master did not start up at all.
> The slave started successfully.
> * We restarted the master Artemis server. Server started successfully and
> acquired back up.
> Shared Storage type: NFS
> Impact: The run is stopped and the Artemis servers need to be started again
> every time the shared storage connection goes down momentarily.