Apache Dev created ARTEMIS-3030:
-----------------------------------
Summary: Journal lock evaluation fails when NFS is temporarily
disconnected
Key: ARTEMIS-3030
URL: https://issues.apache.org/jira/browse/ARTEMIS-3030
Project: ActiveMQ Artemis
Issue Type: Bug
Components: Broker
Affects Versions: 2.16.0
Reporter: Apache Dev
Same scenario as ARTEMIS-2421.
If the network between the Live Broker (B1) and the NFS server is disconnected
(for example by rejecting its TCP packets with iptables), the following happens
after the lock lease timeout:
* Backup server (B2) becomes Live
* When NFS connectivity of B1 is restored, B1 remains Live
So both brokers are live.
The issue seems to be caused by {{java.nio.channels.FileLock#isValid}}, used in
{{org.apache.activemq.artemis.core.server.impl.FileLockNodeManager#isLiveLockLost}}:
it always returns true, even if the lock was lost in the meantime and taken
by B2.
Do you suggest using specific mount options for NFS?
Or should the lock evaluation be replaced with a more reliable mechanism? We
noticed that {{FileLock#isValid}} returns a cached value (true) even when
NFS connectivity is down, so it would be better to use a validation mechanism
that forces a query to the NFS server.
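
To illustrate the point, here is a minimal sketch, not Artemis code. Per the JDK documentation, a {{FileLock}} remains valid until it is released or its channel is closed, so {{isValid}} reflects local state only. The class {{LockProbe}} and the helper {{probeLock}} are hypothetical names introduced for this example. The helper adds a forced one-byte write inside the locked region so that the check involves real I/O. Whether such a write actually fails after a lost NFS lease depends on the NFS version and mount options, which has not been verified here. On a hard mount the call may block instead of failing, and a real lock file that stores state would need a dedicated probe byte.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LockProbe {

    // Hypothetical helper (not part of Artemis): returns true only if the
    // lock is locally valid AND a forced write through the channel succeeds.
    static boolean probeLock(FileChannel channel, FileLock lock) {
        if (lock == null || !lock.isValid()) {
            return false; // released locally, or the channel was closed
        }
        try {
            // Write one byte at the start of the locked region and flush it,
            // so the check performs I/O instead of reading in-memory state.
            channel.write(ByteBuffer.wrap(new byte[] {1}), lock.position());
            channel.force(true);
            return true;
        } catch (IOException e) {
            return false; // the I/O layer reported a failure
        }
    }

    public static void main(String[] args) throws IOException {
        // A local temp file stands in for the lock file on the NFS share.
        Path file = Files.createTempFile("live", ".lock");
        try (FileChannel channel = FileChannel.open(
                file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            System.out.println("isValid after tryLock: " + lock.isValid());
            System.out.println("probe after tryLock:   " + probeLock(channel, lock));
            lock.release();
            System.out.println("isValid after release: " + lock.isValid());
            System.out.println("probe after release:   " + probeLock(channel, lock));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

On a local file system both checks agree: they pass while the lock is held and fail after {{release}}. The open question for this issue is how the forced write behaves on the NFS mount once the lease has expired and B2 holds the lock.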