[ https://issues.apache.org/jira/browse/ARTEMIS-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683841#comment-16683841 ]
ASF GitHub Bot commented on ARTEMIS-2069:
-----------------------------------------

Github user TomasHofman commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2287

    No automation? Thanks, will do. :)

> Backup doesn't activate after shared store is reconnected
> ---------------------------------------------------------
>
>                 Key: ARTEMIS-2069
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2069
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>    Affects Versions: 2.6.2
>            Reporter: Tomas Hofman
>            Priority: Major
>
> *Scenario*
> # Start a live/backup server pair in a dedicated topology with shared store HA,
> with the journal located on NFS
> # The NFS mount on the backup server fails
> # Reconnect NFS on the backup server
> # Try to shut down the live EAP server
> # The backup doesn't activate
>
> *What happens*
> The backup waits for the live to fail by checking its file lock. If the
> connection to shared storage fails, the backup logs the following error:
>
> 05:50:57,896 ERROR [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=836c9b1e-f067-11e7-8763-001b21862475) AMQ224000: Failure in initialisation: java.io.IOException: Input/output error
>     at sun.nio.ch.FileDispatcherImpl.lock0(Native Method) [rt.jar:1.8.0_151]
>     at sun.nio.ch.FileDispatcherImpl.lock(FileDispatcherImpl.java:90) [rt.jar:1.8.0_151]
>     at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1115) [rt.jar:1.8.0_151]
>     at org.apache.activemq.artemis.core.server.impl.FileLockNodeManager.tryLock(FileLockNodeManager.java:299) [artemis-server-1.5.5.008-redhat-1.jar:1.5.5.008-redhat-1]
>     at org.apache.activemq.artemis.core.server.impl.FileLockNodeManager.lock(FileLockNodeManager.java:316) [artemis-server-1.5.5.008-redhat-1.jar:1.5.5.008-redhat-1]
>     at org.apache.activemq.artemis.core.server.impl.FileLockNodeManager.awaitLiveNode(FileLockNodeManager.java:127) [artemis-server-1.5.5.008-redhat-1.jar:1.5.5.008-redhat-1]
>     at org.apache.activemq.artemis.core.server.impl.SharedStoreBackupActivation.run(SharedStoreBackupActivation.java:77) [artemis-server-1.5.5.008-redhat-1.jar:1.5.5.008-redhat-1]
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$ActivationThread.run(ActiveMQServerImpl.java:2496) [artemis-server-1.5.5.008-redhat-1.jar:1.5.5.008-redhat-1]
>
> The exception is caught in {{SharedStoreBackupActivation.run}} and terminates
> the backup activation process.
> If the NFS is reconnected later, the backup server doesn't resume the
> activation process and doesn't wait for the live to fail. If the live then
> fails, the backup doesn't activate, even though it has a connection to shared
> storage.
> The backup should keep retrying the live lock check even when the storage is
> unavailable. It should log warning/error messages that the storage is
> unavailable, but it should not terminate the activation process. This would
> allow the backup to continue its duties once the storage is reconnected.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
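
The retry behaviour proposed in the issue description (keep checking the live lock, log when the storage is unavailable, never abort activation) could be sketched roughly as below. This is only an illustration under stated assumptions, not the Artemis implementation or the change made in the pull request: `RetryingLockWaiter`, `LockAttempt`, and the fixed retry delay are hypothetical names and choices introduced here, and `LockAttempt` merely stands in for a lock check such as the `FileLockNodeManager.tryLock` call seen in the stack trace.

```java
import java.io.IOException;

// Hypothetical sketch, not Artemis code: keep polling for the live lock
// instead of ending activation on the first I/O error from the shared store.
final class RetryingLockWaiter {

    /** Hypothetical stand-in for a single lock check against the shared store. */
    @FunctionalInterface
    interface LockAttempt {
        /** Returns true if the lock was acquired, false if the live still holds it. */
        boolean tryAcquire() throws IOException;
    }

    private final long retryDelayMillis;
    private int ioFailures;

    RetryingLockWaiter(long retryDelayMillis) {
        this.retryDelayMillis = retryDelayMillis;
    }

    /**
     * Blocks until the lock is acquired. I/O errors are logged and retried
     * rather than propagated. Returns false only if the thread is interrupted.
     */
    boolean awaitLock(LockAttempt attempt) {
        while (true) {
            try {
                if (attempt.tryAcquire()) {
                    return true;
                }
            } catch (IOException e) {
                // Storage is unavailable: warn and keep waiting.
                ioFailures++;
                System.err.println("WARN: shared store unavailable, will retry: " + e.getMessage());
            }
            try {
                Thread.sleep(retryDelayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    /** Number of I/O failures seen so far. */
    int ioFailures() {
        return ioFailures;
    }
}
```

A caller would pass its own lock check, for example `new RetryingLockWaiter(2000).awaitLock(() -> checkLiveLock())`, where `checkLiveLock` is again a hypothetical method. A real implementation would likely also want a shutdown flag and rate-limited logging, which this sketch leaves out.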