[ 
https://issues.apache.org/jira/browse/AMQ-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299692#comment-14299692
 ] 

Arthur Naseef commented on AMQ-5549:
------------------------------------

One thing to make clear here (it seemed unclear from the title and original bug 
description).  There's a choice that must be made here - either allow for the 
possibility of neither broker being active (NFS v3 approach) or allow for the 
possibility of both being active (NFS v4) -- for some period of time.

It will be possible to reduce the amount of time the two brokers in an H/A pair 
are simultaneously active through NFS options - if the broker can detect the 
condition.  But it's not possible to eliminate the possibility of both being 
active for some time with NFS v4.

With that said, if the original active broker does not stop once it realizes 
that it lost the lock, or it cannot reliably detect a lost lock, a fix is 
welcome.  Just be aware that even with such a fix, there's a possibility of 
corruption to the KahaDB.  Documentation of various NFS settings and their 
outcomes would also be welcome.

> Shared Filesystem Master/Slave using NFSv4 allows both brokers become active 
> at the same time
> ---------------------------------------------------------------------------------------------
>
>                 Key: AMQ-5549
>                 URL: https://issues.apache.org/jira/browse/AMQ-5549
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store
>    Affects Versions: 5.10.1
>         Environment: - CentOS Linux 6
> - OpenJDK 1.7
> - ActiveMQ 5.10.1
>            Reporter: Heikki Manninen
>            Priority: Critical
>
> Identical ActiveMQ master and slave brokers are installed on CentOS Linux 6 
> virtual machines. There is a third virtual machine (also CentOS 6) providing 
> an NFSv4 share for the brokers KahaDB.
> Both brokers are started and the master broker acquires file lock on the lock 
> file and the slave broker sits in a loop and waits for a lock as expected. 
> Also changing brokers work as expected.
> Once the network connection of the NFS server is disconnected both master and 
> slave NFS mounts block and slave broker stops logging file lock re-tries. 
> After a short while after bringing the network connection back the mounts 
> come back and the slave broker is able to acquire the lock simultaneously. 
> Both brokers accept client connections.
> In this situation it is also possible to stop and start both individual 
> brokers many times and they are always able to acquire the lock even if the 
> other one is already running. Only after stopping both brokers and starting 
> them again is the situation back to normal.
> * NFS server:
> ** CentOS Linux 6
> ** NFS v4 export options: rw,sync
> ** NFS v4 grace time 45 seconds
> ** NFS v4 lease time 10 seconds
> * NFS client:
> ** CentOS Linux 6
> ** NFS mount options: nfsvers=4,proto=tcp,hard,wsize=65536,rsize=65536
> * ActiveMQ configuration (otherwise default):
> {code:xml}
>         <persistenceAdapter>
>             <kahaDB directory="${activemq.data}/kahadb">
>               <locker>
>                 <shared-file-locker lockAcquireSleepInterval="1000"/>
>               </locker>
>             </kahaDB>
>         </persistenceAdapter>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to