On Wed, Apr 4, 2018 at 9:18 AM, gbrown <gbr...@mediaocean.com> wrote:
> We had a short outage on the network and once this came back both
> instances in our master / slave setup were up and connectable. Once this
> was discovered, when messages on queues were not browsable or able to be
> consumed, the instances were restarted after renaming the db.data file, as
> other methods to start (persistenceAdapter options) would not work.
>
> Once started the messages on the queues were gone so probably lost.
>
> We use an nfs4 mount point.
>
> ActiveMQ Version is 5.11.1
>
> so can anyone help with
>
> 1. How is it possible that both master and slave connected to the kahadb

It sure sounds like your NFS setup isn't successfully enforcing exclusive
locks on the shared store, even though it's an NFSv4 mount.
http://activemq.2283324.n4.nabble.com/Unreliable-NFS-exclusive-locks-on-unreliable-networks-td4737992.html
has some discussion of the NFS mount options that some other users are
using, but I can't say that anyone's built a consensus around "these
settings work and these other ones don't", so all you have to go on at the
moment are these reports from other users.

If you're able to tell us what settings you end up using that fix the
problem (and you should plan on doing thorough testing, given that you've
just demonstrated that your current settings appeared to work but didn't
actually), maybe we can establish enough of a consensus among the community
to consider documenting recommended values on the wiki.

> 2. Is there any way I could have recovered that would have kept the
> messages on the queues

db.data is the index, and is simply cached information derived from the
actual journal files. It can be safely deleted without data loss, because
it will simply be rebuilt from the journal files. If all you deleted was
that one file (which is what it sounds like) and you ended up not having
messages upon restart, it means they had already been deleted from the
journal files, and there wasn't anything you could have done to avoid
losing the messages.

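To make the index/journal distinction concrete, here is a rough Python
sketch (not anything from ActiveMQ itself) that sorts the files in a KahaDB
directory into journal files (*.log), the index (db.data), and everything
else, and moves only the index aside. The function names `inventory` and
`move_index_aside` and the ".bak.<timestamp>" suffix are made up for this
example, and it assumes the broker is stopped before anything is renamed.

```python
import time
from pathlib import Path


def inventory(kahadb_dir):
    """Group the file names in a KahaDB directory by role."""
    result = {"journal": [], "index": [], "other": []}
    for entry in sorted(Path(kahadb_dir).iterdir()):
        if entry.name == "db.data":
            # The index: cached data that is rebuilt from the journal.
            result["index"].append(entry.name)
        elif entry.suffix == ".log":
            # Journal files: the actual message data. Never delete these.
            result["journal"].append(entry.name)
        else:
            result["other"].append(entry.name)
    return result


def move_index_aside(kahadb_dir):
    """Rename db.data to a timestamped backup; journal files are untouched.

    Returns the backup path, or None if there was no db.data to move.
    Assumption: the broker is not running while this is called.
    """
    index = Path(kahadb_dir) / "db.data"
    if not index.exists():
        return None
    backup = index.with_name(f"db.data.bak.{int(time.time())}")
    index.rename(backup)
    return backup
```

Calling `inventory()` before and after makes it easy to confirm that the
same set of *.log files is still in place once the index has been moved.
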
If on the other hand you deleted *.log files in addition to db.data, then
you could have avoided losing your messages by not deleting those journal
files (*.log). I think from what you wrote that the message loss was
unavoidable, unless your description of which files you deleted was
incomplete.

Tim
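
On question 1, one rough way to check whether exclusive locks actually hold
across the NFS mount is to take a non-blocking lock on a scratch file from
one host and then try the same from the other. The Python sketch below does
that with POSIX locks via `fcntl.lockf`. It is only an approximation of what
the broker does: I'm assuming the broker's Java file lock ends up as the
same kind of fcntl lock on Linux, and `try_exclusive_lock` is a made-up
helper name. Use a scratch file on the mount, not the broker's real lock
file.

```python
import fcntl


def try_exclusive_lock(path):
    """Try to take a non-blocking exclusive lock on `path`.

    Returns the open file object (keep it open to keep holding the lock),
    or None if some other process already holds the lock.
    """
    handle = open(path, "a+")  # must be writable for an exclusive lock
    try:
        fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Another process (possibly on another host) holds the lock.
        handle.close()
        return None
    return handle
```

Run it on host A and keep the returned handle open, then run it on host B
against the same path on the mount. If host B also gets a handle back while
A still holds its own, locking is not exclusive with the current mount
options. Two attempts from within one process prove nothing, because POSIX
locks are per process. It is also worth repeating the check after briefly
cutting the network on host A, since that is the scenario that failed here.
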