I assume you were using replication with your master/slave/slave setup. If so, that topology isn't recommended because of the risk of split-brain, which you apparently ran into. Split-brain is a scenario where two brokers are live with the same data at the same time. It can occur with replication, which is why we recommend at least 3 master/slave pairs in a cluster: that gives the brokers a viable quorum, which mitigates split-brain. Additional configuration options are discussed in the documentation [1].
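
For illustration, here is a rough sketch of what the network-isolation options from [1] could look like in broker.xml on a replicated master. The 10.0.0.1 address is a made-up placeholder for something that is always reachable on your network (a gateway, for example), and the values are only examples, so please check the element names and defaults against the documentation for your version:

```xml
<core xmlns="urn:activemq:core">
   <!-- Assumption: 10.0.0.1 is a placeholder for an always-reachable address. -->
   <network-check-list>10.0.0.1</network-check-list>
   <!-- Example values only: milliseconds between checks, and per-check timeout. -->
   <network-check-period>10000</network-check-period>
   <network-check-timeout>1000</network-check-timeout>

   <ha-policy>
      <replication>
         <master>
            <!-- Vote on whether to remain live if the replication connection is lost. -->
            <vote-on-replication-failure>true</vote-on-replication-failure>
         </master>
      </replication>
   </ha-policy>
</core>
```

The idea is that a broker which finds itself cut off from the network, or which cannot win a quorum vote, should not keep running as live alongside another live broker. This is a fragment rather than a complete broker.xml, and it is no substitute for having enough master/slave pairs for the vote to be meaningful.
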

That said, the best mitigation against split-brain is using shared-storage as the shared-store itself mitigates against split brain. Of course, the shared-storage device can be a single point of failure so redundancy here is recommended.

> One thing I learned so far is that I MUST NOT start the live server before
> the backup if both went down previously, or I lose the data that the backup
> server might have received while the live has been down.

That's not entirely true. When a backup starts it will make a copy of its existing data on the filesystem before synchronizing with the live and receiving a new set of data. Therefore, any data you appear to have lost should be in one of the backup journals. The number of backups the broker will keep is configured by the max-saved-replicated-journals-size setting.


Justin

[1] http://activemq.apache.org/components/artemis/documentation/latest/network-isolation.html

On Wed, May 29, 2019 at 5:02 AM Bummer <jen...@centrum.cz> wrote:

> Greetings.
>
> I'm implementing an EDI solution based on Artemis: EDI payloads travel
> between various endpoints until all message transformations are done.
> Because the data in the system is very valuable, I need to be sure that
> nothing is lost in case of server crash. Also our systems are all set up
> via Ansible and Artemis servers are restarted automatically in case any
> of the configuration changes.
>
> Yet the app/server restart thing made me experience data loss and
> inconsistent cluster state. Thus I'd like know your opinions on how to
> build the cluster topology properly.
>
> Initially I thought that everything might be fine if I have a single live
> server and two backups, each instance on a separate server. This however
> ended up in having two live servers and one backup. Some of the data was on
> the first live server, some on the other one. Later that day I lost it
> completely as I've been trying to get back to the single live server
> situation.
>
> What's the right approach and topology to gain the highest reliability
> possible?
>
> One thing I learned so far is that I MUST NOT start the live server before
> the backup if both went down previously, or I lose the data that the backup
> server might have received while the live has been down.
>
> Thank you for your responses.
>
> --
> Sent from:
> http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html