Hi, we have a 3-node cluster defined, with MASTER set to SYNC and REPLICAS set to WRITE_NO_SYNC. Today we ran a stress test, sending 200k+ messages to each of 3 queues. Partway through the transmission I performed a failover of the master to another node (RCO_1_FIX_VHN). That node was in the 'waiting' state for about 20 seconds before it became the master.
Once the queues emptied, we noticed that we had lost 4 messages. Looking in the Qpid server log, I only see the following entries:

2019-05-28 13:20:56,581 WARN [Broker-Config] (o.a.q.s.v.b.BDBHAVirtualHostNodeImpl) - Transfer master did not complete within 100ms. Node may still be elected master at a later time.
...
2019-05-28 13:21:27,842 INFO [VirtualHostNode-RCO_1_FIX_VHN-Config] (o.a.q.s.v.SynchronousMessageStoreRecoverer) - Discarded 1 orphaned message(s).

There are no other errors or issues in any of the logs. I'm not sure what an orphaned message is, or whether I need to set all replicas to SYNC in addition to the master to handle this scenario. Is there anything I can look at to track down what happened to the missing messages? Thanks!
