I am unsure if this should actually be filed as a bug, at least with the scope
that I described. I’ve tested this on the good cluster. The results so far are:
When node 01 is both the rabbitmq-master and active resource manager, and the
VM is paused, everything goes down.
When node 01 is the
It sounds like you may be experiencing issue https://pulp.plan.io/issues/
3135
>From our conversation on IRC, I learned that the hypervisor is acting up
and the VMs pause from time to time. So even though the system is not under
heavy load it still behaves as though it is. As a result the
Update: We seem to have found the issue. Infrastructure told me that there is
an issue that can pause the VMs anywhere from nanoseconds to seconds, possibly
hundreds of times with only splitseconds between the pauses. Thus, if the
active manager pauses, a standby takes over. The paused manager
Hello everyone.
I have two pulp clusters, each containing three nodes, all systems are up to
date (pulp 2.14.3). However, the cluster behavior differs greatly. Let's call
the working cluster the external one, and the broken one internal.
The setup: Everything is virtualized. Both clusters are