[ClusterLabs] Antw: Re: Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres

Ulrich Windl Sun, 10 Oct 2021 23:09:46 -0700

>>> damiano giuliani <damianogiulian...@gmail.com> wrote on 08.10.2021 at 15:00
in message
<CAG=zyno0ieawqearuzh2cdmy-6kzf3dhbbubr0iiurf47bg...@mail.gmail.com>:
> Hi guys, after months of sudden, unexpected failovers, checking every
> corner and every type of log without any luck, because no logs and no reasons or


If you have no logs, you should clearly check your configuration.

...
> So it turns out that a little bit of swap was used, and I suspect the
> corosync process was swapped to disk, creating lag for which the 1s default
> corosync timeout was not enough.

BTW: Do you use thin-provisioned swap (just in case)?

> That's how it is: swap doesn't log anything, and moving a process from
> allocated RAM to swap takes longer than the 1s default timeout (probably
> much, much longer).

When swapping to/from SSD, it's hard to believe that it takes so long that the
cluster nodes would be fenced.
Also, code that is periodically referenced won't be swapped out, specifically
if you have plenty of RAM.
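
Whether any pages of a given process are actually in swap can be read from
the VmSwap field of /proc/<pid>/status on Linux. The following is only a
minimal sketch under assumptions: the process name "corosync" is taken from
the thread, and the function names and the proc_root parameter are made up
for illustration.

```python
import re
from pathlib import Path


def swapped_kb(status_text):
    """Return the VmSwap value (kB) found in /proc/<pid>/status text, or None."""
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def swap_usage_by_name(name, proc_root="/proc"):
    """Map pid -> swapped kB for every process whose comm equals `name`."""
    usage = {}
    for comm in Path(proc_root).glob("[0-9]*/comm"):
        try:
            if comm.read_text().strip() != name:
                continue
            kb = swapped_kb((comm.parent / "status").read_text())
        except OSError:
            continue  # process exited meanwhile or is not readable
        if kb is not None:
            usage[int(comm.parent.name)] = kb
    return usage


def report(name="corosync"):
    """Print one line per matching process (name is an assumption)."""
    for pid, kb in sorted(swap_usage_by_name(name).items()):
        print(f"{name} pid {pid}: {kb} kB in swap")
```

Calling report() on each node shows how many kB of the process currently sit
in swap; a value of 0 kB would argue against the process having been swapped
out at that moment.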

> I fixed it by changing the swappiness of each server to 10 (at minimum),
> to keep the corosync process from being swapped out.

Do you have proof that swap was the problem?
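
One possible way to collect such evidence, sketched here under assumptions,
is to sample the cumulative pswpin/pswpout page counters in /proc/vmstat
and compare the timestamps of any swap activity with the times of the
failovers. The function names, the sampling interval and the sample count
below are hypothetical choices, not anything from the original report.

```python
import time


def swap_counters(vmstat_text):
    """Extract the cumulative pswpin/pswpout page counts from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in/out between two samples of the counters."""
    return {key: after[key] - before[key] for key in before if key in after}


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the swap counters from the given vmstat file."""
    with open(path) as handle:
        return swap_counters(handle.read())


def watch_swap(samples=60, interval=1.0, path="/proc/vmstat"):
    """Return (timestamp, delta) for each interval that saw swap activity."""
    events = []
    previous = read_vmstat(path)
    for _ in range(samples):
        time.sleep(interval)
        current = read_vmstat(path)
        delta = swap_delta(previous, current)
        if any(delta.values()):
            events.append((time.strftime("%Y-%m-%d %H:%M:%S"), delta))
        previous = current
    return events
```

If watch_swap() reports no activity around the time a node was fenced, swap
is an unlikely cause; activity at the same moment is at least a hint, though
not proof by itself.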

...

Regards,
Ulrich



_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
