On Fri, Oct 6, 2023 at 8:46 AM Sergey Cherukhin <sergey.cheruk...@gmail.com> wrote:
> Hello!
>
> I used Microsoft Outlook to send this message and it was sent in the wrong
> format. I'm sorry. I won't do it again.
>
> I use Postgresql+Pacemaker+Corosync cluster with 2 Postgresql instances in
> synchronous replication mode. Parameter "rep_mode" is set to "sync", and
> when I shut down the replica normal way, the primary node switches to the
> async mode. But when I shut down the replica by powering it off to emulate
> power unit failure, primary remains in sync mode and clients hang on INSERT
> operations until "pcs resource cleanup" is performed. I created an alert
> agent to run "pcs resource cleanup" when any node is lost, but this
> approach doesn’t work.
>
> What should I do to be sure the primary node will switch to async mode if
> the replica becomes lost for any cause?

One idea might be to run one or more small daemons, colocated with the
PostgreSQL instance(s), that use the Pacemaker tooling to check the state of
the partner node and switch to async mode if it isn't there. You could
implement this as a small custom resource agent. Actually, it wouldn't even
be necessary to have a persistently running process; this could be done in
the monitor operation as well.

Of course, you could also enhance the monitor operation of the PostgreSQL
resource agent so that it supports this switching. As this would be quite a
generic change, it would probably be interesting for the community as well.
On the other hand, I would have considered this issue so generic that it is
hard to believe there is no ready-made, tested solution around already.

To make it more reactive (without setting the monitor interval to incredibly
low values), using an alert agent (as you already tried), but switching
directly to async mode instead of running a cleanup, might be worth trying.

Did you investigate what actually went wrong when you experimented with the
alert agent? It is interesting that the resource cleanup, which obviously
works from the command line, doesn't do the trick when run from an alert
agent. Maybe an SELinux issue ...

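As a rough, untested sketch of the alert-agent variant: Pacemaker passes
details of each event to alert agents in CRM_alert_* environment variables,
and for node alerts CRM_alert_desc holds the node's state ("member" or
"lost"). The partner node name "node2", the PARTNER_NODE variable and the use
of psql with ALTER SYSTEM are assumptions made for illustration; they are not
what the pgsql resource agent does itself. Alert agents normally run as the
hacluster user, so psql would need a working way to connect as a superuser,
and the agent would have to be restricted to the node that is currently
primary.

```python
#!/usr/bin/env python3
"""Sketch of a Pacemaker alert agent that switches PostgreSQL to async
replication when the partner node is reported lost. Illustrative only."""
import os
import subprocess

# Hypothetical: cluster name of the partner node, overridable via environment.
PARTNER_NODE = os.environ.get("PARTNER_NODE", "node2")


def should_switch_to_async(env, partner):
    """Return True if this alert reports that the partner node was lost."""
    return (
        env.get("CRM_alert_kind") == "node"
        and env.get("CRM_alert_node") == partner
        and env.get("CRM_alert_desc") == "lost"
    )


def async_commands():
    """Commands that make commits stop waiting for a synchronous standby."""
    return [
        # An empty synchronous_standby_names disables synchronous replication.
        ["psql", "-X", "-c", "ALTER SYSTEM SET synchronous_standby_names = ''"],
        # Reload so the running server picks up the changed setting.
        ["psql", "-X", "-c", "SELECT pg_reload_conf()"],
    ]


def main(env=os.environ, run=subprocess.run, partner=PARTNER_NODE):
    """Act only on 'partner node lost' alerts; ignore everything else."""
    if not should_switch_to_async(env, partner):
        return 0
    for cmd in async_commands():
        run(cmd, check=True)
    return 0


if __name__ == "__main__":
    main()
```

Note that ALTER SYSTEM writes to postgresql.auto.conf, which takes precedence
over the configuration files the resource agent manages, so the setting would
have to be undone (ALTER SYSTEM RESET synchronous_standby_names) before the
replica rejoins in sync mode.
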
Regards,
Klaus

> Best regards,
> Sergey Cherukhin
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/