[Linux-HA] How to restart cluster ?
Hello!

I have a pacemaker/corosync cluster with postgres. Generally it works fine, but I don't know how to shut this cluster down properly so that it returns to normal operation after it is started again. For now, after a restart I need to remove PGSQL.lock and run crm resource cleanup msPostgresql on the slave to make it work again.

best regards
jarek
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
[Linux-HA] How to restart cluster ?
Hello!

Thank you for the answer, but it didn't solve my problem. I have a simple two-node cluster with a virtual IP address and Postgres with streaming replication, created with this tutorial: http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster

I have two problems to solve:

1. I need a script which will restart the cluster on user demand. This script should stop the postgres resource on both nodes and then restart them in such a way that postgres works without any additional operations (like removing lock files, cleaning up resources etc).

2. I have a virtual model of this cluster running under VMWare. VMWare is restarted from time to time, and I have no control over when the master or the slave will be restarted. I would like to create a script which will be called from runlevel 6 and will safely stop the postgres resource. I tried to do it with:

crm configure property stop-all-resources=true

but after the reboot I had to remove PGSQL.lock manually, and the master node had also changed.

Do you have any idea how to do it?

Taktoshi MATSUO wrote:
> Do you use the pgsql RA with a Master/Slave setting?
> I recommend that you stop the slave node's pacemaker first, because the pgsql RA removes PGSQL.lock automatically if the node is master and there are no slaves.
>
> Stop procedure
> 1. stop slave node - suppose nodeB
> 2. stop master node (PGSQL.lock file is removed) - suppose nodeA
>
> Start procedure
> 3. start the nodeA because it has the newest data.
> 4. start the nodeB
>
> If PGSQL.lock exists, the data may be inconsistent.
> See http://www.slideshare.net/takmatsuo/2012929-pg-study-16012253 (P36, P37)

best regards
Jarek
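The stop and start order from the quoted reply could be wrapped in a small script along these lines. This is only a sketch under assumptions that do not come from the thread: the nodes are reachable over ssh, pacemaker is controlled with "service pacemaker stop|start", and MASTER and SLAVE are set by hand to the current roles (which would first have to be checked, for example in the output of crm_mon -1). DRY_RUN is a hypothetical safety switch that makes the functions only print what they would do.

```shell
#!/bin/sh
# Sketch: ordered stop/start of a two-node pgsql Master/Slave cluster.
# MASTER, SLAVE, DRY_RUN and the "service pacemaker ..." commands are
# assumptions; adjust them to your node names and init system.
MASTER="${MASTER:-nodeA}"
SLAVE="${SLAVE:-nodeB}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, do not run them

run_on() {
    # run_on NODE COMMAND...: print the command, then run it over ssh
    # unless DRY_RUN=1.
    node="$1"; shift
    echo "$node: $*"
    if [ "$DRY_RUN" != "1" ]; then
        ssh "$node" "$@"
    fi
}

stop_cluster() {
    # 1. Stop the slave first.
    # 2. Stop the master last, so the pgsql RA can remove PGSQL.lock.
    run_on "$SLAVE" service pacemaker stop &&
    run_on "$MASTER" service pacemaker stop
}

start_cluster() {
    # 3. Start the old master first, because it has the newest data.
    # 4. Start the slave afterwards.
    run_on "$MASTER" service pacemaker start &&
    run_on "$SLAVE" service pacemaker start
}
```

After sourcing the file, stop_cluster and start_cluster print the planned steps; setting DRY_RUN=0 makes them execute the commands. Each function stops at the first failing step, so the master is not stopped if stopping the slave failed.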
[Linux-HA] Virtual address for slave
Hello!

I'd like to have two virtual addresses: vip-master and vip-slave. vip-master should be bound to the master node, vip-slave should be bound to the slave node. How can I do it?

Best regards
Jarek
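One possible way to tie a second address to the slave is a colocation constraint against the Slave role of the master/slave resource. The sketch below assumes the resource is named msPostgresql, as in the earlier thread; the IP address, netmask, constraint name and file name are placeholders, not values from this thread. It only writes the configuration fragment to a file, and loading it into the cluster is left as a commented-out step.

```shell
# Write a crm configuration fragment for a virtual IP that follows the slave.
# ip, cidr_netmask, the constraint id and the file name are placeholders.
cat > vip-slave.crm <<'EOF'
primitive vip-slave ocf:heartbeat:IPaddr2 \
    params ip="192.0.2.10" cidr_netmask="24" \
    op monitor interval="10s"
colocation vip-slave-with-slave inf: vip-slave msPostgresql:Slave
EOF

# Load the fragment into the running cluster (requires crmsh):
# crm configure load update vip-slave.crm
```

With a score of inf, vip-slave is stopped whenever no node is running in the Slave role. If the address should stay reachable in that case, a finite score would make the placement a preference rather than a requirement.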
[Linux-HA] Automatic Postgres recovery
Hello!

I have a replicated postgresql cluster with pacemaker and corosync. A test instance of this cluster runs on virtual machines, and sometimes after a host restart postgres remains unsynchronized and I need to set up the slave database manually with pg_basebackup. Fortunately this problem has never happened on the production cluster, but I can imagine that a similar scenario could occur there as well in case of a power failure. Is there any way to configure the cluster so that, in case of a serious slave failure, it automatically performs the recovery process with pg_basebackup?

best regards
Jarek
Re: [Linux-HA] Antw: Automatic Postgres recovery
Hello!

On Thu, 2015-02-26 at 12:56 +0100, Ulrich Windl wrote:
> The obvious question is this: What do the logs say?

2015-02-26 08:33:25 CET LOG: database system was interrupted; last known up at 2014-12-13 23:00:06 CET
cp: cannot stat `/var/lib/postgresql/pg_archive/0002.history': No such file or directory
2015-02-26 08:33:25 CET LOG: entering standby mode
2015-02-26 08:33:26 CET LOG: restored log file 0001000400D6 from archive
2015-02-26 08:33:26 CET LOG: invalid resource manager ID in primary checkpoint record
2015-02-26 08:33:26 CET PANIC: could not locate a valid checkpoint record
2015-02-26 08:33:26 CET LOG: startup process (PID 23454) was terminated by signal 6: Aborted
2015-02-26 08:33:26 CET LOG: aborting startup due to startup process failure
2015-02-26 08:46:10 CET LOG: database system was interrupted; last known up at 2014-12-13 23:00:06 CET
cp: cannot stat `/var/lib/postgresql/pg_archive/0002.history': No such file or directory
2015-02-26 08:46:11 CET LOG: entering standby mode
2015-02-26 08:46:11 CET [local] FATAL: the database system is starting up
2015-02-26 08:46:12 CET [local] FATAL: the database system is starting up

best regards
Jarek