Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-26 Thread Angel L. Mateo
On 25/03/13 20:50, Jacek Konieczny wrote: On Mon, 25 Mar 2013 20:01:28 +0100 Angel L. Mateo ama...@um.es wrote: quorum { provider: corosync_votequorum expected_votes: 2 two_node: 1 } Corosync will then manage quorum for the two-node cluster and Pacemaker I'm
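For readers unfamiliar with the two_node setting quoted above, here is a minimal sketch of the quorum arithmetic it changes. It assumes a plain majority rule with one vote per node. The function names are hypothetical and this is not corosync code. Note that in corosync's votequorum, two_node: 1 also turns on wait_for_all by default, and it does not replace fencing.

```python
def quorum_threshold(expected_votes: int, two_node: bool = False) -> int:
    """Return the number of votes needed for quorum (illustrative model only)."""
    if two_node and expected_votes == 2:
        # two_node mode: one vote is enough, so a node can keep
        # quorum after losing its only peer.
        return 1
    # Ordinary case: strict majority of the expected votes.
    return expected_votes // 2 + 1


def has_quorum(votes_present: int, expected_votes: int, two_node: bool = False) -> bool:
    """True if the votes currently present reach the quorum threshold."""
    return votes_present >= quorum_threshold(expected_votes, two_node)


if __name__ == "__main__":
    # Without two_node, a lone survivor in a two-node cluster loses quorum.
    print(has_quorum(1, 2))
    # With two_node, the same survivor keeps quorum and must rely on fencing.
    print(has_quorum(1, 2, two_node=True))
```

Because both halves of a split two-node cluster would consider themselves quorate under this rule, working stonith remains the only protection against split brain.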

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-26 Thread emmanuel segura
Hello Dennis, this constraint is wrong: colocation c_web1_on_drbd inf: ms_drbd_web1:Master p_fs_web1 It should be: colocation c_web1_on_drbd inf: p_fs_web1 ms_drbd_web1:Master Thanks 2013/3/26 Dennis Jacobfeuerborn denni...@conversis.de I have now reduced the configuration further and removed
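The order of the two resources matters because, in a two-resource crm shell colocation, the first resource is placed relative to the second. The sketch below splits such a line into the dependent resource and the one it follows. parse_colocation is a hypothetical helper for illustration, not part of crmsh, and it only handles the simple two-resource form quoted above.

```python
def parse_colocation(line: str) -> dict:
    """Split a two-resource crm shell colocation line into its parts."""
    tokens = line.split()
    if len(tokens) != 5 or tokens[0] != "colocation":
        raise ValueError("expected: colocation <id> <score>: <rsc> <with-rsc>")
    _, constraint_id, score, dependent, target = tokens

    def split_role(spec: str):
        # "ms_drbd_web1:Master" -> ("ms_drbd_web1", "Master"); no role -> None
        name, _, role = spec.partition(":")
        return name, role or None

    dep_name, dep_role = split_role(dependent)
    tgt_name, tgt_role = split_role(target)
    return {
        "id": constraint_id,
        "score": score.rstrip(":"),
        "dependent": dep_name,        # this resource gets placed ...
        "dependent_role": dep_role,
        "target": tgt_name,           # ... wherever this one is
        "target_role": tgt_role,
    }


if __name__ == "__main__":
    fixed = parse_colocation(
        "colocation c_web1_on_drbd inf: p_fs_web1 ms_drbd_web1:Master"
    )
    print(fixed["dependent"], "follows", fixed["target"], fixed["target_role"])
```

Read this way, the corrected constraint says the filesystem follows the DRBD Master, whereas the original asked the Master role to follow the filesystem.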

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Rainer Brestan
Hi Steve, when Pacemaker does promotion, it has already selected a specific node to become master. It is far too late in this state to try to update master scores. But there is another problem with xlog in PostgreSQL. According to some discussion on PostgreSQL mailing lists, not relevant

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Steven Bambling
I'm guessing that you are referring to this RA https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/pgsql with additions by T. Matsuo. From reading the wiki (hopefully I have misinterpreted this :)) on his GitHub page it looks like this RA was written to work in a

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Steven Bambling
On Mar 26, 2013, at 6:32 AM, Rainer Brestan rainer.bres...@gmx.net wrote: Hi Steve, when Pacemaker does promotion, it has already selected a specific node to become master. It is far too late in this state to try to update master scores. But there is another

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Rainer Brestan
Hi Steve, the pgsql RA does the same: it compares the last_xlog_replay_location of all nodes for master promotion. Doing the promotion as a restart instead of a promote command, to conserve the timeline ID, is also a configurable option (restart_on_promote) of the current RA. And the RA is definitely
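To illustrate what comparing replay locations involves, here is a minimal sketch. It assumes locations are reported in PostgreSQL's usual "hi/lo" hexadecimal form, for example 0/3000060. It is not the pgsql RA's actual code, and the function and node names are hypothetical.

```python
def xlog_to_int(location: str) -> int:
    """Convert a 'hi/lo' hexadecimal xlog location into a comparable integer."""
    hi, lo = location.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def most_advanced(replay_locations: dict) -> list:
    """Return the node names whose replay location is furthest ahead."""
    best = max(xlog_to_int(loc) for loc in replay_locations.values())
    return sorted(
        node for node, loc in replay_locations.items()
        if xlog_to_int(loc) == best
    )


if __name__ == "__main__":
    # Hypothetical values, as each node might report them.
    locations = {"node1": "0/3000060", "node2": "0/3000128"}
    print(most_advanced(locations))
```

Comparing the strings directly would give wrong answers (for instance "0/A000000" sorts after "0/10000000"), which is why the locations are converted to integers first.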

Re: [Pacemaker] OCF Resource agent promote question

2013-03-26 Thread Steven Bambling
Excellent, thanks so much for the clarification. I'll drop this new RA in and see if I can get things working. STEVE On Mar 26, 2013, at 7:38 AM, Rainer Brestan rainer.bres...@gmx.net wrote: Hi Steve, pgsql RA does the same, it compares the

[Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello List, how can I change the sbd watchdog timeout without stopping the cluster? Thanks -- this is my life and I live it as long as God wills ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread Lars Marowsky-Bree
On 2013-03-26T15:56:48, emmanuel segura emi2f...@gmail.com wrote: Hello List, how can I change the sbd watchdog timeout without stopping the cluster? Very, very carefully. Stop the external/sbd resource, so that fencing blocks while you're doing this. You can then manually stop the sbd

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello Lars, so the procedure should be: crm resource stop stonith_sbd sbd -d /dev/sda1 message exit = (on every node) sbd -d /dev/sda1 -1 90 -4 180 create crm resource start stonith_sbd Thanks 2013/3/26 Lars Marowsky-Bree l...@suse.com On 2013-03-26T15:56:48, emmanuel segura

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread Lars Marowsky-Bree
On 2013-03-26T16:48:30, emmanuel segura emi2f...@gmail.com wrote: Hello Lars, so the procedure should be: crm resource stop stonith_sbd sbd -d /dev/sda1 message exit = (on every node) sbd -d /dev/sda1 -1 90 -4 180 create crm resource start stonith_sbd Yes. But I wonder why you need such

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello Lars. Because we have a VM (SUSE 11) cluster on an ESX cluster, using a clustered NetApp as datastore. Last night we had a NetApp failover: there was no problem with the other VM servers, but all VMs in the cluster with pacemaker+sbd were rebooted. This is because the watchdog time is 5 seconds

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello Lars, why do you think the long timeout is wrong? Do I need to change the stonith-timeout in Pacemaker? Thanks 2013/3/26 Lars Marowsky-Bree l...@suse.com On 2013-03-26T16:48:30, emmanuel segura emi2f...@gmail.com wrote: Hello Lars, so the procedure should be: crm resource stop
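On the stonith-timeout question, a small sanity check can make the relationship between the three timeouts explicit. In the sbd command quoted in this thread, -1 sets the watchdog timeout and -4 sets msgwait. The rules encoded below (msgwait at least twice the watchdog timeout, stonith-timeout at least msgwait plus 20%) are a commonly cited rule of thumb and are assumed here, not taken from this thread. The function name is hypothetical.

```python
def check_sbd_timeouts(watchdog: int, msgwait: int, stonith_timeout: int) -> list:
    """Return a list of rule-of-thumb violations; an empty list means none found."""
    problems = []
    # Assumed rule of thumb: msgwait should be at least twice the watchdog timeout.
    if msgwait < 2 * watchdog:
        problems.append(
            f"msgwait ({msgwait}s) is less than 2 x watchdog ({2 * watchdog}s)"
        )
    # Assumed rule of thumb: stonith-timeout >= msgwait + 20% (integer arithmetic).
    if stonith_timeout * 10 < msgwait * 12:
        problems.append(
            f"stonith-timeout ({stonith_timeout}s) is less than msgwait + 20%"
        )
    return problems


if __name__ == "__main__":
    # Values from the thread: -1 90 (watchdog) and -4 180 (msgwait),
    # checked against a stonith-timeout of 60s.
    for problem in check_sbd_timeouts(90, 180, 60):
        print(problem)
```

Under these assumptions, raising msgwait to 180 seconds would call for a stonith-timeout of roughly 216 seconds or more, so a value left at 60 seconds would need to be raised as well.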

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread Lars Marowsky-Bree
On 2013-03-26T17:13:34, emmanuel segura emi2f...@gmail.com wrote: Hello Lars. Because we have a VM (SUSE 11) cluster on an ESX cluster, using a clustered NetApp as datastore. Last night we had a NetApp failover: there was no problem with the other VM servers, but all VMs in the cluster with

Re: [Pacemaker] change sbd watchdog timeout in a running cluster

2013-03-26 Thread emmanuel segura
Hello Lars, what timeout do you recommend? Thanks a lot 2013/3/26 Lars Marowsky-Bree l...@suse.com On 2013-03-26T17:13:34, emmanuel segura emi2f...@gmail.com wrote: Hello Lars. Because we have a VM (SUSE 11) cluster on an ESX cluster, using a clustered NetApp as datastore,

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-26 Thread Dennis Jacobfeuerborn
On 26.03.2013 06:14, Vladislav Bogdanov wrote: 26.03.2013 04:23, Dennis Jacobfeuerborn wrote: I have now reduced the configuration further and removed LVM from the picture. Still the cluster fails when I set the master node to standby. What's interesting is that things get fixed when I issue a

Re: [Pacemaker] Linking lib/cib and lib/pengine to each other?

2013-03-26 Thread Andrew Beekhof
On Mon, Mar 25, 2013 at 10:55 PM, Viacheslav Dubrovskyi dub...@gmail.com wrote: 23.03.2013 08:27, Viacheslav Dubrovskyi wrote: Hi. I'm building a package for my distribution. Everything builds, but the package does not pass our internal tests. I get errors like this: verify-elf: ERROR:

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-26 Thread Andrew Beekhof
On Tue, Mar 26, 2013 at 6:30 PM, Angel L. Mateo ama...@um.es wrote: On 25/03/13 20:50, Jacek Konieczny wrote: On Mon, 25 Mar 2013 20:01:28 +0100 Angel L. Mateo ama...@um.es wrote: quorum { provider: corosync_votequorum expected_votes: 2 two_node: 1 } Corosync

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-26 Thread Vladislav Bogdanov
Dennis Jacobfeuerborn denni...@conversis.de wrote: On 26.03.2013 06:14, Vladislav Bogdanov wrote: 26.03.2013 04:23, Dennis Jacobfeuerborn wrote: I have now reduced the configuration further and removed LVM from the picture. Still the cluster fails when I set the master node to standby. What's

Re: [Pacemaker] Linking lib/cib and lib/pengine to each other?

2013-03-26 Thread Viacheslav Dubrovskyi
26.03.2013 19:41, Andrew Beekhof wrote: Hi. I'm building a package for my distribution. Everything builds, but the package does not pass our internal tests. I get errors like this: verify-elf: ERROR: ./usr/lib/libpe_status.so.4.1.0: undefined symbol: get_object_root Was this the only

[Pacemaker] Problem on creating CIB entry in CRM - shadow cannot be created

2013-03-26 Thread Donna Livingstone
We are attempting to move our RHEL 6.3 pacemaker/drbd environment to a RHEL 6.4 pacemaker environment and, as you can see below, we cannot create a shadow CIB. crm_shadow -w also dumps core. On 6.3 everything works. Versions are given below. [root@vccstest1 ~]# crm crm(live)# cib new ills

Re: [Pacemaker] Linking lib/cib and lib/pengine to each other?

2013-03-26 Thread Andrew Beekhof
Give https://github.com/beekhof/pacemaker/commit/53c9122 a try On Wed, Mar 27, 2013 at 7:43 AM, Viacheslav Dubrovskyi dub...@gmail.com wrote: 26.03.2013 19:41, Andrew Beekhof wrote: Hi. I'm building a package for my distribution. Everything builds, but the package does not pass our

Re: [Pacemaker] Problem on creating CIB entry in CRM - shadow cannot be created

2013-03-26 Thread Andrew Beekhof
On Wed, Mar 27, 2013 at 8:00 AM, Donna Livingstone donna.livingst...@shaw.ca wrote: We are attempting to move our RHEL 6.3 pacemaker/drbd environment to a RHEL 6.4 pacemaker environment and, as you can see below, we cannot create a shadow CIB. crm_shadow -w also dumps core. On 6.3 everything