Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Vladislav Bogdanov
07.03.2014 10:30, Vladislav Bogdanov wrote: 07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan 2014, at 6:20 pm, yusuke iida yusk.i...@gmail.com wrote: Hi, all I measure the

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Yusuke Iida
Hi, Andrew 2014-03-11 14:21 GMT+09:00 Andrew Beekhof and...@beekhof.net: On 11 Mar 2014, at 4:14 pm, Andrew Beekhof and...@beekhof.net wrote: [snip] If I do this however: # cp start.xml 1.xml; tools/cibadmin --replace -o configuration --xml-file replace.some -V I start to see

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-11 Thread Attila Megyeri
-Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Tuesday, March 11, 2014 12:48 AM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Pacemaker/corosync freeze On 7 Mar 2014, at 5:54 pm, Attila Megyeri amegy...@minerva-soft.com wrote:

Re: [Pacemaker] Pacemaker/corosync freeze

2014-03-11 Thread Andrew Beekhof
On 12 Mar 2014, at 1:54 am, Attila Megyeri amegy...@minerva-soft.com wrote: -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Tuesday, March 11, 2014 12:48 AM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Pacemaker/corosync freeze

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Andrew Beekhof
On 11 Mar 2014, at 6:51 pm, Yusuke Iida yusk.i...@gmail.com wrote: Hi, Andrew 2014-03-11 14:21 GMT+09:00 Andrew Beekhof and...@beekhof.net: On 11 Mar 2014, at 4:14 pm, Andrew Beekhof and...@beekhof.net wrote: [snip] If I do this however: # cp start.xml 1.xml; tools/cibadmin

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Andrew Beekhof
On 11 Mar 2014, at 6:23 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 07.03.2014 10:30, Vladislav Bogdanov wrote: 07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 18.02.2014 03:49, Andrew Beekhof wrote: On 31 Jan

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Andrew Beekhof
On 12 Mar 2014, at 8:40 am, Andrew Beekhof and...@beekhof.net wrote: On 11 Mar 2014, at 6:23 pm, Vladislav Bogdanov bub...@hoster-ok.com wrote: 07.03.2014 10:30, Vladislav Bogdanov wrote: 07.03.2014 05:43, Andrew Beekhof wrote: On 6 Mar 2014, at 10:39 pm, Vladislav Bogdanov

Re: [Pacemaker] hangs pending

2014-03-11 Thread Andrew Beekhof
Sorry for the delay, sometimes it takes a while to rebuild the necessary context On 5 Mar 2014, at 4:42 pm, Andrey Groshev gre...@yandex.ru wrote: 05.03.2014, 04:04, Andrew Beekhof and...@beekhof.net: On 25 Feb 2014, at 8:30 pm, Andrey Groshev gre...@yandex.ru wrote: 21.02.2014, 12:04,

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Andrew Beekhof
On 8 Mar 2014, at 11:31 am, Gianluca Cecchi gianluca.cec...@gmail.com wrote: I provoke power off of ovirteng01. Fencing agent works ok on ovirteng02 and reboots it. I stop boot of ovirteng01 at grub prompt to simulate problem in boot (for example system put in console mode due to filesystem

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Gianluca Cecchi
On Tue, Mar 11, 2014 at 11:52 PM, Andrew Beekhof and...@beekhof.net wrote: On 8 Mar 2014, at 11:31 am, Gianluca Cecchi gianluca.cec...@gmail.com wrote: I provoke power off of ovirteng01. Fencing agent works ok on ovirteng02 and reboots it. I stop boot of ovirteng01 at grub prompt to simulate

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Gianluca Cecchi
On Wed, Mar 12, 2014 at 12:37 AM, Andrew Beekhof and...@beekhof.net wrote: It was put in when drbd called: fence-peer /usr/lib/drbd/crm-fence-peer.sh; When and why it called that is not my area of expertise though. The constraint put by crm-fence-peer.sh was rsc_location rsc=ms_OvirtData

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Andrew Beekhof
On 12 Mar 2014, at 10:32 am, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Tue, Mar 11, 2014 at 11:52 PM, Andrew Beekhof and...@beekhof.net wrote: On 8 Mar 2014, at 11:31 am, Gianluca Cecchi gianluca.cec...@gmail.com wrote: I provoke power off of ovirteng01. Fencing agent works ok

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Yusuke Iida
Hi, Andrew 2014-03-12 6:37 GMT+09:00 Andrew Beekhof and...@beekhof.net: Mar 07 13:24:14 [2528] vm01 crmd: (te_callbacks:493 ) error: te_update_diff: Ingoring create operation for /cib 0xf91c10, configuration That's interesting... is that with the fixes mentioned above? I'm sorry. The

Re: [Pacemaker] What is the reason which the node in which failure has not occurred carries out lost?

2014-03-11 Thread Vladislav Bogdanov
12.03.2014 00:37, Andrew Beekhof wrote: ... I'm somewhat confused at this point: if crmsh is using --replace, then why is it doing diff calculations? Or are replace operations only for the load operation? It uses one of two methods depending on pacemaker version.

Re: [Pacemaker] pacemaker with cman and dbrd when primary node panics or poweroff

2014-03-11 Thread Andrew Beekhof
On 12 Mar 2014, at 10:56 am, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Wed, Mar 12, 2014 at 12:37 AM, Andrew Beekhof and...@beekhof.net wrote: It was put in when drbd called: fence-peer /usr/lib/drbd/crm-fence-peer.sh; When and why it called that is not my area of expertise