Re: [ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Ken Gaillot
On 11/04/2015 12:55 PM, Digimer wrote: > On 04/11/15 01:50 PM, Radoslaw Garbacz wrote: >> Hi, >> >> I have a cluster of 32 nodes, and after some tuning was able to have it >> started and running, > > This is not supported by RH for a reason; it's hard to get the timing > right. SUSE supports up

Re: [ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Radoslaw Garbacz
Thank you Ken and Digimer for all your suggestions. On Wed, Nov 4, 2015 at 2:32 PM, Ken Gaillot wrote: > On 11/04/2015 12:55 PM, Digimer wrote: > > On 04/11/15 01:50 PM, Radoslaw Garbacz wrote: > >> Hi, > >> > >> I have a cluster of 32 nodes, and after some tuning was able

[ClusterLabs] Antw: Re: [Question] Question about mysql RA.

2015-11-04 Thread Ulrich Windl
>>> Ken Gaillot wrote on 04.11.2015 at 16:44 in >>> message <563a27c2.5090...@redhat.com>: > On 11/04/2015 04:36 AM, renayama19661...@ybb.ne.jp wrote: [...] >> pid=`cat $OCF_RESKEY_pid 2> /dev/null ` >> /bin/kill $pid > /dev/null > > I think before this line,

[ClusterLabs] how to fence in a two node cluster. If heartbeat network is down between the two nodes, which node will fence the other node?

2015-11-04 Thread Shilu
In HA, if a node is declared dead, it needs to be fenced/stonith'ed before its services are recovered. Not doing this can lead to a split-brain. If the heartbeat network is down between the two nodes, which node will fence the other node? If two nodes were

Re: [ClusterLabs] how to fence in a two node cluster. If heartbeat network is down between the two nodes, which node will fence the other node?

2015-11-04 Thread Digimer
On 05/11/15 02:09 AM, Shilu wrote: > In HA, if a node is declared dead, it needs to be fenced/stonith'ed > before its services are recovered. Not doing this can lead to a split-brain. > > If the heartbeat network is down between the two nodes, which node will

Re: [ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Digimer
On 04/11/15 01:50 PM, Radoslaw Garbacz wrote: > Hi, > > I have a cluster of 32 nodes, and after some tuning was able to have it > started and running, This is not supported by RH for a reason; it's hard to get the timing right. SUSE supports up to 32 nodes, but they must be doing some serious

Re: [ClusterLabs] பதில்: Re: crm_mon memory leak

2015-11-04 Thread Karthikeyan Ramasamy
Hi Ken, Both CPU and this SNMP notification were interrelated, as the number of retries was infinite. As a service was down, for every retry it was trying to send a notification as well and that stalled Pacemaker. Thanks, Karthik. -Original Message- From: Karthikeyan Ramasamy Sent:

Re: [ClusterLabs] Pacemaker build error

2015-11-04 Thread Ken Gaillot
On 11/03/2015 11:10 PM, Jim Van Oosten wrote: > > > I am getting a compile error when building Pacemaker on Linux version > 2.6.32-431.el6.x86_64. > > The build commands: > > git clone git://github.com/ClusterLabs/pacemaker.git > cd pacemaker > ./autogen.sh && ./configure --prefix=/usr

[ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Radoslaw Garbacz
Hi, I have a cluster of 32 nodes, and after some tuning was able to have it started and running, but it does not recover from a node disconnect-connect failure. It regains quorum, but CIB does not recover to a synchronized state and "cibadmin -Q" times out. Is there anything with corosync or

Re: [ClusterLabs] Pacemaker build error

2015-11-04 Thread Ken Gaillot
On 11/04/2015 09:31 AM, Ken Gaillot wrote: > On 11/03/2015 11:10 PM, Jim Van Oosten wrote: >> >> >> I am getting a compile error when building Pacemaker on Linux version >> 2.6.32-431.el6.x86_64. >> >> The build commands: >> >> git clone git://github.com/ClusterLabs/pacemaker.git >> cd pacemaker

[ClusterLabs] [Question] Question about mysql RA.

2015-11-04 Thread renayama19661014
Hi All, I have contributed patches for the mysql RA several times. I did not pay much attention to it before, but the mysql RA behaves as follows. Step 1) Set up a cluster using mysql in Pacemaker. Step 2) Kill the mysql process with SIGKILL. Step 3) Stop Pacemaker before a monitor error occurs and stop mysql