On Fri, Nov 14, 2014 at 4:33 PM, Dmitry Matveichev <d.matveic...@mfisoft.ru> wrote:
> We've already tried to set it but it didn't help.
>
I doubt it is possible to say anything without logs.

> ------------------------
> Kind regards,
> Dmitriy Matveichev.
>
>
> -----Original Message-----
> From: Andrei Borzenkov [mailto:arvidj...@gmail.com]
> Sent: Friday, November 14, 2014 4:12 PM
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Long failover
>
> On Fri, Nov 14, 2014 at 2:57 PM, Dmitry Matveichev <d.matveic...@mfisoft.ru>
> wrote:
>> Hello,
>>
>> We have a cluster configured via pacemaker+corosync+crm. The
>> configuration is:
>>
>> node master
>> node slave
>> primitive HA-VIP1 IPaddr2 \
>>     params ip=192.168.22.71 nic=bond0 \
>>     op monitor interval=1s
>> primitive HA-variator lsb:variator \
>>     op monitor interval=1s \
>>     meta migration-threshold=1 failure-timeout=1s
>> group HA-Group HA-VIP1 HA-variator
>> property cib-bootstrap-options: \
>>     dc-version=1.1.10-14.el6-368c726 \
>>     cluster-infrastructure="classic openais (with plugin)" \
>>     expected-quorum-votes=2 \
>>     stonith-enabled=false \
>>     no-quorum-policy=ignore \
>>     last-lrm-refresh=1383871087
>> rsc_defaults rsc-options: \
>>     resource-stickiness=100
>>
>> First I take the variator service down on the master node (I actually
>> delete the service binary and kill the variator process, so that
>> variator fails to restart). The resources move to the slave node very
>> quickly, as expected. Then I restore the binary on the master and
>> restart the variator service. Next I do the same thing with the binary
>> and the service on the slave node.
>> The crm status command quickly shows HA-variator (lsb:variator):
>> Stopped. But it takes too long (for us) before the resources are
>> switched back to the master node (around 1 min).
>> Then the line
>>
>> Failed actions:
>>
>> HA-variator_monitor_1000 on slave 'unknown error' (1): call=-1,
>> status=Timed Out, last-rc-change='Sat Dec 21 03:59:45 2013',
>> queued=0ms, exec=0ms
>>
>> appears in the crm status output and the resources are switched.
>>
>> What is that timeout? Where can I change it?
>>
>
> This is the operation timeout. You can change it in the operation definition:
> op monitor interval=1s timeout=5s
>
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
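
For reference, the change suggested in the thread, applied to the HA-variator primitive from the original post, would look roughly like the sketch below. The timeout=5s value is the one from Andrei's reply and everything else is copied from the posted configuration. Dmitry reports that setting it did not shorten the failover, so this only illustrates where the operation timeout is defined and is not a confirmed fix.

```crmsh
# Sketch only: the monitor operation with an explicit timeout.
# timeout=5s is the value proposed in the reply, not a tested recommendation.
primitive HA-variator lsb:variator \
    op monitor interval=1s timeout=5s \
    meta migration-threshold=1 failure-timeout=1s
```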