On Thu, Nov 04, 2010 at 02:54:59PM -0300, mike wrote:
> On 10-11-04 12:38 PM, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Thu, Nov 04, 2010 at 11:06:48AM -0300, mike wrote:
> >
> >> Looking for a more experienced person who can explain this issue we had
> >> last night.
> >>
> >> Our backups kicked in during the night at 1AM. At 1:01AM, our mysql
> >> cluster had issues. Specifically I can see in crm_mon where the cluster
> >> has it as failed due to an "unknown exec error". Looking at the
> >> performance of the node, I can see where wait on I/O went through the
> >> roof at 1AM when the tsm backups kicked in. I can see where this caused
> >> heartbeat issues because mysql was late checking its instances - it
> >> generally takes a few seconds but in this case it took 3 minutes. Of
> >> course this is all due to the extremely high wait on I/O but I am
> >> curious - why didn't the cluster fail over? Why put MySQL in an
> >> unmanaged state and simply say there was an "unknown exec error"?
> >>
> > Can't say without looking at the logs and the PE files. One
> > possible explanation is that a resource was for whatever reason
> > not allowed to run on the other node: a failure in the past
> > which didn't expire or a negative location constraint. Or the
> > fail count reached migration threshold (if defined).
> >
> > Thanks,
> >
> > Dejan
> >
> >> Thanks for any comments
> >
> Thanks for the reply Dejan. I have the failcount threshold set to 3 on
> both nodes and if I understand it correctly, after a 3rd failure it
> should fail over to the backup node. Correct?

Yes.

> What do you mean by a
> negative location constraint?

A location constraint with a negative score. For instance, such
a constraint is inserted by the "crm resource move" command.

Thanks,

Dejan

> Mike
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
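
To illustrate the two conditions mentioned in this thread (a negative location
constraint, and a fail count that has reached the migration threshold), here is
a toy Python model of which nodes remain eligible to run a resource. It is only
a sketch under simplifying assumptions, not the policy engine's actual
algorithm: the NodeState class, the eligible_nodes function, and the node names
are all hypothetical, and it ignores everything else the cluster takes into
account (stickiness, colocation, failure expiry, and so on).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeState:
    """Hypothetical per-node view of one resource (not a Pacemaker type)."""
    name: str
    location_score: float = 0.0  # summed location-constraint score for the resource
    fail_count: int = 0          # failures of the resource recorded on this node


def eligible_nodes(nodes: List[NodeState],
                   migration_threshold: Optional[int] = None) -> List[str]:
    """Return the names of nodes where the resource may still run (toy model)."""
    allowed = []
    for node in nodes:
        # Assumption: a negative total location score excludes the node.
        if node.location_score < 0:
            continue
        # Assumption: reaching the threshold (if one is defined) excludes the node.
        if migration_threshold is not None and node.fail_count >= migration_threshold:
            continue
        allowed.append(node.name)
    return allowed


# node1 has failed three times; node2 carries a leftover negative constraint.
cluster = [
    NodeState("node1", fail_count=3),
    NodeState("node2", location_score=float("-inf")),
]
print(eligible_nodes(cluster, migration_threshold=3))  # prints []
```

In this model the resource has nowhere left to run, which is one way a failure
on the active node can end without a failover to the other node.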
