On Thu, Nov 04, 2010 at 02:54:59PM -0300, mike wrote:
> On 10-11-04 12:38 PM, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Thu, Nov 04, 2010 at 11:06:48AM -0300, mike wrote:
> >    
> >> Looking for a more experienced person who can explain this issue we had
> >> last night.
> >>
> >> Our backups kicked in during the night at 1AM. At 1:01AM, our mysql
> >> cluster had issues. Specifically I can see in crm_mon where the cluster
> >> has it as failed due to an "unknown exec error". Looking at the
> >> performance of the node, I can see where wait on I/O went through the
> >> roof at 1AM when the tsm backups kicked in. I can see where this caused
> >> heartbeat issues because mysql was late checking its instances - it
> >> generally takes a few seconds but in this case it took 3 minutes. Of
> >> course this is all due to the extremely high wait on I/O but I am
> >> curious - why didn't the cluster fail over? Why put MySQL in an
> >> unmanaged state and simply say there was an "unknown exec error"?
> >>      
> > Can't say without looking at the logs and the PE files. One
> > possible explanation is that a resource was for whatever reason
> > not allowed to run on the other node: a failure in the past
> > which didn't expire or a negative location constraint. Or the
> > fail count reached migration threshold (if defined).
> >
> > Thanks,
> >
> > Dejan
> >
> >
> >    
> >> Thanks for any comments
> >
> >    
> Thanks for the reply Dejan. I have the failcount threshold set to 3 on 
> both nodes and if I understand it correctly, after a 3rd failure it 
> should fail over to the backup node. Correct?

Yes.
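
The threshold rule can be sketched roughly as below. This is a simplified
model for illustration only, not the policy engine's actual code: the
function name and node names are made up, and treating a threshold of 0
as "no threshold" is an assumption of the sketch.

```python
def banned_nodes(fail_counts, migration_threshold):
    """Return the set of nodes a resource may no longer run on.

    Simplified model: a node is excluded once the resource's fail count
    on that node reaches the migration threshold.
    """
    # Assumption of this sketch: 0 (or less) means "no threshold set".
    if migration_threshold <= 0:
        return set()
    return {
        node
        for node, count in fail_counts.items()
        if count >= migration_threshold
    }


# Hypothetical two-node cluster with a threshold of 3.
fail_counts = {"node1": 3, "node2": 0}
banned = banned_nodes(fail_counts, migration_threshold=3)
```

In this model, if the fail count reaches the threshold on both nodes, the
resource has nowhere left to run until the fail counts are cleared or
expire.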

> What do you mean by a 
> negative location constraint?

A location constraint with a negative score. For instance, such a
constraint is inserted by the "crm resource move" command.
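
To show why a negative score matters, here is a minimal sketch of how
location scores could be combined per node. It is a simplified model, not
the real placement code: the node names and helper functions are
hypothetical, and the only rules it encodes are that scores are capped at
+/-1000000 (INFINITY), that -INFINITY wins over everything else, and that
a resource cannot run on a node whose total score is negative.

```python
INFINITY = 1_000_000  # scores are capped at +/-1000000


def add_scores(a, b):
    """Combine two scores; -INFINITY beats everything, then +INFINITY."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def eligible_nodes(nodes, location_scores):
    """Return the nodes whose combined location score is not negative.

    location_scores is a list of (node, score) pairs, one per constraint.
    """
    totals = {node: 0 for node in nodes}
    for node, score in location_scores:
        totals[node] = add_scores(totals[node], score)
    return [node for node in nodes if totals[node] >= 0]


# Hypothetical example: a leftover -INFINITY constraint on node2 means the
# resource cannot fail over there, however much node2 is otherwise preferred.
constraints = [("node1", 100), ("node2", 200), ("node2", -INFINITY)]
candidates = eligible_nodes(["node1", "node2"], constraints)
```

If a leftover constraint from an earlier move turns out to be the cause,
"crm resource unmove" on that resource should remove it again.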

Thanks,

Dejan

> Mike
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems