Re: [Linux-HA] failed actions of heartbeat..

Erik Dobák Wed, 19 Jan 2011 04:37:06 -0800

Thank you. I did not ask for exact causes, just for hints, as I am new to
Heartbeat.


ad 1) Yes, I know.
ad 2) I will not argue with you about this one.
ad your hints) I am the only admin of those machines and no one else has
access. I now suspect that a virtual machine backup is screwing things up. I
will disable it and see what happens.

cheers

E

PS: There is a worldwide shortage of crystal balls; somebody should do
something about this ;)


On Wed, Jan 19, 2011 at 13:27, Andrew Beekhof <[email protected]> wrote:

> On Wed, Jan 19, 2011 at 12:14 PM, Erik Dobák <[email protected]> wrote:
> > Yes, but why did it time out?
>
> you're asking me why your unique instance of jboss, the one none of us
> have ever seen, took too long to shut down?
>
> > The monitor checks the jboss index page and I could access it manually
> > without problems.
>
> 1) monitor != stop
> 2) results now have no bearing on past or future results
>
> Maybe someone ran a fork-bomb at the time, or initiated a backup, or...
>
> Sorry, we don't have crystal balls.
>
> >
> > I tried crm resource cleanup and reprobe, but with no success.
> >
> > I have now restarted both nodes and it is running fine; let's see what
> > happens.
> >
> > E
> >
> >
> > On Wed, Jan 19, 2011 at 12:02, Andrew Beekhof <[email protected]> wrote:
> >
> >> On Wed, Jan 19, 2011 at 11:02 AM, Erik Dobák <[email protected]> wrote:
> >> > I have a cluster running. On one node the resources are active; on the
> >> > other they are passive.
> >> > When I checked the status, I got only STARTED for both resources.
> >> >
> >> > But overnight something seems to have gone wrong, see below.
> >> > The strange thing is that both resources, the ipaddr2 and jboss, are
> >> > running correctly, but Heartbeat does not think so.
> >> >
> >> > any idea why?
> >>
> >> Because the monitor and stop operations "Timed Out" perhaps?
> >>
> >> >
> >> >
> >> > [root@lc-cl1 ~]# crm_mon -1
> >> > ============
> >> > Last updated: Wed Jan 19 10:37:35 2011
> >> > Stack: Heartbeat
> >> > Current DC: lc-cl2 (ecde9589-6940-49a5-a45f-a79574dfde33) - partition with quorum
> >> > Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
> >> > 2 Nodes configured, unknown expected votes
> >> > 1 Resources configured.
> >> > ============
> >> >
> >> > Online: [ lc-cl2 ]
> >> > OFFLINE: [ lc-cl1 ]
> >> >
> >> >  Resource Group: bamcluster
> >> >     ipaddr2    (ocf::heartbeat:IPaddr2):       Started lc-cl2 FAILED
> >> >     lcbam      (ocf::heartbeat:jboss): Started lc-cl2 (unmanaged) FAILED
> >> >
> >> > Failed actions:
> >> >    lcbam_stop_0 (node=lc-cl2, call=8, rc=-2, status=Timed Out): unknown exec error
> >> >    ipaddr2_monitor_10000 (node=lc-cl2, call=11, rc=-2, status=Timed Out): unknown exec error
> >> > _______________________________________________
> >> > Linux-HA mailing list
> >> > [email protected]
> >> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >> > See also: http://linux-ha.org/ReportingProblems
> >> >
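
As a rough aid for spotting which operations timed out, the "Failed actions"
entries in the crm_mon -1 output quoted above can be picked apart with a short
script. This is only a sketch in Python: the function names and the regular
expression are assumptions based on the two entries shown in this thread, not
part of Heartbeat or Pacemaker, and lines that a mail client has wrapped or
quoted need to be rejoined first.

```python
import re

# One entry under "Failed actions:", for example:
#   lcbam_stop_0 (node=lc-cl2, call=8, rc=-2, status=Timed Out): unknown exec error
# The pattern is inferred from the two entries in this thread only.
FAILED_ACTION = re.compile(
    r"(?P<resource>\S+)_(?P<op>[a-z]+)_(?P<interval>\d+)\s+"
    r"\(node=(?P<node>[^,]+), call=(?P<call>\d+), rc=(?P<rc>-?\d+), "
    r"status=(?P<status>[^)]+)\):\s*(?P<reason>.*)"
)


def parse_failed_actions(text):
    """Return one dict per failed-action entry found in the text."""
    actions = []
    for match in FAILED_ACTION.finditer(text):
        entry = match.groupdict()
        # Numeric fields are easier to filter on as integers.
        for key in ("interval", "call", "rc"):
            entry[key] = int(entry[key])
        entry["reason"] = entry["reason"].strip()
        actions.append(entry)
    return actions


def timed_out(actions):
    """List (resource, operation) pairs whose status is 'Timed Out'."""
    return [(a["resource"], a["op"]) for a in actions if a["status"] == "Timed Out"]


if __name__ == "__main__":
    sample = (
        "Failed actions:\n"
        "   lcbam_stop_0 (node=lc-cl2, call=8, rc=-2, status=Timed Out): unknown exec error\n"
        "   ipaddr2_monitor_10000 (node=lc-cl2, call=11, rc=-2, status=Timed Out): unknown exec error\n"
    )
    for resource, op in timed_out(parse_failed_actions(sample)):
        print(f"{resource}: {op} timed out")
```

Separating the operation name from the resource makes the point from the reply
visible at a glance: the jboss resource failed on stop, while the monitor
failure belongs to ipaddr2.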
