On 7/18/07, Taldevkar, Chetan <[EMAIL PROTECTED]> wrote:
> Date: Wed, 18 Jul 2007 09:31:47 +0200
> From: "matilda matilda" <[EMAIL PROTECTED]>
> Subject: Re: [Linux-HA] WARN: unpack_rsc_op:
> To: <[email protected]>
>
> >>> "Taldevkar, Chetan" <[EMAIL PROTECTED]> 18.07.2007 07:23
> >>>
> >Hi all,
> >
> >When I start the cluster, Linux-HA invokes the start operation on
> >both nodes. On the first node the start fails, because the script
> >echoes "stopped" and exits with 1, since this resource needs to be
> >running on the second node. After that it starts successfully on the
> >second node. But if I simulate an error condition on the second node,
> >Linux-HA does not invoke the script on the first node, resulting in
> >no failover.
> >
>
> As far as I know: once a resource fails to start on a node, it will
> not be started there again. You have to use crm_resource -C
> (--force) to clear the record of that failed start.
> So, if you want the resource to run initially on node 2, add a
> location constraint for the resource and it will be started on the
> second node. After a failure of that resource, heartbeat will start
> it on the first node if and only if the values for resource
> stickiness and failure stickiness are set properly.
>
> Best regards
> Andreas Mock
>
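Andreas's suggestion above could be sketched roughly as below. This assumes a heartbeat 2.x-style CIB; the constraint id, score, and node name "node2" are hypothetical placeholders, and the exact attribute names may differ between releases:

```xml
<!-- Hypothetical location constraint preferring res_ttsvc on node2.
     "node2" and the score of 100 are placeholders, not values from
     the original thread. -->
<rsc_location id="loc_res_ttsvc_node2" rsc="res_ttsvc">
  <rule id="loc_res_ttsvc_node2_rule" score="100">
    <expression id="loc_res_ttsvc_node2_expr"
                attribute="#uname" operation="eq" value="node2"/>
  </rule>
</rsc_location>
```

The stickiness values Andreas mentions would then be set as cluster-wide defaults (e.g. default_resource_stickiness and default_resource_failure_stickiness in the crm_config section), so that a failed resource is actually allowed to move to the other node rather than being pinned in place.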
> <Chetan>
> Thanks Andreas,
>
> I executed crm_resource -C -r res_ttsvc -H <failed-start node>. After
> that I executed crm_verify -VL, but it returned the same list of
> warnings for the failed-start node. Is this expected, or do I have to
> do something more?
>
> </Chetan>
You should only run that command once you've repaired whatever the
problem was that caused the resource to fail in the first place;
otherwise it will just fail again.
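The sequence Andrew describes might look like the following sketch. The node name "nodeA" is a placeholder (the thread never names the nodes); the resource name res_ttsvc is taken from the thread:

```shell
# 1. First, fix the underlying problem on the node where the start failed.
# 2. Then clear the recorded start failure for that node:
crm_resource -C -r res_ttsvc -H nodeA
# 3. Re-check the live CIB for remaining warnings:
crm_verify -VL
# 4. Take a one-shot look at the current resource state:
crm_mon -1
```

If the warnings persist after the cleanup, that usually means the failure record was not actually cleared (wrong resource or node name) or the resource failed again immediately.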
<Chetan>
Thanks Andrew,
The start on one of the nodes fails initially, but once the resource
starts successfully on the other node, a start operation on the
previously failed node would succeed if Linux-HA triggered it this
time. The steps are:
1. Start on Node A fails, start on Node B succeeds, status monitoring
runs.
2. Run crm_resource -C for Node A (I assume it does not matter on which
node the command is executed) to clear the start failure on Node A.
3. On Node B, manually simulate a resource failure, so that the status
part of the script returns an error. Linux-HA does not trigger a start
operation on Node A.
Am I missing anything here? Why is crm_resource not clearing the start
failure on Node A?
Not if you don't supply any logs :-)
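To follow up on Andrew's point, the relevant messages would come from the CRM daemons. A rough sketch of what to pull before posting, assuming heartbeat's default logging (the log paths are assumptions and depend on your ha.cf / syslog setup):

```shell
# If heartbeat writes its own log file (logfile directive in ha.cf):
grep -E 'crmd|pengine|lrmd|tengine' /var/log/ha-log | tail -n 200

# If logging goes through syslog instead (logfacility directive):
grep -i heartbeat /var/log/messages | tail -n 200
```

The pengine lines in particular show why the policy engine decided not to start the resource on Node A (e.g. a remaining fail-count or a start-failure record that was not cleared).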
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems