On 10 October 2010 17:40, Andrew Beekhof <and...@beekhof.net> wrote:
> On Sun, Oct 10, 2010 at 12:47 AM, Pavlos Parissis
> <pavlos.paris...@gmail.com> wrote:
>> Hi,
>>
>> My resource is not started because I get this
>>
>> 00:44:27 crmd: [3141]: WARN: status_from_rc: Action 16
>> (pbx_02_monitor_0) on node-02 failed (target: 7 vs. rc: 5): Error
>>
>> but when I run the status action manually I get 3, which is OK
>> because the application is stopped
>>
>> [r...@node-02 ~]# /etc/init.d/znd-pbx_02 status
>> pbx_02 is stopped
>> [r...@node-02 ~]# echo $?
>> 3
>>
>> Why does crm get an error in this case?
>
> I imagine because when pacemaker ran it, the script didn't return 3.
>
Pacemaker got 5 because the script returns 5 when the application is
not available on the system, which happens only when the fs is not
active. What actually happened in this particular case is that the
start actions on the fs and on the resource that holds the application
were run in the same second. I am pretty sure that the start of the
application resource went too fast, so at the time the LSB script was
executed the fs was not yet available, even though the fs resource
had returned 0 on start and on the first monitor.
This issue doesn't always happen, but if I put a sleep in the LSB
script for the application resource I don't run into it.
The resources are in a group, ordered ip, fs, app.
I also removed exit code 5 from the LSB script, because it confuses
the cluster when the monitor action takes place on the slave node.
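
For illustration, here is a minimal sketch of a status action that no longer returns 5, plus a bounded wait that could stand in for the fixed sleep. It is not the actual znd-pbx_02 script: APP_BIN, PID_FILE, WAIT_SECS and both function names are hypothetical, and the paths are assumptions. The idea is that a missing binary (for example because the fs is not mounted on this node) is reported as LSB status 3, "not running", which Pacemaker treats as a stopped resource.

```shell
#!/bin/sh
# Sketch only: every name and path below is an assumption, not taken
# from the real init script.
APP_BIN="${APP_BIN:-/opt/pbx_02/bin/pbx}"     # assumed to live on the cluster fs
PID_FILE="${PID_FILE:-/var/run/pbx_02.pid}"   # assumed pid file location
WAIT_SECS="${WAIT_SECS:-5}"                   # upper bound for the wait, in seconds

# Wait up to WAIT_SECS seconds for the binary to become visible.
# Returns 0 as soon as it is executable, non-zero if it never appears.
wait_for_app() {
    i=0
    while [ "$i" -lt "$WAIT_SECS" ]; do
        [ -x "$APP_BIN" ] && return 0
        sleep 1
        i=$((i + 1))
    done
    [ -x "$APP_BIN" ]
}

# LSB status action: 0 = running, 3 = not running.
# A missing binary is reported as 3 instead of 5, so a probe on a node
# where the fs is not active sees a cleanly stopped resource.
pbx_status() {
    if [ ! -x "$APP_BIN" ]; then
        echo "pbx_02 is stopped"
        return 3
    fi
    if [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null; then
        echo "pbx_02 is running"
        return 0
    fi
    echo "pbx_02 is stopped"
    return 3
}
```

In such a script the start action could call wait_for_app before launching the application, and the status branch of the case statement would call pbx_status and exit with its return code.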

Cheers,
Pavlos
