Dear all,
I had a problem with my resource res1, but I can't work out what
caused it.
The output of crm_mon is:
Node: node2 (237ceb38-a061-d99d-f4bf-944dd057ab5d): online
Node: node1 (965e45c6-19c4-241e-ff9d-4904882ef868): standby
resource_res1 (ocf::heartbeat:res1): Started node2 FAILED
RESOURCE2 (ocf::heartbeat:Resource): Started node2
Failed actions:
resource_res1_monitor_30000 (node=node2, call=35, rc=-2): Timed Out
The definition of res1 is:
<primitive id="resource_res1" class="ocf" type="res1" provider="heartbeat">
  <operations>
    <op id="34" name="monitor" interval="30s" timeout="90s" start_delay="0s" on_fail="restart"/>
    <op id="35" name="start" timeout="30s"/>
    <op id="36" name="stop" timeout="30s"/>
  </operations>
  <instance_attributes id="resource_res1_instance_attrs">
    <attributes>
      <nvpair name="target_role" id="resource_res1_target_role" value="started"/>
    </attributes>
  </instance_attributes>
  <meta_attributes id="resource_res1_meta">
    <attributes>
      <nvpair name="resource_stickiness" id="resource_res1_Rs" value="150"/>
      <nvpair name="resource_failure_stickiness" id="resource_res1_FRs" value="-100"/>
    </attributes>
  </meta_attributes>
</primitive>
Is this caused by a mistake in the monitor method of res1?
To reproduce the situation, I added sleep (100) to the monitor method.
After that, crm_mon showed:
Node: node2(237ceb38-a061-d99d-f4bf-944dd057ab5d): online
Node: node1 (965e45c6-19c4-241e-ff9d-4904882ef868): OFFLINE
RESOURCE2 (ocf::heartbeat:Resource): Started node2
Failed actions:
resource_res1_monitor_0 (node=node2, call=24, rc=-2): Timed Out
monitor_0 and monitor_30000 are not the same monitor operation, right?
What is monitor_30000?
What should I do to find out where my mistake is?
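To make the question concrete, here is a small Python sketch of how I currently read these action names. It rests on a guess of mine that I have not confirmed: that each name has the form <resource>_<operation>_<interval in milliseconds>. Under that reading, monitor_30000 would be the recurring monitor with interval="30s" from the definition above, and monitor_0 would be a monitor run with no interval. parse_action_key is a hypothetical helper written only for this mail, not part of Heartbeat.

```python
def parse_action_key(key):
    """Split an action name into (resource, operation, interval_ms).

    Assumed format (my guess): <resource>_<operation>_<interval_ms>.
    """
    # The resource id may itself contain underscores, so split from the right.
    resource, operation, interval_ms = key.rsplit("_", 2)
    return resource, operation, int(interval_ms)


# The two failed actions reported by crm_mon above.
for key in ("resource_res1_monitor_30000", "resource_res1_monitor_0"):
    resource, operation, interval_ms = parse_action_key(key)
    print(f"{key}: resource={resource} op={operation} interval_ms={interval_ms}")
```

If this reading is wrong, please correct me.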
--
Best regards,
Ivan Gromov.