> It is also advisable that it accurately report
> the service's true state after a start operation
> and mandatory for a stop.
> This is easily done by calling a level 0 check
> in a loop at the end of both functions.

Sure, but since HB's business is calling RAs to perform
all those STARTs, STOPs and MONITORs, I would have
thought, *** for the sake of consistency, *** that it was
HB's job to call those level 0 checks after any operation
that changes a resource's state.
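
To check that I understand the suggestion, here is a rough sketch of
what "calling a level 0 check in a loop" at the end of start and stop
could look like. This is only my reading of it, not how any existing RA
is written: the names wait_for_state, do_start, do_stop and check are
hypothetical, as are the attempts and delay defaults. The return codes
are the ones from the OCF RA API quoted further down.

```python
import time

OCF_SUCCESS = 0       # action succeeded / resource running
OCF_ERR_GENERIC = 1   # generic or unspecified error
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def wait_for_state(check, wanted, attempts=10, delay=1.0, sleep=time.sleep):
    """Poll a level 0 check until it reports `wanted` or attempts run out.

    `check` is a hypothetical callable returning a monitor-style status.
    """
    for attempt in range(attempts):
        if check() == wanted:
            return OCF_SUCCESS
        if attempt < attempts - 1:
            sleep(delay)  # give the service time to settle
    return OCF_ERR_GENERIC


def start(do_start, check, **kw):
    """Start the service, then report success only once it checks as running."""
    do_start()
    return wait_for_state(check, OCF_SUCCESS, **kw)


def stop(do_stop, check, **kw):
    """Stop the service, then report success only once it checks as stopped."""
    do_stop()
    return wait_for_state(check, OCF_NOT_RUNNING, **kw)
```

Note that a successful stop returns 0 here, not 7, which matches the
note in the spec text below.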

> Calling more intensive checks is up to the RA
> writer and in the case of stops, depends on the
> chances of the level 0 check being incorrect.

Precisely my own thought. 

My understanding is that MONITOR may return only the
following statuses (described in OCF RA API section
3.6.1):

0: no error, action succeeded completely

1: generic or unspecified error (current practice)
The "monitor" operation shall return this for a
crashed, hung or otherwise non-functional resource.

7: program is not running
Note: This is not the error code to be returned by a
successful "stop" operation. A successful "stop"
operation shall return 0. The "monitor" action shall
return this value only for a _cleanly_ stopped
resource. If in doubt, it should return 1.

So MONITOR returns:
0: resource is running healthy
7: resource is stopped
1: resource is broken
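
Put as code, my reading of that three-way mapping would be something
like the sketch below. The three inputs (process_alive, answers_probe,
leftover_state) are hypothetical stand-ins for whatever a real RA would
actually test; only the return values 0, 1 and 7 come from the spec.

```python
OCF_SUCCESS = 0       # resource is running and healthy
OCF_ERR_GENERIC = 1   # resource is broken (crashed, hung, non-functional)
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def monitor_status(process_alive, answers_probe, leftover_state):
    """Map hypothetical health observations onto a MONITOR return code."""
    if process_alive and answers_probe:
        return OCF_SUCCESS
    if not process_alive and not leftover_state:
        # Only a clean stop may report 7.
        return OCF_NOT_RUNNING
    # Anything in between is doubtful, and "if in doubt, return 1".
    return OCF_ERR_GENERIC
```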

My confusion lies in what HB does when it receives
status=1 after calling an RA MONITOR operation, and in
how I trigger a RECOVER.

Optionally, what does HB do when it receives
status=2 after calling an RA MONITOR operation?

> These are not excessive requirements. 

The issue I had was that I hadn't seen these requirements
documented in the kind of detail you provide in
your reply.

Thank you very, very much for your explanations.



      
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
