>> It suggests that the RA implementation of the 'START'
>> and 'STOP' operations should include the code to
>> perform, before exiting, ALL the tests that are
>> carried out at ALL implemented check-levels of
>> monitoring, so as to reliably return a resource's
>> status,
> It just relies on the RA properly starting or
> stopping the resource. It's up to the RA to do
> its job right.
Sure, but in order to determine the status of a
resource and declare it "running-healthy", which is
the expected outcome of a START operation, doesn't the
RA have to perform, *as part of* the START operation,
the best status verification it can do? That is,
exactly what it does under the most detailed
monitoring it can perform, namely a
"MONITOR check-level 20"?
And if the best status verification (MONITOR) *always*
has to be performed, what then is the point of having
*less* thorough status verifications, regardless of
how long any of these verifications take to complete?
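
To make the question concrete, here is a minimal sketch of a START
that only reports success once the deepest monitor check passes.
It is an illustration under assumptions, not how any shipped RA
works: FakeResource, monitor() and start() are hypothetical names,
and the "deep check" is simulated by a counter. Only the OCF return
codes and the idea of check levels come from the OCF RA API.

```python
import time

# OCF return codes as defined by the OCF RA API.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


class FakeResource:
    """Hypothetical stand-in for a real service (assumption for this sketch)."""

    def __init__(self, ready_after=2):
        self.launched = False
        self.polls = 0                  # number of deep checks performed so far
        self.ready_after = ready_after  # deep checks needed before "healthy"

    def launch(self):
        self.launched = True


def monitor(res, check_level=0):
    """Return an OCF code; only check level 20 runs the simulated deep check."""
    if not res.launched:
        return OCF_NOT_RUNNING
    if check_level >= 20:
        res.polls += 1
        if res.polls < res.ready_after:
            return OCF_ERR_GENERIC  # process exists but is not yet healthy
    return OCF_SUCCESS


def start(res, deepest_level=20, timeout=5.0, interval=0.01):
    """Launch the resource, then poll the deepest monitor until it passes."""
    res.launch()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if monitor(res, deepest_level) == OCF_SUCCESS:
            return OCF_SUCCESS
        time.sleep(interval)
    return OCF_ERR_GENERIC
```

In this sketch a check-level 0 monitor succeeds as soon as the
resource has been launched, while check-level 20 can still fail,
which is exactly the gap between "started" and "running-healthy"
that I am asking about.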
>> This means that repairable states of a resource
>> should be tested for and repaired at *all*
>> check-levels instead of expecting Heartbeat to
>> gradually shift from check-level 0 to
>> check-level 20.
> Er, what's a "repairable state"?
> And who's to repair a resource?
A repairable state is one where "it is advantageous to
use" a "recover" function, provided and advertised by
the RA, when compared to a stop/start operation.
See the OCF RA API, section 3.4.4 "recover".
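
As a rough sketch of what I mean by preferring "recover" over
stop/start: the caller tries the advertised recover action first
and falls back to a full stop/start only if recover is missing or
fails. This is an assumption about how a caller *could* behave,
not a description of what Heartbeat does; recover_or_restart() and
the dict of callables are hypothetical.

```python
# OCF return codes as defined by the OCF RA API.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1


def recover_or_restart(ra):
    """Try 'recover' if advertised, else fall back to stop followed by start.

    ra: hypothetical dict mapping action names to callables that return
    OCF codes. Returns (path_taken, final_return_code).
    """
    recover = ra.get("recover")
    if recover is not None and recover() == OCF_SUCCESS:
        return "recover", OCF_SUCCESS
    rc = ra["stop"]()
    if rc != OCF_SUCCESS:
        return "stop", rc  # a failed stop is reported, start is not attempted
    return "restart", ra["start"]()
```

For example, an RA dict without a "recover" entry goes straight to
the stop/start path, while one whose recover returns OCF_SUCCESS
never calls stop at all.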
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems