Hi,
  I have a resource (my own OCF agent) that can sometimes take 10 minutes to
start after a failure, because log records need to be synced. I noticed that
while its start action is being performed, if other resources in my cluster
report "not running", no restart is attempted until my long-running start
action returns. Meanwhile, crm_mon reports those resources as "started" even
though they are not running, and may not be for many minutes.

My questions:

Is the lrm process single-threaded?

Would running my resource's start action asynchronously be a better strategy?
I am concerned that other critical resources will not be restarted if they
fail while the slow-starting resource is being restarted.

Is a resource's state (started, not running, or failed) determined by the
result of the start action rather than by monitor?

Thanks for any information.
Diane Schaefer