----- Original Message -----
> From: "Parshvi" <parshvi...@gmail.com>
> To: pacema...@clusterlabs.org
> Sent: Thursday, April 19, 2012 6:22:01 AM
> Subject: [Pacemaker] start/stop operations fail to happen in parallel on resources
> 
> Observations:
> max-children=30
> total no. of resources=18
> 
> 1) At a default value 4 of max-children, following logs were observed
> that led to monitor op’s timeout for some resources (a total of 18
> rscs):
>   a. “max_child_count (4) reached, postponing execution of operation
>   monitor”
>   b. “WARN: perform_ra_op: the operation operation monitor[18] on
> ocf::IPaddr2::ClusterIP for client 3754, stayed in operation list for
> 14100 ms (longer than 10000 ms)”
>   c. SOLUTION: the max-children of lrmd was raised to 30.
>   d. ISSUES STILL OBSERVED: while 2-3 resources are stuck in start
>   operation, if a rsc is issued an explicit start command
>   `crm resource start rcs1`, then the start op on this rsc is delayed
>   until any one of the previous resources exit from their start
>   operation.
> 

This is what I would expect to happen.  If an operation is in flight at the 
same time you make a configuration change, I don't believe the change will be 
looked at until that operation returns or times out.
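
On the max-children part of the report (item 1 in the quoted message), a toy
model may help show why operations sat in the queue past the 10000 ms warning
threshold. This is only a sketch and not lrmd's actual scheduler: the function
name `simulate_queue_waits` and the 5000 ms per-monitor duration are made up
for illustration, and it assumes all operations are submitted at the same
instant and served first-in, first-out.

```python
import heapq


def simulate_queue_waits(durations_ms, max_children):
    """Return how long each operation waits before it starts running.

    Toy model (an assumption, not lrmd code): every operation is submitted
    at t=0, operations start in submission order, and at most
    `max_children` of them may run at the same time.
    """
    running = []  # min-heap of finish times for in-flight operations
    waits = []
    for duration in durations_ms:
        if len(running) < max_children:
            start = 0  # a slot is free right away
        else:
            # Pool is full: this operation starts when the earliest
            # in-flight operation finishes and frees its slot.
            start = heapq.heappop(running)
        waits.append(start)
        heapq.heappush(running, start + duration)
    return waits


# 18 resources, each monitor assumed (hypothetically) to take 5000 ms.
ops = [5000] * 18
for limit in (4, 30):
    waits = simulate_queue_waits(ops, limit)
    late = sum(1 for w in waits if w > 10000)
    print(f"max-children={limit}: longest wait {max(waits)} ms, "
          f"{late} ops queued longer than 10000 ms")
```

With a limit of 4 the later operations in this model wait well past 10000 ms,
while a limit of 30 lets all 18 start immediately. That only covers slot
exhaustion; it does not model the delay described in item d, which is the
behaviour I describe above.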

-- Vossel

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
