On Fri, Dec 11, 2009 at 2:17 AM, infernix <[email protected]> wrote:
> Andrew Beekhof wrote:
>>
>> You have three options at this point:
>>
>> 1) Drop the ordering constraints and lower the PEs batch-limit
>
> I'd have to limit it to 1 and that doesn't seem like a smart thing to me. I
> tried to find a method to set this but i could only figure out 'pengine
> metadata' to show it; how can I change its value?

crm configure property name=value

> And wouldn't setting it to
> 1 cause a lot of problems like below with lrmd?

No.

>
>> 2) Drop the ordering constraints and lower the lrmd's concurrency
>> limit (I forget how, google or dejan might be able to help)
>
> At first glance that seemed more likely to work, because lrmd is spawning
> the Xen OCF scripts. In cluster-glue source:
>
> lrmd/lrmd.c:static int max_child_count = 4;
>
> Looks like there is no way to set this value dynamically?

I believe there is.

>
> So I rebuilt cluster-glue with this set to 1 and that seemed to work for my
> case (only Xen OCF resources and two stonith resources), the resources get
> migrated one by one (which is enough to saturate 1GBit, 2 parallel
> migrations might be perfect to keep it maxed constantly). However, I see
> hundreds of these in my logs:
>
> xen-a lrmd: [19901]: notice: max_child_count (1) reached, postponing
> execution of operation monitor[19] on ocf::MailTo::Email_Alerting for client
> 19902, its parameters: CRM_meta_interval=[10000] CRM_meta_start_delay=[0]
> CRM_meta_op_target_rc=[7] CRM_meta_timeout=[10000] email=[...@bla]
> crm_feature_set=[3.0.1] CRM_meta_name=[monitor] by 1000 ms
>
> I'm worried that this is not a good thing, so I don't really dare to set
> max_child_count to 1.
>
>> 3) Rebuild with the following patch (I tested it and it works for your
>> setup, but I'm not yet sure that I should commit it).
>
> This really works well with the order constraints from my last post.
>
> But when I add this on top:
>
> <rsc_location id="db_prefer_xen-a" node="xen-a" rsc="db" score="5000"/>
> <rsc_location id="dbreplica_prefer_xen-b" node="xen-b" rsc="dbreplica"
> score="5000"/>
> <rsc_location id="core-101_prefer_xen-a" node="xen-a" rsc="core-101"
> score="5000"/>
> <rsc_location id="core-200_prefer_xen-a" node="xen-a" rsc="core-200"
> score="5000"/>
> <rsc_location id="edge_prefer_xen-a" node="xen-a" rsc="edge" score="5000"/>
> <rsc_location id="sysadmin_prefer_xen-b" node="xen-b" rsc="sysadmin"
> score="5000"/>
> <rsc_location id="base_prefer_xen-a" node="xen-a" rsc="base" score="5000"/>
>
> And then put a node with Xen resources on standby, it will still migrate
> multiple Xen resources in parallel to the other node, e.g. it does not honor
> the order constraints.

Location constraints don't affect ordering.

>
> Are these location constraints conflicting with the order constraints? I
> mean, the cluster shouldn't care where they [start|migrate_to], as long as
> they [start|migrate_to] in order, one at a time (or, if possible, a
> configurable number of parallel jobs).
>
> I have a hb_report attached for this last case.

I'll have a look.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
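
For reference, the "crm configure property name=value" syntax mentioned earlier, applied to option 1, might look roughly like the sketch below. It assumes the property is named batch-limit, as in the quoted text. The value 2 and the DRY_RUN switch are illustrative assumptions only and do not come from this thread.

```shell
#!/bin/sh
# Sketch only: set the policy engine's batch-limit through the crm shell.
# LIMIT=2 is an illustrative value, not a recommendation from this thread.
LIMIT=2
CMD="crm configure property batch-limit=${LIMIT}"

# DRY_RUN is a hypothetical switch for this sketch. By default the command
# is only printed; set DRY_RUN=0 on a cluster node to actually run it.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CMD"
else
    $CMD
fi
```

With the default dry run this only prints the command, so it can be checked before anything changes on the cluster.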
