Re: [Linux-HA] Xen live migration and constraints - hb_report

Andrew Beekhof Fri, 11 Dec 2009 02:28:54 -0800

On Fri, Dec 11, 2009 at 2:17 AM, infernix <[email protected]> wrote:
> Andrew Beekhof wrote:
>>
>> You have three options at this point:
>>
>> 1) Drop the ordering constraints and lower the PE's batch-limit
>
> I'd have to limit it to 1 and that doesn't seem like a smart thing to me. I
> tried to find a method to set this but I could only figure out 'pengine
> metadata' to show it; how can I change its value?


crm configure property name=value

In this case the name is batch-limit, e.g. batch-limit=2.

> And wouldn't setting it to
> 1 cause a lot of problems like below with lrmd?

No.

>
>> 2) Drop the ordering constraints and lower the lrmd's concurrency
>> limit (I forget how; Google or Dejan might be able to help)
>
> At first glance that seemed more likely to work, because lrmd is spawning
> the Xen OCF scripts. In cluster-glue source:
>
> lrmd/lrmd.c:static int max_child_count          = 4;
>
> Looks like there is no way to set this value dynamically?

I believe there is

>
> So I rebuilt cluster-glue with this set to 1 and that seemed to work for my
> case (only Xen OCF resources and two stonith resources), the resources get
> migrated one by one (which is enough to saturate 1 Gbit/s; 2 parallel
> migrations might be perfect to keep it maxed constantly). However, I see
> hundreds of these in my logs:
>
> xen-a lrmd: [19901]: notice: max_child_count (1) reached, postponing
> execution of operation monitor[19] on ocf::MailTo::Email_Alerting for client
> 19902, its parameters: CRM_meta_interval=[10000] CRM_meta_start_delay=[0]
> CRM_meta_op_target_rc=[7] CRM_meta_timeout=[10000] email=[...@bla]
> crm_feature_set=[3.0.1] CRM_meta_name=[monitor]  by 1000 ms
>
> I'm worried that this is not a good thing, so I don't really dare to set
> max_child_count to 1.
>
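
The postponement messages above can be reasoned about with a toy model. The
sketch below is not lrmd's implementation; it only assumes what the log line
shows: an operation that finds all max_child_count slots busy is retried
1000 ms later. The function name, the operation names, and the 30 s
migration and 200 ms monitor durations are made up for illustration.

```python
import heapq


def simulate(ops, max_child_count, retry_ms=1000):
    """Toy model of a capped operation queue (NOT the real lrmd logic).

    ops: list of (name, arrival_ms, duration_ms) tuples.
    Returns {name: (start_ms, number_of_postponements)}.
    """
    # Each pending attempt: (time_ms, seq, name, duration_ms, postponements).
    # seq keeps ordering stable when two attempts share the same time.
    attempts = [(arrival, i, name, dur, 0)
                for i, (name, arrival, dur) in enumerate(ops)]
    heapq.heapify(attempts)

    running_ends = []  # end times of operations currently holding a slot
    result = {}
    while attempts:
        t, seq, name, dur, postponed = heapq.heappop(attempts)
        # Attempts are popped in time order, so finished operations
        # can safely be discarded here.
        running_ends = [end for end in running_ends if end > t]
        if len(running_ends) < max_child_count:
            running_ends.append(t + dur)
            result[name] = (t, postponed)
        else:
            # All slots busy: postpone by retry_ms, as in the log message.
            heapq.heappush(attempts,
                           (t + retry_ms, seq, name, dur, postponed + 1))
    return result


# Hypothetical workload: one 30 s migration and one short monitor.
ops = [("migrate_to vm", 0, 30000), ("monitor Email_Alerting", 100, 200)]
capped = simulate(ops, max_child_count=1)
default = simulate(ops, max_child_count=4)
print(capped["monitor Email_Alerting"])
print(default["monitor Email_Alerting"])
```

Under these assumptions, a limit of 1 makes every monitor wait behind a
running migration, which would explain the volume of postponement notices;
with the default of 4 the monitor in this example is never postponed.
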
>> 3) Rebuild with the following patch (I tested it and it works for your
>> setup, but I'm not yet sure that I should commit it).
>
> This really works well with the order constraints from my last post.
>
> But when I add this on top:
>
> <rsc_location id="db_prefer_xen-a" node="xen-a" rsc="db" score="5000"/>
> <rsc_location id="dbreplica_prefer_xen-b" node="xen-b" rsc="dbreplica"
> score="5000"/>
> <rsc_location id="core-101_prefer_xen-a" node="xen-a" rsc="core-101"
> score="5000"/>
> <rsc_location id="core-200_prefer_xen-a" node="xen-a" rsc="core-200"
> score="5000"/>
> <rsc_location id="edge_prefer_xen-a" node="xen-a" rsc="edge" score="5000"/>
> <rsc_location id="sysadmin_prefer_xen-b" node="xen-b" rsc="sysadmin"
> score="5000"/>
> <rsc_location id="base_prefer_xen-a" node="xen-a" rsc="base" score="5000"/>
>
> And then put a node with Xen resources on standby, it will still migrate
> multiple Xen resources in parallel to the other node, i.e. it does not honor
> the order constraints.

Location constraints don't affect ordering.

>
> Are these location constraints conflicting with the order constraints? I
> mean, the cluster shouldn't care where they [start|migrate_to], as long as
> they [start|migrate_to] in order, one at a time (or, if possible, a
> configurable number of parallel jobs).
>
> I have a hb_report attached for this last case.

I'll have a look
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
