On Tue, May 5, 2009 at 5:26 PM, Eliot Gable <ega...@broadvox.net> wrote:
> I have determined that this appears to be something with resource stickiness. 
> With the failure-timeout set to 5s, it was timing out the failure and 
> switching back to the preferred node so fast that it could not be migrated to 
> the other node. With failure-timeout set to 30s, the migration occurs such 
> that node2 becomes master.

Right, you need failure-timeout to be greater than the time it takes
for your resource to move.
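
As a rough sketch only (the resource, its ids, and the
migration-threshold value are made up for illustration; 30s is simply
the value that worked in your test), a per-resource setting could look
something like this:

```xml
<!-- Hypothetical resource; ids and the Dummy agent are placeholders -->
<primitive id="example-rsc" class="ocf" provider="heartbeat" type="Dummy">
  <meta_attributes id="example-rsc-meta">
    <!-- Keep the failure on record longer than the move takes -->
    <nvpair id="example-rsc-failure-timeout" name="failure-timeout" value="30s"/>
    <!-- Assumed value: move away after the first failure -->
    <nvpair id="example-rsc-migration-threshold" name="migration-threshold" value="1"/>
  </meta_attributes>
</primitive>
```

The right value depends on how long your resource actually takes to
stop on one node and be promoted on the other, so treat 30s as a
starting point rather than a recommendation.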

> When the first cluster-recheck-interval fires after the
> failure-timeout expires, the cluster switches back to node1 as the
> Master. So, the stickiness seems to be off. I am using this
> rsc_default setup, which I built based on your documentation:
>
>    <rsc_defaults>
>      <meta_attributes id="off-hours" score="2">
>        <rule id="off-hour-rule" score="2">
>          <date_expression id="four-am-to-five-am" operation="date_spec">
>            <date_spec id="off-hour-date-spec" hours="4-5" weekdays="1-7"/>
>          </date_expression>
>        </rule>
>        <nvpair id="off-stickiness" name="resource-stickiness" value="0"/>
>      </meta_attributes>
>      <meta_attributes id="core-hours" score="1">
>        <nvpair id="core-stickiness" name="resource-stickiness" value="INFINITY"/>
>      </meta_attributes>
>    </rsc_defaults>
>
> The goal was to keep everything infinitely sticky except from 4am-5am when 
> resources would be allowed to switch back to their preferred node. Did I make 
> a mistake here?

It looks about right.
Can you create a bugzilla entry for this please?

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
