Hello Andrew,

Thank you for your prompt response.
I tried your patch and it works fine!

Please backport this patch to the latest Pacemaker
and to Heartbeat 2.1.4.

Best Regards,
NAKAHIRA Kazutomo

Andrew Beekhof wrote:
2008/4/24 NAKAHIRA Kazutomo <[EMAIL PROTECTED]>:
Hello, all

 I tried the same test pattern reported by Hideo Yamauchi,
 and automatic fail-back still occurs in the latest Pacemaker.
 (Pacemaker changeset: bf619298929c, Heartbeat changeset: 54723736ab18)

oh :-(
sorry, i just assumed it was the same problem

 Here is the log output by the PE when executing "crm_resource -C -r
 group1-dummy2 -H dl380g5e".

 (snip ha-log)
 pengine[13894]: 2008/04/24_19:02:57 info: common_apply_stickiness:
 Setting failure stickiness for group1-dummy2 on dl380g5e: 727379968
 (snip ha-log)

 It seems that if fail-count becomes INFINITY for any reason and
 default-resource-failure-stickiness is defined as "-INFINITY",
 then common_apply_stickiness() calculates an invalid value.
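
 The logged value is consistent with a 32-bit wraparound. The sketch
 below assumes that the CRM score INFINITY is 1000000 (not stated in
 this thread) and that fail-count and failure stickiness were INFINITY
 and -INFINITY. wrapped_product() is a hypothetical helper, not part of
 the Pacemaker sources; it multiplies in 64 bits and then reduces the
 result modulo 2^32, as a wrapping 32-bit multiply would.

```c
#include <stdint.h>

/* Assumption: the CRM score INFINITY is 1000000. */
#define SCORE_INFINITY 1000000

/* Hypothetical helper: the value a wrapping 32-bit multiply would give.
 * The product is formed in 64 bits, reduced modulo 2^32, and then
 * mapped back into the signed 32-bit range. */
static int32_t wrapped_product(int32_t fail_count, int32_t fail_stickiness)
{
    int64_t wide = (int64_t)fail_count * (int64_t)fail_stickiness;
    uint32_t low = (uint32_t)(uint64_t)wide;   /* well-defined: modulo 2^32 */
    int64_t as_signed = (low > (uint32_t)INT32_MAX)
                            ? (int64_t)low - 4294967296LL
                            : (int64_t)low;
    return (int32_t)as_signed;
}
```

 Under those assumptions, wrapped_product(SCORE_INFINITY, -SCORE_INFINITY)
 gives 727379968, the score shown in the log line above.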

Can you try the following patch?

diff -r 5229c9b520f3 lib/crm/pengine/complex.c
--- a/lib/crm/pengine/complex.c Thu Apr 24 13:20:48 2008 +0200
+++ b/lib/crm/pengine/complex.c Thu Apr 24 15:47:40 2008 +0200
@@ -372,14 +372,19 @@ common_apply_stickiness(resource_t *rsc,
        
        if(fail_count > 0 && rsc->fail_stickiness != 0) {
                resource_t *failed = rsc;
+               int score = fail_count * rsc->fail_stickiness;
                if(is_not_set(rsc->flags, pe_rsc_unique)) {
                    failed = uber_parent(rsc);
                }
-               resource_location(failed, node, fail_count * rsc->fail_stickiness,
-                                 "fail_stickiness", data_set);
+
+               /* detect and prevent score underflows */
+               if(rsc->fail_stickiness < 0 && (score > 0 || score < -INFINITY)) {
+                   score = -INFINITY;
+               }
+
+               resource_location(failed, node, score, "fail_stickiness", data_set);
                crm_info("Setting failure stickiness for %s on %s: %d",
-                         failed->id, node->details->uname,
-                         fail_count * rsc->fail_stickiness);
+                         failed->id, node->details->uname, score);
        }
        g_hash_table_destroy(meta_hash);
 }
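
The patch above detects the overflow after the int multiplication has
already happened. As a possible alternative, here is a minimal sketch
that multiplies in a wider type and clamps before narrowing. It is not
what the patch or Pacemaker does: clamped_failure_score() is a
hypothetical name, SCORE_INFINITY = 1000000 is an assumption, and the
upper clamp is an addition that the patch does not include.

```c
#include <stdint.h>

/* Assumption: the CRM score INFINITY is 1000000. */
#define SCORE_INFINITY 1000000

/* Hypothetical helper: compute fail_count * fail_stickiness in 64 bits
 * and clamp the result to [-SCORE_INFINITY, +SCORE_INFINITY], so the
 * narrowing conversion back to int can never wrap. */
static int clamped_failure_score(int fail_count, int fail_stickiness)
{
    int64_t wide = (int64_t)fail_count * (int64_t)fail_stickiness;

    if (wide < -SCORE_INFINITY) {
        return -SCORE_INFINITY;
    }
    if (wide > SCORE_INFINITY) {
        return SCORE_INFINITY;
    }
    return (int)wide;
}
```

With fail_count = INFINITY and fail_stickiness = -INFINITY this returns
-INFINITY, the same score the patched code ends up with.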


 Best regards,
 NAKAHIRA Kazutomo



 HIDEO YAMAUCHI wrote:
 > Hi,
 >
 >> 2008/4/17 HIDEO YAMAUCHI <[EMAIL PROTECTED]>:
 >>> Hi,
 >>>
 >>>  I used Heartbeat-STABLE-2-1-932f11969945.
 >>>  I checked the behavior of a simple group resource.
 >>>
 >>>  1) The start operation of one resource fails on the Active node.
 >>>
 >>>  2) All resources move to the Standby node.
 >>>
 >>>  3) I clear the resource on the Active node with the crm_resource command.
 >>>   crm_resource -C -r group1-dummy1 -H rh51-pm
 >>>
 >>>  4) All the resources move back to the Active node. (Automatic failback occurs.)
 >>>
 >>>  Node: rh51-pm (fe4ff160-196b-4b5f-b341-5b1ccf666bf1): online
 >>>  Node: rh51-pm2 (19ca6bf8-a6a0-4207-ad1f-bd4ed22ebcd4): online
 >>>
 >>>  Resource Group: resource_group1
 >>>     group1-dummy1       (ocf::heartbeat:Dummy): Started rh51-pm
 >>>     group1-dummy2       (ocf::heartbeat:Dummy2):        Started rh51-pm
 >>>
 >>>
 >>>  I think that this failback did not occur in Ver2.1.3 (at step 4).
 >>>
 >>>  Is this a new specification as of Ver2.1.4?
 >> No it was a bug that I fixed a few days back - I guess the fix hasn't
 >> been backported yet
 >
 > OK.
 >
 > I will wait for the bug fix to be backported.
 >
 > Thanks,
 >
 > Hideo Yamauchi.
 >
 >>>  And is there a setting that prevents failback, in the same way as Ver2.1.3?
 >> _______________________________________________________
 >> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
 >> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
 >> Home Page: http://linux-ha.org/
 >>


 --
 ----------------------------------------
 NAKAHIRA Kazutomo
 NTT DATA INTELLILINK CORPORATION

--
----------------------------------------
NAKAHIRA Kazutomo
NTT DATA INTELLILINK CORPORATION
