Ciao,

- Regarding turning heartbeat OFF during boot, I have decided to follow your advice. In any case, if I set up the HP iLO (riloe) and configure STONITH, can I play with the "cib-bootstrap-options-stonith-action" value="poweroff | reboot"? Just to know whether I should spend another two cents on this issue.

- The reset of the fail count works great!!

Thanks
cristina

On Jun 23, 2009, at 5:51 PM, Cristina Bulfon wrote:
Ciao Dejan,

sorry for the delay, I am actually out of the office. As soon as I am back at work I will try to reset the fail count and repeat the test.

thanks
cristina

Dejan Muhamedagic wrote:

On Thu, Jun 18, 2009 at 10:33:32AM +0200, Cristina Bulfon wrote:

Ciao,

I am still setting up the HA configuration; we have a 2-node cluster in active/passive mode.

- To check the SAN storage device we use an external stonith device, and in case of a SAN failure on the active machine it should do a shutdown; instead it does a reboot. In the cluster property section of the cib.xml I have

  <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="true"/>
  <nvpair name="stonith-action" id="cib-bootstrap-options-stonith-action" value="poweroff"/>

- Regarding the "failback OFF": in the cluster property section of the cib.xml file I have

  <nvpair name="default-resource-stickiness" id="cib-bootstrap-options-default-resource-stickiness" value="INFINITY"/>
  <nvpair id="cib-bootstrap-options-default-resource-failure-stickiness" name="default-resource-failure-stickiness" value="0"/>

The failback is working: when the active node comes back, the resource still remains on the passive node. To migrate the resource I have to execute the following commands (it doesn't migrate without -f):

  crm_resource -M -H <active_node> -r <group> -f
  crm_resource -U -H <passive_node>

If I repeat the exercise (simulating a failure on the active node) a couple of times, it happens that the passive_node doesn't take the resource.

Just found one infinitely high fail count for node afsitfs3.roma1.infn.it.

  <nvpair id="status-586817af-703a-4eff-ac9b-b96de063493a-fail-count-Filesystem_2" name="fail-count-Filesystem_2" value="INFINITY"/>

That resource can't start on that node until you reset the failcount.
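A blocked resource like this can be spotted by scanning the output of cibadmin -Q for fail-count attributes whose value is INFINITY. The following is only a minimal sketch in Python, not part of the cluster tooling: the function name find_infinite_failcounts is hypothetical, and the sample XML wrapper around the nvpair (cib, status, node_state with a uname attribute) is an assumed layout that may differ from a real CIB.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified CIB status fragment; only the fail-count nvpair
# is taken from the thread, the surrounding elements are illustrative.
SAMPLE_CIB = """
<cib>
  <status>
    <node_state uname="afsitfs3.roma1.infn.it">
      <transient_attributes>
        <instance_attributes>
          <attributes>
            <nvpair id="status-586817af-703a-4eff-ac9b-b96de063493a-fail-count-Filesystem_2"
                    name="fail-count-Filesystem_2" value="INFINITY"/>
            <nvpair id="status-example-probe" name="probe_complete" value="true"/>
          </attributes>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>
"""

PREFIX = "fail-count-"


def find_infinite_failcounts(cib_xml):
    """Return (node, resource) pairs whose fail count is INFINITY."""
    root = ET.fromstring(cib_xml)
    blocked = []
    for node_state in root.iter("node_state"):
        uname = node_state.get("uname", "")
        for nvpair in node_state.iter("nvpair"):
            name = nvpair.get("name", "")
            if name.startswith(PREFIX) and nvpair.get("value") == "INFINITY":
                # Strip the prefix to recover the resource name.
                blocked.append((uname, name[len(PREFIX):]))
    return blocked


if __name__ == "__main__":
    for uname, resource in find_infinite_failcounts(SAMPLE_CIB):
        print(f"{resource} cannot start on {uname} until its fail count is reset")
```

In practice the XML would come from a saved cibadmin -Q dump rather than the embedded sample.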
Thanks,
Dejan

In attachment you will find the output of cibadmin -Q

Thanks in advance for any help
cristina

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
