Ciao,

- Regarding setting Heartbeat to OFF during boot, I decided to follow your
advice. In any case, if I set up the HP iLO (riloe) device and configure
stonith, can I play with the
    "cib-bootstrap-options-stonith-action" value, i.e. "poweroff" vs "reboot"?

Just to know whether it is worth spending another two cents on this issue.

- the reset for the fail count works great !!

Thanks

cristina


On Jun 23, 2009, at 5:51 PM, Cristina Bulfon wrote:

Ciao Dejan,

Sorry for the delay; actually I am out of the office.

As soon as I am back at work I will try to reset the fail count and
repeat the test.

thanks

cristina

Dejan Muhamedagic wrote:
On Thu, Jun 18, 2009 at 10:33:32AM +0200, Cristina Bulfon wrote:

Ciao,

I am still setting up the HA configuration; we have a 2-node cluster in
active/passive mode.

- To check the SAN storage device we use an external stonith device; in
case of a SAN failure on the active machine it should do a shutdown, but
instead it does a reboot.

In the cluster property section of the cib.xml I have

           <nvpair id="cib-bootstrap-options-stonith-enabled"
name="stonith-enabled" value="true"/>
          <nvpair name="stonith-action"
id="cib-bootstrap-options-stonith-action" value="poweroff"/>


- regarding the "failback OFF"

  In the cluster property section of the cib.xml file I have

          <nvpair name="default-resource-stickiness"
id="cib-bootstrap-options-default-resource-stickiness" value="INFINITY"/>
          <nvpair
id="cib-bootstrap-options-default-resource-failure-stickiness"
name="default-resource-failure-stickiness" value="0"/>

The failback OFF is working: when the failed active node comes back, the
resource still remains on the passive node. To migrate the resource back
I have to execute the following commands:

               crm_resource -M -H <active_node> -r <group> -f   (it doesn't migrate without -f)
               crm_resource -U -H <passive_node>
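
(For reference, to check where the group actually ended up after the
-M/-U pair; a sketch, assuming crm_resource's -W/--locate option:)

               # print the node currently running the group
               crm_resource -W -r <group>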

If I repeat the exercise (simulating a failure on the active node) a
couple of times, it happens that the passive node doesn't take over the
resource.


Just found one infinitely high fail count for node
afsitfs3.roma1.infn.it.

         <nvpair
id="status-586817af-703a-4eff-ac9b-b96de063493a-fail-count- Filesystem_2"
         name="fail-count-Filesystem_2" value="INFINITY"/>

That resource can't start on that node until you reset the
failcount.
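
(For reference, a sketch of the reset with crm_failcount, using the node
and resource names from the nvpair above; assuming the Heartbeat 2.x
option names:)

         # inspect the current failcount for the resource on that node
         crm_failcount -G -U afsitfs3.roma1.infn.it -r Filesystem_2

         # delete it so the resource may start there again
         crm_failcount -D -U afsitfs3.roma1.infn.it -r Filesystem_2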

Thanks,

Dejan


Attached you will find the output of cibadmin -Q.

Thanks in advance for any help


cristina

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
