I was led to believe that for a hot cluster (no STONITH, using DRBD-backed NFS resources) the best way to ensure failover was a quorum with a third server (whether or not it hosts resources). I had trouble getting robust failover and am moving on to a 3-node DRBD 9-backed cluster.
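
In case it helps, what I mean by a third-node quorum is roughly this quorum/nodelist part of corosync.conf (a sketch only; the node names and ids are placeholders, not my real hosts):

quorum {
    provider: corosync_votequorum
    # with three voting nodes the two_node workaround is not needed
}

nodelist {
    node {
        ring0_addr: node1
        nodeid: 1
    }
    node {
        ring0_addr: node2
        nodeid: 2
    }
    node {
        ring0_addr: node3
        nodeid: 3
    }
}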
Sent from my Verizon Wireless 4G LTE DROID

JR <[email protected]> wrote:

>Greetings,
>
>I have a 2 node test cluster. It exposes a single resource, an NFS
>server which exports a single directory. I'm able to do:
>
>crm resource move <resource_name>
>
>and that works but if I do:
>
>pkill -9 'corosync|pacemaker'
>
>the resource doesn't migrate.
>
>I've been told by folks on the linux-ha IRC that fencing is my answer
>and I've put in place the null fence client. I understand that this is
>not what I'd want in production, but for my testing it seems to be the
>correct way to test a cluster. I've confirmed in the good server's logs
>that it believes it has successfully fenced its partner
>
>notice: log_operation: Operation 'reboot' [24621] (call 0 from
>crmd.22546) for host 'nebula04' with device 'st-null' returned: 0 (OK)
>
>Am I mistaken that the stonith:null resource agent should allow the
>system to believe that the "failed" server has been fenced and,
>therefore, it is safe to migrate the resources? Note the script that
>issues the pkill also stops the resources (so there aren't 2 VIPs, etc...).
>
>Thanks much for any insight.
>
>JR
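
For anyone running the same kind of test, the null fence device can be declared with something along these lines (a sketch only; the primitive name and host names are placeholders, not taken from JR's configuration):

crm configure primitive st-null stonith:null params hostlist="node1 node2"
crm configure property stonith-enabled=true

With stonith-enabled=true the cluster should refuse to recover resources until the fencing call reports success, which is exactly what the stonith:null plugin fakes for test setups like this.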
