Re: [Linux-HA] unable to recover from split-brain in a two-node cluster

2014-06-24 Thread Lars Ellenberg
On Tue, Jun 24, 2014 at 12:23:30PM +1000, Andrew Beekhof wrote: On 24 Jun 2014, at 1:52 am, f...@vmware.com wrote: Hi, I understand that the split-brain is initially caused by the heartbeat messaging layer and that not much can be done when packets are dropped. However, the

Re: [Linux-HA] unable to recover from split-brain in a two-node cluster

2014-06-24 Thread fank
Hi Andrew, I do see the following last status update from crmd on node-1, but crm_mon -1 still shows node-0 offline: crmd_ha_status_callback: Status update: Node node-0 now has status [active] [DC=false] The same happens on node-0, which shows that node-1 now has status active, but crm_mon -1 shows it

Re: [Linux-HA] unable to recover from split-brain in a two-node cluster

2014-06-24 Thread fank
Hi Lars, Thanks for pointing out the patch. It is not in the heartbeat version on the system (it is using Heartbeat-3-0-7e3a82377fa8). I'll try that out. As for ccm_testclient, the system has stripped out unnecessary files that won't be used during normal operation, including gcc. So

Re: [Linux-HA] unable to recover from split-brain in a two-node cluster

2014-06-24 Thread Andrew Beekhof
On 25 Jun 2014, at 12:03 am, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Tue, Jun 24, 2014 at 12:23:30PM +1000, Andrew Beekhof wrote: On 24 Jun 2014, at 1:52 am, f...@vmware.com wrote: Hi, I understand that the split-brain is initially caused by the heartbeat messaging layer and