Turns out to be simple, and a result of my own stupidity.

I recently increased the memory allocation for one VM. The result is that when a node fails and its VMs fail over to the other node, there's simply not enough memory for that VM to start. So it's not that the VM will run on only one node; it's that it won't run in combination with certain other VMs. I backed off the memory requirement setting in the config file and all is copacetic.

So - short form:
- not enough real memory for the VM to start
- the resource agent / xen startup scripts don't seem to take swap space into account
- the error message doesn't propagate up to crm
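
For anyone hitting the same thing, here is a rough sketch of the arithmetic in Python. All of the numbers and VM names other than server1 are made up for illustration (they are not my real configuration), and the dom0 reservation is an assumption. Swap is deliberately left out, since the guest needs real memory to start.

```python
# Hypothetical values for illustration only; not taken from the real cluster.
NODE_MEMORY_MB = 8192     # physical RAM on the surviving node (assumed)
DOM0_RESERVED_MB = 1024   # memory held back for dom0 (assumed)

# VMs already running on the surviving node: name -> memory setting in MB (assumed)
running_vms = {"vm_a": 2048, "vm_b": 2048}


def free_memory_mb(node_mb, reserved_mb, running):
    """Real memory left on the node after dom0 and the running VMs."""
    return node_mb - reserved_mb - sum(running.values())


def fits(needed_mb, node_mb, reserved_mb, running):
    """True if a VM needing needed_mb of real memory can start on the node."""
    return needed_mb <= free_memory_mb(node_mb, reserved_mb, running)


if __name__ == "__main__":
    free = free_memory_mb(NODE_MEMORY_MB, DOM0_RESERVED_MB, running_vms)
    # Compare the enlarged allocation with the backed-off one.
    for label, needed in (("enlarged", 4096), ("backed off", 3072)):
        ok = fits(needed, NODE_MEMORY_MB, DOM0_RESERVED_MB, running_vms)
        print(f"server1 ({label}, {needed} MB): free={free} MB, fits={ok}")
```

With these made-up numbers, the enlarged allocation fails over only when the other VMs are not running on the target node, which matches the symptom: the VM starts fine on one node and not on the other.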

Anyway... problem found and solved.

Miles Fidelman

2013/7/27 Miles Fidelman <mfidelman at meetinghouse.net>
> Dual-node, pacemaker cluster, DRBD-backed xen virtual machines - one of
> our VMs will run on one node, but not the other, and "crm status" yields a
> failure message saying that starting the resource failed for unknown
> reasons.  The log is only slightly less useless:
>
> (server2 and server3 are the nodes, server1 is the resource)
> <server3, running server1, crashes>
> <node entries from server2 trying to failover the resource>
>
> Jul 27 06:27:06 server2 pengine: [1365]: info: get_failcount: server1 has
> failed INFINITY times on server2
> Jul 27 06:27:06 server2 pengine: [1365]: WARN: common_apply_stickiness:
> Forcing server1 away from server2 after 1000000 failures (max=1000000)
> Jul 27 06:27:06 server2 pengine: [1365]: info: native_color: Resource
> server1 cannot run anywhere
> Jul 27 06:27:06 server2 pengine: [1365]: notice: LogActions: Leave
> resource server1#011(Stopped)
>
> Attempts to migrate the server fail with the same errors. Failover USED
> to work just fine. It still works for other VMs. Any idea how to track
> down what's failing?
>
> Thanks very much,
>
> Miles Fidelman



--
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra

_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
