On 08/08/2013, at 12:37 AM, RaSca <[email protected]> wrote:
> Hi all,
> I have a big Pacemaker (1.1.9-1512) cluster with 9 nodes and almost 200
> virtual machines (with the same storage on the bottom). Everything is
> based upon KVM and libvirt.
> Each VM has got a location, based upon a cloned ping resource on each
> node that pings three hosts on the net.
>
> The problem I got is that when I clone a VM (using virt-clone)
> everything works fine until I try to add a new ping check.

Can you describe more precisely what you mean by this?

> At this time, for some reason the master ping resource of the node
> fails, with errors like this:
>
> Jul 30 15:34:58 kvm09 lrmd[23467]: warning: child_timeout_callback:
> res_ping_connections_monitor_5000 process (PID 26406) timed out
>
> We're investigating on potentially network problems (obviously the
> network men says that those are impossible, but when the problems
> happens there are sometimes high ping latencies on the node), but what I
> find very strange is that things breaks up ONLY when I add a location
> based upon ping, not for example when I add the storage's order and
> colocation for VM.
>
> So my two questions:
>
> 1) Are there limitations about how many ping location can be declared?

Well, there is a finite number of hosts that can be pinged within a given
interval. Is your timeout too short, perhaps? Are you using fping, which
works in parallel?

> 2) Is this one (one vm = one ping location) the best practice to monitor
> the connections of the nodes?

Ping resources were intended to check whether a cluster node can reach the
outside world. Are you using them to check whether a VM resource is alive?
Perhaps David's remote-node stuff would be better suited.

>
> Thanks for your help,
>
> --
> RaSca
> Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
> [email protected]
> http://www.miamammausalinux.org
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
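
To make the timeout question in the reply concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the ping resource agent; it assumes a deliberately pessimistic model in which every attempt against every host runs to its full timeout, and it compares probing hosts one after another with probing them in parallel (as fping does). The attempt count, per-attempt timeout and operation timeout below are illustrative values, not the poster's configuration; only the count of three hosts comes from the thread.

```python
def worst_case_sequential(hosts: int, attempts: int, attempt_timeout_s: float) -> float:
    """Pessimistic duration when hosts are probed one after another.

    Assumes every attempt against every host waits for its full timeout.
    """
    return hosts * attempts * attempt_timeout_s


def worst_case_parallel(attempts: int, attempt_timeout_s: float) -> float:
    """Pessimistic duration when all hosts are probed at the same time."""
    return attempts * attempt_timeout_s


def fits_in_op_timeout(duration_s: float, op_timeout_s: float) -> bool:
    """True if the worst case still finishes before the monitor op times out."""
    return duration_s < op_timeout_s


if __name__ == "__main__":
    hosts = 3                 # from the thread: three hosts are pinged
    attempts = 3              # assumption, for illustration only
    attempt_timeout_s = 2.0   # assumption, for illustration only
    op_timeout_s = 10.0       # hypothetical monitor operation timeout

    seq = worst_case_sequential(hosts, attempts, attempt_timeout_s)
    par = worst_case_parallel(attempts, attempt_timeout_s)
    print(f"sequential worst case: {seq:.1f}s, fits: {fits_in_op_timeout(seq, op_timeout_s)}")
    print(f"parallel worst case:   {par:.1f}s, fits: {fits_in_op_timeout(par, op_timeout_s)}")
```

Under these assumed numbers the sequential worst case exceeds the hypothetical operation timeout while the parallel one does not, which is the kind of situation in which a monitor could be reported as timed out when the network is slow.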
