Hi,

On Wed, Dec 07, 2011 at 04:56:31PM +0600, Aleksey V. Kashin wrote:
> Hello.
>
> I have two servers (radius1, radius2). I've set up the cluster resource
> - IPaddr2. I used next commands to set up this resource:
>
> # crm configure property stonith-enabled="false"

For a 2-node cluster disabling stonith is really bad.

> # crm configure property no-quorum-policy="ignore"
> # crm configure primitive raddb_ip ocf:heartbeat:IPaddr2 params
> ip="10.99.2.57" cidr_netmask="32" op monitor interval="15s"
> # crm configure group raddb raddb_ip
> # crm configure location raddb-prefers-radius1 raddb inf: radius1
> # crm configure rsc_defaults resource-stickiness=1000001
>
> All ok.
>
> But sometimes on server radius1 the load is increasing and server is
> swapping and at that moment resource becomes "(unmanaged) FAILED". Below
> I've presented example "unmanaged" resource:
>
> # crm_mon
> ============
> Last updated: Wed Dec 7 14:56:20 2011
> Stack: openais
> Current DC: radius1 - partition with quorum
> Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 2 Nodes configured, 2 expected votes
> 1 Resources configured.
> ============
>
> Online: [ radius2 radius1 ]
>
> Resource Group: raddb
> raddb_ip (ocf::heartbeat:IPaddr2): Started radius1
> (unmanaged) FAILED
>
> Failed actions:
> raddb_ip_monitor_15000 (node=radius1, call=4, rc=-2, status=Timed
> Out): unknown exec error
> raddb_ip_stop_0 (node=radius1, call=5, rc=-2, status=Timed Out):
> unknown exec error
>
>
> I've presented part of /var/log/syslog (radius1) here -
> http://paste.org/41963
>
>
> In that moment ip address 10.99.2.57 is alive and server responds to
> requests coming to this ip. However sometimes this resource becomes
> completely unavailable and I restart corosync. It's very bad.
>
> I think resource becomes unmanaged because server is using swap and part
> of corosync processes is in swap. I tested this suggestion and when
> server is using a lot of swap resource becomes "unmanaged".

corosync gets swapped? How interesting.
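
One way to check whether corosync really is being swapped out is to read the VmSwap field from /proc/<pid>/status. The following is only a rough sketch: the helper names (parse_status, swap_usage) are made up for illustration, and it assumes a Linux kernel that reports VmSwap (2.6.34 or later); on older kernels the field is absent and the value comes back as None.

```python
import os
import re


def parse_status(status_text):
    """Extract (process name, swapped-out kB) from /proc/<pid>/status text.

    The swap value is None when the kernel does not report VmSwap
    (older kernels, or kernel threads).
    """
    name = re.search(r"^Name:\s+(.+)$", status_text, re.MULTILINE)
    swap = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return (name.group(1) if name else None,
            int(swap.group(1)) if swap else None)


def swap_usage(process_name="corosync", proc_root="/proc"):
    """Return {pid: swapped_kB} for every running process with that name."""
    usage = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                name, swap_kb = parse_status(handle.read())
        except OSError:
            continue  # process exited or is not readable
        if name == process_name:
            usage[int(entry)] = swap_kb
    return usage


if __name__ == "__main__":
    for pid, swap_kb in swap_usage().items():
        if swap_kb is None:
            print(pid, "VmSwap not reported by this kernel")
        else:
            print(pid, "%d kB swapped out" % swap_kb)
```

Running it on radius1 while the load is high would show whether any corosync process has pages in swap at that moment.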

> I use debian gnu/linux 5.x and this packages -
> http://people.debian.org/~madkiss/ha/:
>
> # dpkg -l |grep cluster
> ii cluster-glue
> 1.0.7+hg2618-2~bpo50+1 The reusable cluster components for Linux HA
> ii corosync
> 1.4.2-1~bpo50+1 Standards-based cluster framework (daemon an
> ii libcluster-glue
> 1.0.7+hg2618-2~bpo50+1 Reusable cluster libraries (transitional pac
> ii libcorosync4
> 1.4.2-1~bpo50+1 Standards-based cluster framework (libraries
> ii libcrmcluster1
> 1.1.5-3~bpo50+1 Pacemaker libraries - CRM
> ii liblrm2
> 1.0.7+hg2618-2~bpo50+1 Reusable cluster libraries -- liblrm2
> ii libpils2
> 1.0.7+hg2618-2~bpo50+1 Reusable cluster libraries -- libpils2
> ii libplumb2
> 1.0.7+hg2618-2~bpo50+1 Reusable cluster libraries -- libplumb2
> ii libplumbgpl2
> 1.0.7+hg2618-2~bpo50+1 Reusable cluster libraries -- libplumbgpl2
> ii libstonith1
> 1.0.7+hg2618-2~bpo50+1 Reusable cluster libraries -- libstonith1
> ii pacemaker
> 1.1.5-3~bpo50+1 HA cluster resource manager
>
>
>
> I can't increase ram on this servers. How can I do that resource isn't
> becomes "unmanaged/failed" ?

Buy more memory. If you cannot, then I don't see any point in
using clustering.

Thanks,

Dejan

> With Best Regards.
> Aleksey V. Kashin
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
