Dominik Klein wrote:
> Hi
> 
> please try to change "crm on" to "crm respawn" in /etc/ha.d/ha.cf

Hi Dominik. Thank you for the fast answer.

Setting crm to "respawn" at least prevents the machines from rebooting,
and gives me a fair chance to run other diagnostics.

I'm sorry, I just realized that I should have sent my configuration
files along with the logfiles. /etc/ha.d/ha.cf at this moment is:

------------
debug 1
logfacility     daemon
keepalive 1
deadtime 10
warntime 5
initdead 120 # depend on your hardware
udpport 694
ping 1.2.3.254
bcast eth2
auto_failback off
node    db-mysql-test3-ha.ripe.net
node    db-mysql-test4-ha.ripe.net
use_logd yes
compression     bz2
compression_threshold 2
crm respawn
------------
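Since the errors below all mention the compression module, it may help
to double-check which directives are actually in effect in ha.cf. The
following is only a minimal sketch: the function name parse_ha_cf and
the SAMPLE string are made up for illustration (they are not part of
Heartbeat), and it assumes the simple "directive value # comment"
layout shown above.

```python
def parse_ha_cf(text):
    """Return a dict mapping each directive name to a list of its values.

    Assumes one directive per line and '#' starting a comment; directives
    that may repeat (such as 'node') keep every value, in file order.
    """
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split(None, 1)  # directive name, then the rest
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        directives.setdefault(name, []).append(value)
    return directives


# Hypothetical excerpt, copied from the configuration shown above.
SAMPLE = """\
initdead 120 # depend on your hardware
node    db-mysql-test3-ha.ripe.net
node    db-mysql-test4-ha.ripe.net
compression     bz2
compression_threshold 2
crm respawn
"""

if __name__ == "__main__":
    conf = parse_ha_cf(SAMPLE)
    for key in ("compression", "compression_threshold", "crm"):
        print(key, "=", conf.get(key))
```

To run it against the real file, one could pass
open("/etc/ha.d/ha.cf").read() instead of SAMPLE.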

I still have the plugin errors, like this:

------------
Mar  3 15:41:23 db-mysql-test3-ha crmd: [3141]: info: do_started: The
local CRM is operational
Mar  3 15:41:23 db-mysql-test3-ha crmd: [3141]: info:
do_state_transition: State transition S_STARTING -> S_PENDING [
input=I_PENDING cause=C_CCM_CALLBACK origin=do_started ]
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR: generic plugin
load failed
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR: cl_compress_field:
loading compression module failed
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR:
uncompress2compress: compressing 5th field failed
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR: generic plugin
load failed
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR: cl_compress_field:
loading compression module failed
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR:
uncompress2compress: compressing 6th field failed
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR: generic plugin
load failed
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR: cl_compress_field:
loading compression module failed
Mar  3 15:41:23 db-mysql-test3-ha cib: [3137]: ERROR:
uncompress2compress: compressing 7th field failed
------------

But there are no more reboots at the moment.
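To get an overview of how often each error repeats, the log excerpt
above can be summarised with a few lines of Python. This is a rough
sketch under assumptions: the helper names join_wrapped and
count_errors are my own, the regular expressions only target the
syslog layout visible in the excerpt, and continuation lines are
re-joined because my mail client wrapped them.

```python
import re
from collections import Counter

# A syslog entry starts with a stamp such as "Mar  3 15:41:23 ".
STAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d+\s\d\d:\d\d:\d\d\s")
# month, day, time, host, daemon, [pid], level, message
ENTRY = re.compile(
    r"^\S+\s+\d+\s\S+\s(\S+)\s(\w+): \[\d+\]: (\w+):\s*(.*)$"
)


def join_wrapped(lines):
    """Re-join entries that were wrapped across several lines."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if STAMP.match(line) or not entries:
            entries.append(line)        # a new log entry begins here
        else:
            entries[-1] += " " + line   # continuation of the previous one
    return entries


def count_errors(lines):
    """Count ERROR entries, keyed by (daemon, message)."""
    counts = Counter()
    for entry in join_wrapped(lines):
        match = ENTRY.match(entry)
        if match and match.group(3) == "ERROR":
            counts[(match.group(2), match.group(4))] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Reads log text from standard input and prints the most common errors.
    for (daemon, message), n in count_errors(sys.stdin).most_common():
        print(n, daemon, message)
```

Saved under any file name, it reads the log text from standard input.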

In an attempt to make everything simpler to debug, I re-configured my
cluster with a single resource, an IP address (from the suggested
configuration at http://linux-ha.org/GettingStartedV2/OneIPAddress).

crm_mon says none of my nodes are online at the moment. It has been like
this for quite a while already:

------------------------------------
============
Last updated: Mon Mar  3 15:45:28 2008
Current DC: db-mysql-test4-ha.ripe.net
(3b239a82-e57c-4283-9573-731909d60e18)
2 Nodes configured.
0 Resources configured.
============

Node: db-mysql-test4-ha.ripe.net (3b239a82-e57c-4283-9573-731909d60e18):
OFFLINE
Node: db-mysql-test3-ha.ripe.net (0f24958a-37b1-495a-84fc-911fe61720de):
OFFLINE
------------------------------------

and cibadmin -Q times out (of course), saying it can't connect to the
cluster.

Also, crm_verify says:

-------------
crm_verify[3232]: 2008/03/03_15:55:11 WARN: cluster_status: We do not
have quorum - fencing and resource management disabled
-------------

Finally, I've investigated the connectivity between the two nodes, and
everything seems fine on the network layer: each machine can see the
other, and there are no packet-filtering firewalls running on them
(and there are no firewalls at all in this network).

And that's it for now. :(

Thanks for your help, but it seems that I still need some. I will start
googling for more information on my current errors.

Kind regards.
-- 
Luis Motta Campos (a.k.a. Monsieur Champs) is a software engineer,
Perl fanatic evangelist, and amateur {cook, photographer}

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
