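Hi,

What's happening is that corosync is forking its children, but the exec of the cluster daemons (cib, crmd, and so on) never happens, so the children still show up as /usr/sbin/corosync.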
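If you want to check for this state without reading the process tree by eye, here is a rough sketch in Python. It is only an illustration, not anything shipped with corosync or Pacemaker: the function name `stuck_children` is made up, it assumes you feed it the output of `ps -eo pid,ppid,args`, and it assumes a forked-but-not-exec'd child shows the same command path as its parent. The sample text is hand-written in that format using PIDs from the listing quoted below; the parent's PPID of 1 is an assumption.

```python
from __future__ import annotations

# Hand-written sample in `ps -eo pid,ppid,args` format (illustrative only).
SAMPLE = """\
  PID  PPID COMMAND
 1508     1 /usr/sbin/corosync
 1539  1508 /usr/sbin/corosync
 1540  1508 /usr/sbin/corosync
"""


def stuck_children(ps_output: str, binary: str = "/usr/sbin/corosync") -> list[int]:
    """Return PIDs of processes running `binary` whose parent also runs `binary`.

    Such processes are children that were forked but have not exec'd
    a different program (hypothetical helper, see assumptions above).
    """
    procs = {}  # pid -> (ppid, command path)
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header line and anything that is not "pid ppid args".
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        procs[int(fields[0])] = (int(fields[1]), fields[2].split()[0])

    return sorted(
        pid
        for pid, (ppid, cmd) in procs.items()
        if cmd == binary and procs.get(ppid, (0, ""))[1] == binary
    )


if __name__ == "__main__":
    print(stuck_children(SAMPLE))
```

On a healthy node the children would be running the daemons from /usr/lib/heartbeat instead, and the function would return an empty list.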
I used to see this problem when the syslog-ng process was not running. Check whether it is running, start it if it is not, and then start corosync.

Sincerely
Shravan

On Wed, Dec 22, 2010 at 4:43 AM, Daniel Bareiro <daniel-lis...@gmx.net> wrote:
> Hi all!
>
> I hope this is the right group to discuss my problem.
>
> I'm beginning to test HA clusters with Debian GNU/Linux and for that I
> decided to try Pacemaker + Corosync with Debian Lenny amd64 following
> this [1] howto.
>
> Both packages were installed from the Backports repositories. But I am
> observing that if after configuration I reboot a node, it fails to join
> to the cluster after the boot.
>
> This is what I see in /var/log/daemon.log:
>
> --------------------------------------------------------------------------
> Dec 19 17:13:13 atlantis corosync[1508]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: unknown (rc=-2)
> Dec 19 17:13:13 atlantis corosync[1508]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: unknown (rc=-2)
> Dec 19 17:13:13 atlantis corosync[1508]: [pcmk ] WARN: route_ais_message: Sending message to local.attrd failed: unknown (rc=-2)
> Dec 19 17:13:13 atlantis corosync[1508]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: unknown (rc=-2)
> Dec 19 17:13:14 atlantis corosync[1508]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: unknown (rc=-2)
> Dec 19 17:13:14 atlantis corosync[1508]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: unknown (rc=-2)
> Dec 19 17:13:21 atlantis corosync[1508]: [TOTEM ] A processor failed, forming new configuration.
> Dec 19 17:13:25 atlantis corosync[1508]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 72: memb=1, new=0, lost=1
> Dec 19 17:13:25 atlantis corosync[1508]: [pcmk ] info: pcmk_peer_update: memb: atlantis 335544586
> Dec 19 17:13:25 atlantis corosync[1508]: [pcmk ] info: pcmk_peer_update: lost: daedalus 369099018
> Dec 19 17:13:25 atlantis corosync[1508]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 72: memb=1, new=0, lost=0
> Dec 19 17:13:25 atlantis corosync[1508]: [pcmk ] info: pcmk_peer_update: MEMB: atlantis 335544586
> Dec 19 17:13:25 atlantis corosync[1508]: [pcmk ] info: ais_mark_unseen_peer_dead: Node daedalus was not seen in the previous transition
> Dec 19 17:13:25 atlantis corosync[1508]: [pcmk ] info: update_member: Node 369099018/daedalus is now: lost
> Dec 19 17:13:25 atlantis corosync[1508]: [pcmk ] info: send_member_notification: Sending membership update 72 to 0 children
> Dec 19 17:13:25 atlantis corosync[1508]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Dec 19 17:13:25 atlantis corosync[1508]: [MAIN ] Completed service synchronization, ready to provide service.
> --------------------------------------------------------------------------
>
> # ps auxf
> [...]
> root 1508 0.1 1.9 182624 4880 ? Ssl 15:52 0:22 /usr/sbin/corosync
> root 1539 0.0 1.2 168144 3240 ? S 15:52 0:00 \_ /usr/sbin/corosync
> root 1540 0.0 1.2 168144 3240 ? S 15:52 0:00 \_ /usr/sbin/corosync
> root 1541 0.0 1.2 168144 3240 ? S 15:52 0:00 \_ /usr/sbin/corosync
> root 1542 0.0 1.2 168144 3240 ? S 15:52 0:00 \_ /usr/sbin/corosync
> root 1543 0.0 1.2 168144 3240 ? S 15:52 0:00 \_ /usr/sbin/corosync
> root 1544 0.0 1.2 168144 3240 ? S 15:52 0:00 \_ /usr/sbin/corosync
>
> From what I see in the howto, the output should be something like this:
>
> root 29980 0.0 0.8 44304 3808 ? Ssl 20:55 0:00 /usr/sbin/corosync
> root 29986 0.0 2.4 10812 10812 ? SLs 20:55 0:00 \_ /usr/lib/heartbeat/stonithd
> 102 29987 0.0 0.8 13012 3804 ? S 20:55 0:00 \_ /usr/lib/heartbeat/cib
> root 29988 0.0 0.4 5444 1800 ? S 20:55 0:00 \_ /usr/lib/heartbeat/lrmd
> 102 29989 0.0 0.5 12364 2368 ? S 20:55 0:00 \_ /usr/lib/heartbeat/attrd
> 102 29990 0.0 0.5 8604 2304 ? S 20:55 0:00 \_ /usr/lib/heartbeat/pengine
> 102 29991 0.0 0.6 12648 3080 ? S 20:55 0:00 \_ /usr/lib/heartbeat/crmd
>
> I also tried compiling Pacemaker using these [2] steps, but I get the
> same result.
>
> Thanks in advance for your reply.
>
> Regards,
> Daniel
>
> [1] http://www.clusterlabs.org/wiki/Debian_Lenny_HowTo
> [2] http://www.clusterlabs.org/wiki/Install#Building_from_Source
> --
> Daniel Bareiro - GNU/Linux registered user #188.598
> Proudly running Debian GNU/Linux with uptime:
> 06:39:43 up 70 days, 7:06, 10 users, load average: 0.27, 0.16, 0.10
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker