> It looks very strange. The UUID of node1 is set correctly (the same as in
> cib.xml). Is this ID stored in /var/lib/heartbeat/hb_uuid?
>
> zentgpfsn01:~ # crm_uuid
> 07ca44ca-1bf5-4f12-8680-21f86c2e6bca
>
> zentgpfsn01:~ # grep zentgpfsn01 /var/lib/heartbeat/crm/cib.xml
> <node uname="zentgpfsn01" type="normal"
> id="07ca44ca-1bf5-4f12-8680-21f86c2e6bca">
>
> The right UUIDs are also set in /var/lib/heartbeat/hostcache:
>
> zentgpfsn01:~ # less /var/lib/heartbeat/hostcache
> zentgpfsn01 07ca44ca-1bf5-4f12-8680-21f86c2e6bca 100
> zentgpfsn02 f44cbb3e-fa3c-4f93-b433-0c9eb4bb5cba 100
> zentgpfsn03 7aa4698a-a17a-4c5b-8cfe-f7226a21aee8 100
>
> When I start Heartbeat on node1, /var/lib/heartbeat/hostcache looks like
> the following:
Hi,

Could you try to remove "/var/lib/heartbeat/hostcache" before starting
Heartbeat, as Andrew says? It might be needed on all nodes. I think I
encountered a similar error when I tried to replace some nodes. At that
time, the hb_delnode command, or removing hostcache, was effective.

Thanks,
Junko

> zentgpfsn01:~ # less /var/lib/heartbeat/hostcache
> zentgpfsn01 07ca44ca-1bf5-4f12-8680-21f86c2e6bca 100
> zentgpfsn02 f44cbb3e-fa3c-4f93-b433-0c9eb4bb5cba 100
> zentgpfsn03 7aa4698a-a17a-4c5b-8cfe-f7226a21aee8 100
> zentgpfsn01 00000000-0000-0000-0000-000000000000 100
>
> It seems as if node1 cannot find its own UUID and starts without one.
> That may be the reason for the entries in the logfile and the reboot:
>
> ccm[19837]: 2008/04/30_09:02:05 ERROR: llm_add: adding same node(zentgpfsn01) twice(?)
> ccm[19837]: 2008/04/30_09:02:05 ERROR: set_llm_from_heartbeat: adding node zentgpfsn01 to llm failed
> ccm[19837]: 2008/04/30_09:02:05 ERROR: Initialization failed. Exit
> heartbeat[19822]: 2008/04/30_09:02:05 WARN: Managed /usr/lib64/heartbeat/ccm process 19837 exited with return code 1.
> heartbeat[19822]: 2008/04/30_09:02:05 EMERG: Rebooting system. Reason: /usr/lib64/heartbeat/ccm
>
> But when the system is up again, the same UUID is still set:
>
> zentgpfsn01:~ # crm_uuid
> 07ca44ca-1bf5-4f12-8680-21f86c2e6bca
>
> Does anybody have an idea? I'm a bit helpless...
>
> Dominik Klein wrote:
> >> here is the cause:
> >>
> >>> ccm[19751]: 2008/04/29_10:59:59 ERROR: llm_add: adding same node(zentgpfsn01) twice(?)
> >>> ccm[19751]: 2008/04/29_10:59:59 ERROR: set_llm_from_heartbeat: adding node zentgpfsn01 to llm failed
> >>> ccm[19751]: 2008/04/29_10:59:59 ERROR: Initialization failed. Exit
> >>> heartbeat[19737]: 2008/04/29_10:59:59 WARN: Managed /usr/lib64/heartbeat/ccm process 19751 exited with return code 1.
> >>
> >> You don't have more than one machine with the same name by any chance?
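The cleanup Junko suggests can be sketched as a small shell script. This is a
hypothetical helper, not from the thread: it assumes the three node names shown
above, SSH access between nodes, and the usual init script path; it defaults to
a dry run (set DRY_RUN=0 to actually execute).

```shell
#!/bin/sh
# Sketch: remove Heartbeat's generated node cache on every node, then
# restart. Heartbeat rebuilds hostcache from ha.cf on the next start.
# DRY_RUN=1 (the default) only prints what would be done.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "WOULD RUN: $*"
    else
        "$@"
    fi
}

NODES="zentgpfsn01 zentgpfsn02 zentgpfsn03"

# Stop Heartbeat everywhere first, so no node rewrites the cache
# while another one is being cleaned.
for node in $NODES; do
    run ssh "$node" /etc/init.d/heartbeat stop
    run ssh "$node" rm -f /var/lib/heartbeat/hostcache
done

for node in $NODES; do
    run ssh "$node" /etc/init.d/heartbeat start
done
```

Doing the stop/remove pass on all nodes before any restart avoids a half-cleaned
cluster re-propagating the stale zero UUID entry.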
> > I saw something like this when I recently re-installed my test cluster
> > and used an old (backup) configuration file. The UUID changed, so every
> > node was there twice, which ended in quite a mess.
> >
> > Regards
> > Dominik
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
>
> --
> Mit besten Grüßen / Best Regards
>
> Alexander Födisch
>
> Max Planck Institute for Evolutionary Anthropology
> -Central IT Department-
> Deutscher Platz 6
> D-04103 Leipzig
>
> Phone: +49 (0)341 3550-168
>        +49 (0)341 3550-154
> Fax:   +49 (0)341 3550-119
> Email: [EMAIL PROTECTED]
