Hi,

yeah - I solved it. It was a stupid mistake: in /etc/ha.d/ha.cf one node was defined a second time in the middle of the comments. Because all nodes are already defined at the top of ha.cf, that node (zentgpfsn01) existed twice in the cluster.

And I looked for a wrong UUID for hours.. :)

Thanks for your help, guys!

Cheers

Junko IKEDA wrote:
it looks very strange. the UUID of node1 is set correctly (the same as is set in cib.xml). is this ID stored in /var/lib/heartbeat/hb_uuid?

    zentgpfsn01:~ # crm_uuid
    07ca44ca-1bf5-4f12-8680-21f86c2e6bca
    zentgpfsn01:~ # grep zentgpfsn01 /var/lib/heartbeat/crm/cib.xml
    <node uname="zentgpfsn01" type="normal" id="07ca44ca-1bf5-4f12-8680-21f86c2e6bca">

in /var/lib/heartbeat/hostcache the right UUIDs are also set:

    zentgpfsn01:~ # less /var/lib/heartbeat/hostcache
    zentgpfsn01 07ca44ca-1bf5-4f12-8680-21f86c2e6bca 100
    zentgpfsn02 f44cbb3e-fa3c-4f93-b433-0c9eb4bb5cba 100
    zentgpfsn03 7aa4698a-a17a-4c5b-8cfe-f7226a21aee8 100

when I start heartbeat on node1, /var/lib/heartbeat/hostcache looks like the following:

Hi,

Could you try to remove "/var/lib/heartbeat/hostcache" before starting Heartbeat, as Andrew says? It might be needed on all nodes. I think I encountered a similar error when I tried to replace some nodes. At that time, the hb_delnode command or removing hostcache was effective.

Thanks,
Junko

    zentgpfsn01:~ # less /var/lib/heartbeat/hostcache
    zentgpfsn01 07ca44ca-1bf5-4f12-8680-21f86c2e6bca 100
    zentgpfsn02 f44cbb3e-fa3c-4f93-b433-0c9eb4bb5cba 100
    zentgpfsn03 7aa4698a-a17a-4c5b-8cfe-f7226a21aee8 100
    zentgpfsn01 00000000-0000-0000-0000-000000000000 100

it seems node1 cannot find its own UUID and starts w/o one. that may be the reason for the entries in the logfile and the reboot:

    ccm[19837]: 2008/04/30_09:02:05 ERROR: llm_add: adding same node(zentgpfsn01) twice(?)
    ccm[19837]: 2008/04/30_09:02:05 ERROR: set_llm_from_heartbeat: adding node zentgpfsn01 to llm failed
    ccm[19837]: 2008/04/30_09:02:05 ERROR: Initialization failed. Exit
    heartbeat[19822]: 2008/04/30_09:02:05 WARN: Managed /usr/lib64/heartbeat/ccm process 19837 exited with return code 1.
    heartbeat[19822]: 2008/04/30_09:02:05 EMERG: Rebooting system. Reason: /usr/lib64/heartbeat/ccm

but when the system is up again, the same UUID is still set:

    zentgpfsn01:~ # crm_uuid
    07ca44ca-1bf5-4f12-8680-21f86c2e6bca

does anybody have an idea? i'm a bit helpless....

Dominik Klein wrote:

here is the cause:

    ccm[19751]: 2008/04/29_10:59:59 ERROR: llm_add: adding same node(zentgpfsn01) twice(?)
    ccm[19751]: 2008/04/29_10:59:59 ERROR: set_llm_from_heartbeat: adding node zentgpfsn01 to llm failed
    ccm[19751]: 2008/04/29_10:59:59 ERROR: Initialization failed. Exit
    heartbeat[19737]: 2008/04/29_10:59:59 WARN: Managed /usr/lib64/heartbeat/ccm process 19751 exited with return code 1.

you don't have more than one machine with the same name by any chance?

I saw something like this when I recently re-installed my test cluster and used an old (backup) configuration file. The uuid changed and so every node was there twice, which ended in quite a mess.

Regards
Dominik
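The two symptoms described in this thread (a node named twice in ha.cf, and a second hostcache entry for the same node with an all-zero UUID) can be checked for mechanically. What follows is a minimal Python sketch, not something posted in the thread. The function names are hypothetical. It assumes that a `node` directive in ha.cf lists one or more whitespace-separated node names and that `#` starts a comment. The sample ha.cf text is made up for illustration; the hostcache lines are the ones quoted above.

```python
from collections import Counter

NIL_UUID = "00000000-0000-0000-0000-000000000000"


def duplicate_nodes(ha_cf_text):
    """Return node names that occur more than once in 'node' directives."""
    counts = Counter()
    for raw in ha_cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments (assumed '#')
        fields = line.split()
        if fields and fields[0] == "node":
            counts.update(fields[1:])
    return sorted(name for name, n in counts.items() if n > 1)


def suspicious_hostcache_entries(hostcache_text):
    """Return names listed more than once or carrying an all-zero UUID."""
    seen = Counter()
    flagged = set()
    for line in hostcache_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, uuid = fields[0], fields[1]
        seen[name] += 1
        if uuid == NIL_UUID or seen[name] > 1:
            flagged.add(name)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical ha.cf excerpt: nodes at the top, one repeated further down.
    ha_cf = """\
node zentgpfsn01 zentgpfsn02 zentgpfsn03
# ... long block of comments ...
node zentgpfsn01
"""
    # hostcache contents as quoted in the thread.
    hostcache = """\
zentgpfsn01 07ca44ca-1bf5-4f12-8680-21f86c2e6bca 100
zentgpfsn02 f44cbb3e-fa3c-4f93-b433-0c9eb4bb5cba 100
zentgpfsn03 7aa4698a-a17a-4c5b-8cfe-f7226a21aee8 100
zentgpfsn01 00000000-0000-0000-0000-000000000000 100
"""
    print(duplicate_nodes(ha_cf))                   # ['zentgpfsn01']
    print(suspicious_hostcache_entries(hostcache))  # ['zentgpfsn01']
```

To run it against a real system, one would pass in the contents of /etc/ha.d/ha.cf and /var/lib/heartbeat/hostcache instead of the sample strings.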
--
Mit besten Grüßen / Best Regards
Alexander Födisch
Max Planck Institute for Evolutionary Anthropology
-Central IT Department-
Deutscher Platz 6
D-04103 Leipzig
Phone: +49 (0)341 3550-168
+49 (0)341 3550-154
Fax: +49 (0)341 3550-119
Email: [EMAIL PROTECTED]
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
