Hi all, I'm new to this mailing list, and I have a problem with my three-node cluster.
Whenever I start heartbeat on one of the three nodes, the system always reboots. In the log I found the error "register_with_ha: get_uuid_by_name() failed". I can't recall making any changes. I also restored all files in /var/lib/heartbeat/crm/ from backup, without success... :(
~# tail /var/log/ha-debug
[...]
lrmd[19753]: 2008/04/29_10:59:59 WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
lrmd[19753]: 2008/04/29_10:59:59 WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
lrmd[19753]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 10
cib[19752]: 2008/04/29_10:59:59 info: log_data_element: readCibXmlFile: [on-disk] <status/>
lrmd[19753]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 12
cib[19752]: 2008/04/29_10:59:59 info: log_data_element: readCibXmlFile: [on-disk] </cib>
lrmd[19753]: 2008/04/29_10:59:59 info: Started.
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Core dumps could be lost if multiple dumps occur.
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
mgmtd[19757]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 10
mgmtd[19757]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 12
attrd[19755]: 2008/04/29_10:59:59 WARN: get_uuid: Could not calculate UUID for zentgpfsn01
-> I think this is the cause:
attrd[19755]: 2008/04/29_10:59:59 ERROR: register_with_ha: get_uuid_by_name() failed
attrd[19755]: 2008/04/29_10:59:59 ERROR: main: HA Signon failed
attrd[19755]: 2008/04/29_10:59:59 ERROR: main: Aborting startup
heartbeat[19737]: 2008/04/29_10:59:59 WARN: Managed /usr/lib64/heartbeat/attrd process 19755 exited with return code 100.
mgmtd[19757]: 2008/04/29_10:59:59 info: init_crm
mgmtd[19757]: 2008/04/29_10:59:59 info: login to cib: 0, ret:-10
cib[19752]: 2008/04/29_10:59:59 notice: readCibXmlFile: Enabling DTD validation on the existing (sane) configuration
cib[19752]: 2008/04/29_10:59:59 info: startCib: CIB Initialization completed successfully
cib[19752]: 2008/04/29_10:59:59 info: cib_register_ha: Signing in with Heartbeat
ccm[19751]: 2008/04/29_10:59:59 ERROR: llm_add: adding same node(zentgpfsn01) twice(?)
ccm[19751]: 2008/04/29_10:59:59 ERROR: set_llm_from_heartbeat: adding node zentgpfsn01 to llm failed
ccm[19751]: 2008/04/29_10:59:59 ERROR: Initialization failed. Exit
heartbeat[19737]: 2008/04/29_10:59:59 WARN: Managed /usr/lib64/heartbeat/ccm process 19751 exited with return code 1.
cib[19752]: 2008/04/29_10:59:59 info: cib_register_ha: FSA Hostname: zentgpfsn01
-> and game over :)
heartbeat[19737]: 2008/04/29_10:59:59 EMERG: Rebooting system. Reason: /usr/lib64/heartbeat/ccm
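In case it helps anyone reading along: the excerpt above is noisy, so here is a minimal Python sketch for pulling only the ERROR and EMERG entries out of such a log. It assumes every entry sits on one line in the "daemon[pid]: date_time LEVEL: message" layout seen above; the function name serious_entries and the regular expression are my own illustration and not part of heartbeat.

```python
import re

# Assumed layout, taken from the excerpt above, e.g.
#   attrd[19755]: 2008/04/29_10:59:59 ERROR: main: HA Signon failed
LINE_RE = re.compile(
    r"^(?P<daemon>[\w-]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<stamp>\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Za-z]+):\s+(?P<message>.*)$"
)

# Only the two severities that appear as failures in the excerpt.
SERIOUS = {"ERROR", "EMERG"}


def serious_entries(lines):
    """Return (daemon, level, message) for each ERROR or EMERG line.

    Lines that do not match the assumed layout are skipped silently.
    """
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") in SERIOUS:
            found.append(
                (match.group("daemon"), match.group("level"), match.group("message"))
            )
    return found


if __name__ == "__main__":
    import sys

    # Default path is the log file used in this post.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/ha-debug"
    with open(path, errors="replace") as handle:
        for daemon, level, message in serious_entries(handle):
            print(f"{level:5} {daemon}: {message}")
```

Run against the log above, this should leave only the attrd and ccm errors plus the final EMERG line from heartbeat, which makes the order of failures easier to follow.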
Does anybody have an idea?
Thanks!
--
Best regards
Alexander Födisch
Max Planck Institute for Evolutionary Anthropology
-Central IT Department-
Deutscher Platz 6
D-04103 Leipzig
Phone: +49 (0)341 3550-168
+49 (0)341 3550-154
Fax: +49 (0)341 3550-119
Email: [EMAIL PROTECTED]
