Alexander Födisch wrote:
Hi all,

I'm new to this mailing list and I have a problem with my three-node cluster.

When I start heartbeat on one of the three nodes, the system always reboots. In the log I found the error "register_with_ha: get_uuid_by_name() failed", but I cannot remember making any changes. I also restored all files in /var/lib/heartbeat/crm/ from backup, without success... :(

http://wiki.linux-ha.org/GettingStartedV2?action=show&redirect=ClusterResourceManager%2FSetup#head-1b042c1eff7e093647f638b07618e4112a9405d9

Regards
Dominik


~# tail /var/log/ha-debug
[...]
lrmd[19753]: 2008/04/29_10:59:59 WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
lrmd[19753]: 2008/04/29_10:59:59 WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
lrmd[19753]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 10
cib[19752]: 2008/04/29_10:59:59 info: log_data_element: readCibXmlFile: [on-disk] <status/>
lrmd[19753]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 12
cib[19752]: 2008/04/29_10:59:59 info: log_data_element: readCibXmlFile: [on-disk] </cib>
lrmd[19753]: 2008/04/29_10:59:59 info: Started.
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Core dumps could be lost if multiple dumps occur.
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
mgmtd[19757]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 10
mgmtd[19757]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 12
attrd[19755]: 2008/04/29_10:59:59 WARN: get_uuid: Could not calculate UUID for zentgpfsn01

-> I think this is the reason:
attrd[19755]: 2008/04/29_10:59:59 ERROR: register_with_ha: get_uuid_by_name() failed
attrd[19755]: 2008/04/29_10:59:59 ERROR: main: HA Signon failed
attrd[19755]: 2008/04/29_10:59:59 ERROR: main: Aborting startup

heartbeat[19737]: 2008/04/29_10:59:59 WARN: Managed /usr/lib64/heartbeat/attrd process 19755 exited with return code 100.
mgmtd[19757]: 2008/04/29_10:59:59 info: init_crm
mgmtd[19757]: 2008/04/29_10:59:59 info: login to cib: 0, ret:-10
cib[19752]: 2008/04/29_10:59:59 notice: readCibXmlFile: Enabling DTD validation on the existing (sane) configuration
cib[19752]: 2008/04/29_10:59:59 info: startCib: CIB Initialization completed successfully
cib[19752]: 2008/04/29_10:59:59 info: cib_register_ha: Signing in with Heartbeat
ccm[19751]: 2008/04/29_10:59:59 ERROR: llm_add: adding same node(zentgpfsn01) twice(?)
ccm[19751]: 2008/04/29_10:59:59 ERROR: set_llm_from_heartbeat: adding node zentgpfsn01 to llm failed
ccm[19751]: 2008/04/29_10:59:59 ERROR: Initialization failed. Exit
heartbeat[19737]: 2008/04/29_10:59:59 WARN: Managed /usr/lib64/heartbeat/ccm process 19751 exited with return code 1.
cib[19752]: 2008/04/29_10:59:59 info: cib_register_ha: FSA Hostname: zentgpfsn01

-> and game over :)
heartbeat[19737]: 2008/04/29_10:59:59 EMERG: Rebooting system. Reason: /usr/lib64/heartbeat/ccm
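
For anyone who wants to pull only the failing daemons out of a longer ha-debug file, a small filter along these lines may help. It is only a rough sketch: it assumes every entry has the shape seen in the excerpt above (daemon[pid]: date_time LEVEL: message), and the function names parse_entries and problems_by_daemon are made up for this example, not part of heartbeat. It also copes with several entries running together on one line.

```python
import re
from collections import defaultdict

# Assumed entry shape, taken from the excerpt above:
#   daemon[pid]: YYYY/MM/DD_HH:MM:SS LEVEL: message
ENTRY = re.compile(
    r"(?P<daemon>\w+)\[(?P<pid>\d+)\]: "
    r"(?P<stamp>\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Za-z]+): "
)


def parse_entries(text):
    """Split raw log text into entries, even if several share one line."""
    matches = list(ENTRY.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The message runs up to the start of the next entry header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append({
            "daemon": m["daemon"],
            "pid": int(m["pid"]),
            "stamp": m["stamp"],
            "level": m["level"],
            "message": text[m.end():end].strip(),
        })
    return entries


def problems_by_daemon(text, levels=("ERROR", "EMERG")):
    """Group the messages of the given severity levels by daemon name."""
    grouped = defaultdict(list)
    for entry in parse_entries(text):
        if entry["level"] in levels:
            grouped[entry["daemon"]].append(entry["message"])
    return dict(grouped)
```

Calling problems_by_daemon(open("/var/log/ha-debug").read()) would then return a dict keyed by daemon name (attrd, ccm, heartbeat, ...) with the ERROR and EMERG messages for each one.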

Does anybody have an idea?

Thanks!
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
