Hi all,

I'm new to this mailing list and I have a problem with my three-node cluster.

When I start heartbeat on one of the three nodes, that node always reboots. In the log I found the error "register_with_ha: get_uuid_by_name() failed". I don't remember making any changes. I also restored all files in /var/lib/heartbeat/crm/ from backup, without success... :(


~# tail /var/log/ha-debug
[...]
lrmd[19753]: 2008/04/29_10:59:59 WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
lrmd[19753]: 2008/04/29_10:59:59 WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
lrmd[19753]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 10
cib[19752]: 2008/04/29_10:59:59 info: log_data_element: readCibXmlFile: [on-disk]   <status/>
lrmd[19753]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 12
cib[19752]: 2008/04/29_10:59:59 info: log_data_element: readCibXmlFile: [on-disk] </cib>
lrmd[19753]: 2008/04/29_10:59:59 info: Started.
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Core dumps could be lost if multiple dumps occur.
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
mgmtd[19757]: 2008/04/29_10:59:59 WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
mgmtd[19757]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 10
mgmtd[19757]: 2008/04/29_10:59:59 info: G_main_add_SignalHandler: Added signal handler for signal 12
attrd[19755]: 2008/04/29_10:59:59 WARN: get_uuid: Could not calculate UUID for zentgpfsn01

-> I think this is the cause:
attrd[19755]: 2008/04/29_10:59:59 ERROR: register_with_ha: get_uuid_by_name() failed
attrd[19755]: 2008/04/29_10:59:59 ERROR: main: HA Signon failed
attrd[19755]: 2008/04/29_10:59:59 ERROR: main: Aborting startup

heartbeat[19737]: 2008/04/29_10:59:59 WARN: Managed /usr/lib64/heartbeat/attrd process 19755 exited with return code 100.
mgmtd[19757]: 2008/04/29_10:59:59 info: init_crm
mgmtd[19757]: 2008/04/29_10:59:59 info: login to cib: 0, ret:-10
cib[19752]: 2008/04/29_10:59:59 notice: readCibXmlFile: Enabling DTD validation on the existing (sane) configuration
cib[19752]: 2008/04/29_10:59:59 info: startCib: CIB Initialization completed successfully
cib[19752]: 2008/04/29_10:59:59 info: cib_register_ha: Signing in with Heartbeat
ccm[19751]: 2008/04/29_10:59:59 ERROR: llm_add: adding same node(zentgpfsn01) twice(?)
ccm[19751]: 2008/04/29_10:59:59 ERROR: set_llm_from_heartbeat: adding node zentgpfsn01 to llm failed
ccm[19751]: 2008/04/29_10:59:59 ERROR: Initialization failed. Exit
heartbeat[19737]: 2008/04/29_10:59:59 WARN: Managed /usr/lib64/heartbeat/ccm process 19751 exited with return code 1.
cib[19752]: 2008/04/29_10:59:59 info: cib_register_ha: FSA Hostname: zentgpfsn01

-> and game over :)
heartbeat[19737]: 2008/04/29_10:59:59 EMERG: Rebooting system.  Reason: /usr/lib64/heartbeat/ccm
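
For anyone who wants to skim a longer ha-debug for only the fatal entries, here is a minimal Python sketch. It assumes each entry sits on one line in the layout seen in this excerpt (process[pid]: timestamp LEVEL: message). The function name severe_entries and the regular expression are only an illustration, not part of heartbeat, and the sample lines are copied from the log above.

```python
import re

# Assumed layout of one log entry: "proc[pid]: timestamp LEVEL: message".
# This pattern is a guess based on the excerpt, not an official format.
LINE_RE = re.compile(
    r"^(?P<proc>[\w-]+)\[(?P<pid>\d+)\]: "
    r"(?P<stamp>\S+) (?P<level>[A-Za-z]+): (?P<msg>.*)$"
)


def severe_entries(lines, levels=("ERROR", "EMERG")):
    """Return (proc, level, message) for each entry at one of the given levels."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        # Lines that do not match the assumed layout are skipped silently.
        if m and m.group("level") in levels:
            found.append((m.group("proc"), m.group("level"), m.group("msg")))
    return found


# Sample entries taken from the excerpt above.
sample = [
    "lrmd[19753]: 2008/04/29_10:59:59 info: Started.",
    "attrd[19755]: 2008/04/29_10:59:59 ERROR: main: HA Signon failed",
    "ccm[19751]: 2008/04/29_10:59:59 ERROR: Initialization failed. Exit",
    "heartbeat[19737]: 2008/04/29_10:59:59 EMERG: Rebooting system.  "
    "Reason: /usr/lib64/heartbeat/ccm",
]

for proc, level, msg in severe_entries(sample):
    print(f"{proc:10} {level:5} {msg}")
```

To run it against a real file, pass an open file object, e.g. severe_entries(open("/var/log/ha-debug")), since the function only needs something that yields lines.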

Does anybody have an idea?

Thanks!

--
Mit besten Grüßen / Best Regards

Alexander Födisch

Max Planck Institute for Evolutionary Anthropology
-Central IT Department-
Deutscher Platz 6
D-04103 Leipzig

Phone:  +49 (0)341 3550-168
        +49 (0)341 3550-154
Fax:    +49 (0)341 3550-119
Email:  [EMAIL PROTECTED]


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems