Hi guys,

I installed Linux heartbeat on one machine. The problem is that a few
seconds after we start heartbeat, the machine reboots. I can see that the
configuration file cib.xml is not valid, but is it correct behavior for a
machine with an invalid cib.xml to be reset? By the way, STONITH is
disabled, and the log is below. If this behavior is intended, can I
disable the reset? This is a server machine, and the reboot process is
painfully slow.

Thanks.
Bin

Dec 27 23:40:57 ucs22 lrmd: [4993]: WARN: Initializing connection to logging
daemon failed. Logging daemon may not be running
Dec 27 23:40:57 ucs22 lrmd: [4993]: info: G_main_add_SignalHandler: Added
signal handler for signal 15
Dec 27 23:40:57 ucs22 lrmd: [4993]: info: G_main_add_SignalHandler: Added
signal handler for signal 17
Dec 27 23:40:57 ucs22 lrmd: [4993]: info: enabling coredumps
Dec 27 23:40:57 ucs22 lrmd: [4993]: info: G_main_add_SignalHandler: Added
signal handler for signal 10
Dec 27 23:40:57 ucs22 lrmd: [4993]: info: G_main_add_SignalHandler: Added
signal handler for signal 12
Dec 27 23:40:57 ucs22 lrmd: [4993]: info: Started.
Dec 27 23:40:57 ucs22 stonithd: [4994]: WARN: Initializing connection to
logging daemon failed. Logging daemon may not be running
Dec 27 23:40:57 ucs22 stonithd: [4994]: info: G_main_add_SignalHandler:
Added signal handler for signal 10
Dec 27 23:40:57 ucs22 stonithd: [4994]: info: G_main_add_SignalHandler:
Added signal handler for signal 12
Dec 27 23:40:57 ucs22 attrd: [4995]: info: crm_log_init: Changed active
directory to /var/lib/heartbeat/cores/hacluster
Dec 27 23:40:57 ucs22 attrd: [4995]: WARN: Initializing connection to
logging daemon failed. Logging daemon may not be running
Dec 27 23:40:57 ucs22 attrd: [4995]: info: Invoked:
/usr/lib64/heartbeat/attrd
Dec 27 23:40:57 ucs22 cib: [4992]: info: crm_log_init: Changed active
directory to /var/lib/heartbeat/cores/hacluster
Dec 27 23:40:57 ucs22 crmd: [4996]: info: crm_log_init: Changed active
directory to /var/lib/heartbeat/cores/hacluster
Dec 27 23:40:57 ucs22 ccm: [4991]: WARN: Initializing connection to logging
daemon failed. Logging daemon may not be running
Dec 27 23:40:57 ucs22 stonithd: [4994]: info: register_heartbeat_conn:
Hostname: ucs22
Dec 27 23:40:57 ucs22 attrd: [4995]: info: main: Starting up
Dec 27 23:40:57 ucs22 cib: [4992]: WARN: Initializing connection to logging
daemon failed. Logging daemon may not be running
Dec 27 23:40:57 ucs22 crmd: [4996]: WARN: Initializing connection to logging
daemon failed. Logging daemon may not be running
Dec 27 23:40:57 ucs22 ccm: [4991]: info: Hostname: ucs22
Dec 27 23:40:57 ucs22 stonithd: [4994]: info: register_heartbeat_conn: UUID:
b8fc4074-c40e-48e4-80ad-a9b63fd4bf77
Dec 27 23:40:57 ucs22 cib: [4992]: info: Invoked: /usr/lib64/heartbeat/cib
Dec 27 23:40:57 ucs22 crmd: [4996]: info: Invoked: /usr/lib64/heartbeat/crmd

Dec 27 23:40:57 ucs22 stonithd: [4994]: info: crm_cluster_connect:
Connecting to Heartbeat
Dec 27 23:40:57 ucs22 cib: [4992]: info: G_main_add_TriggerHandler: Added
signal manual handler
Dec 27 23:40:57 ucs22 crmd: [4996]: info: main: CRM Hg Version:
da7075976b5ff0bee71074385f8fd02f296ec8a3
Dec 27 23:40:57 ucs22 cib: [4992]: info: G_main_add_SignalHandler: Added
signal handler for signal 17
Dec 27 23:40:57 ucs22 crmd: [4996]: info: crmd_init: Starting crmd
Dec 27 23:40:57 ucs22 cib: [4992]: ERROR: crm_is_writable:
/var/lib/heartbeat/crm/cib.xml must be owned and r/w by user hacluster
Dec 27 23:40:57 ucs22 crmd: [4996]: info: G_main_add_SignalHandler: Added
signal handler for signal 17
Dec 27 23:40:57 ucs22 cib: [4992]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
/var/lib/heartbeat/crm/cib.xml.sig)
Dec 27 23:40:57 ucs22 cib: [4992]: WARN: validate_cib_digest: No on-disk
digest present
Dec 27 23:40:57 ucs22 cib: [4992]: ERROR: Expecting an element nodes, got
nothing
Dec 27 23:40:57 ucs22 cib: [4992]: ERROR: Invalid sequence in interleave
Dec 27 23:40:57 ucs22 cib: [4992]: ERROR: Element configuration failed to
validate content
Dec 27 23:40:57 ucs22 cib: [4992]: ERROR: Element cib failed to validate
content
Dec 27 23:40:57 ucs22 cib: [4992]: ERROR: readCibXmlFile: CIB does not
validate with pacemaker-1.0
Dec 27 23:40:57 ucs22 cib: [4992]: info: startCib: CIB Initialization
completed successfully
Dec 27 23:40:57 ucs22 attrd: [4995]: info: register_heartbeat_conn:
Hostname: ucs22
Dec 27 23:40:57 ucs22 attrd: [4995]: info: register_heartbeat_conn: UUID:
b8fc4074-c40e-48e4-80ad-a9b63fd4bf77
Dec 27 23:40:57 ucs22 attrd: [4995]: info: crm_cluster_connect: Connecting
to Heartbeat
Dec 27 23:40:57 ucs22 attrd: [4995]: info: main: Cluster connection active
Dec 27 23:40:57 ucs22 attrd: [4995]: info: main: Accepting attribute updates
Dec 27 23:40:57 ucs22 attrd: [4995]: info: main: Starting mainloop...
Dec 27 23:40:57 ucs22 stonithd: [4994]: notice:
/usr/lib64/heartbeat/stonithd start up successfully.
Dec 27 23:40:57 ucs22 stonithd: [4994]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
Dec 27 23:40:57 ucs22 cib: [4992]: info: register_heartbeat_conn: Hostname:
ucs22
Dec 27 23:40:57 ucs22 cib: [4992]: info: register_heartbeat_conn: UUID:
b8fc4074-c40e-48e4-80ad-a9b63fd4bf77
Dec 27 23:40:57 ucs22 cib: [4992]: info: crm_cluster_connect: Connecting to
Heartbeat
Dec 27 23:40:57 ucs22 cib: [4992]: info: ccm_connect: Registering with
CCM...
Dec 27 23:40:57 ucs22 cib: [4992]: WARN: ccm_connect: CCM Activation failed
Dec 27 23:40:57 ucs22 cib: [4992]: WARN: ccm_connect: CCM Connection failed
1 times (30 max)
Dec 27 23:40:58 ucs22 crmd: [4996]: info: do_cib_control: Could not connect
to the CIB service: connection failed
Dec 27 23:40:58 ucs22 crmd: [4996]: WARN: do_cib_control: Couldn't complete
CIB registration 1 times... pause and retry
Dec 27 23:40:58 ucs22 crmd: [4996]: info: crmd_init: Starting crmd's
mainloop
Dec 27 23:40:58 ucs22 ccm: [4991]: info: G_main_add_SignalHandler: Added
signal handler for signal 15
Dec 27 23:41:00 ucs22 crmd: [4996]: info: crm_timer_popped: Wait Timer
(I_NULL) just popped!
Dec 27 23:41:00 ucs22 cib: [4992]: info: ccm_connect: Registering with
CCM...
Dec 27 23:41:00 ucs22 cib: [4992]: info: cib_init: Requesting the list of
configured nodes
Dec 27 23:41:00 ucs22 cib: [4992]: info: cib_init: Starting cib mainloop
Dec 27 23:41:00 ucs22 cib: [4992]: info: cib_client_status_callback: Status
update: Client ucs22/cib now has status [join]
Dec 27 23:41:00 ucs22 cib: [4992]: info: crm_new_peer: Node 0 is now known
as ucs22
Dec 27 23:41:00 ucs22 cib: [4992]: info: crm_update_peer_proc: ucs22.cib is
now online
Dec 27 23:41:01 ucs22 cib: [4992]: info: cib_client_status_callback: Status
update: Client ucs22/cib now has status [online]
Dec 27 23:41:01 ucs22 crmd: [4996]: info: do_cib_control: CIB connection
established
Dec 27 23:41:01 ucs22 cib: [4992]: ERROR: cib_process_request: Operation
ignored, cluster configuration is invalid. Please repair and restart: Update
does not conform to the configured schema/DTD
Dec 27 23:41:01 ucs22 cib: [4992]: info: cib_client_status_callback: Status
update: Client ucs26/cib now has status [online]
Dec 27 23:41:01 ucs22 cib: [4992]: info: crm_new_peer: Node 0 is now known
as ucs26
Dec 27 23:41:01 ucs22 cib: [4992]: info: crm_update_peer_proc: ucs26.cib is
now online
Dec 27 23:41:01 ucs22 crmd: [4996]: info: register_heartbeat_conn: Hostname:
ucs22
Dec 27 23:41:01 ucs22 crmd: [4996]: info: register_heartbeat_conn: UUID:
b8fc4074-c40e-48e4-80ad-a9b63fd4bf77
Dec 27 23:41:01 ucs22 crmd: [4996]: info: crm_cluster_connect: Connecting to
Heartbeat
Dec 27 23:41:02 ucs22 crmd: [4996]: info: do_ha_control: Connected to the
cluster
Dec 27 23:41:02 ucs22 crmd: [4996]: info: do_ccm_control: CCM connection
established... waiting for first callback
Dec 27 23:41:02 ucs22 crmd: [4996]: info: do_started: Delaying start, CCM
(0000000000100000) not connected
Dec 27 23:41:02 ucs22 cib: [4992]: ERROR: cib_process_request: Operation
ignored, cluster configuration is invalid. Please repair and restart: Update
does not conform to the configured schema/DTD
Dec 27 23:41:02 ucs22 crmd: [4996]: ERROR: config_query_callback: Local CIB
query resulted in an error: Update does not conform to the configured
schema/DTD
Dec 27 23:41:02 ucs22 crmd: [4996]: ERROR: config_query_callback: The
cluster is mis-configured - shutting down and staying down
Dec 27 23:41:02 ucs22 crmd: [4996]: notice: crmd_client_status_callback:
Status update: Client ucs22/crmd now has status [online] (DC=false)
Dec 27 23:41:02 ucs22 attrd: [4995]: info: cib_connect: Connected to the CIB
after 1 signon attempts
Dec 27 23:41:02 ucs22 attrd: [4995]: info: cib_connect: Sending full refresh
Dec 27 23:41:02 ucs22 crmd: [4996]: info: crm_new_peer: Node 0 is now known
as ucs22
Dec 27 23:41:02 ucs22 crmd: [4996]: info: crm_update_peer_proc: ucs22.crmd
is now online
Dec 27 23:41:02 ucs22 crmd: [4996]: info: crmd_client_status_callback: Not
the DC
Dec 27 23:41:02 ucs22 crmd: [4996]: notice: crmd_client_status_callback:
Status update: Client ucs22/crmd now has status [online] (DC=false)
Dec 27 23:41:03 ucs22 crmd: [4996]: info: crmd_client_status_callback: Not
the DC
Dec 27 23:41:03 ucs22 crmd: [4996]: notice: crmd_client_status_callback:
Status update: Client ucs26/crmd now has status [online] (DC=false)
Dec 27 23:41:03 ucs22 cib: [4992]: WARN: cib_peer_callback: Discarding
cib_apply_diff message (808) from ucs26: not in our membership
Dec 27 23:41:03 ucs22 crmd: [4996]: info: crm_new_peer: Node 0 is now known
as ucs26
Dec 27 23:41:03 ucs22 crmd: [4996]: info: crm_update_peer_proc: ucs26.crmd
is now online
Dec 27 23:41:03 ucs22 crmd: [4996]: info: crmd_client_status_callback: Not
the DC
Dec 27 23:41:03 ucs22 crmd: [4996]: ERROR: do_log: FSA: Input I_ERROR from
config_query_callback() received in state S_STARTING
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_state_transition: State
transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL
origin=config_query_callback ]
Dec 27 23:41:03 ucs22 crmd: [4996]: ERROR: do_recover: Action A_RECOVER
(0000000001000000) not supported
Dec 27 23:41:03 ucs22 crmd: [4996]: ERROR: do_log: FSA: Input I_ERROR from
revision_check_callback() received in state S_RECOVERY
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_dc_release: DC role released
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_te_control: Transitioner is now
inactive
Dec 27 23:41:03 ucs22 crmd: [4996]: ERROR: do_started: Start cancelled...
S_RECOVERY
Dec 27 23:41:03 ucs22 crmd: [4996]: ERROR: do_log: FSA: Input I_TERMINATE
from do_recover() received in state S_RECOVERY
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_state_transition: State
transition S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE
cause=C_FSA_INTERNAL origin=do_recover ]
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_shutdown: All subsystems
stopped, continuing
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_lrm_control: Disconnected from
the LRM
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_ha_control: Disconnected from
Heartbeat
Dec 27 23:41:03 ucs22 ccm: [4991]: info: client (pid=4996) removed from ccm
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_cib_control: Disconnecting CIB
Dec 27 23:41:03 ucs22 crmd: [4996]: info: crmd_cib_connection_destroy:
Connection to the CIB terminated...
Dec 27 23:41:03 ucs22 cib: [4992]: ERROR: cib_process_request: Operation
ignored, cluster configuration is invalid. Please repair and restart: Update
does not conform to the configured schema/DTD
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_exit: Performing A_EXIT_0 -
gracefully exiting the CRMd
Dec 27 23:41:03 ucs22 cib: [4992]: WARN: send_ipc_message: IPC Channel to
4996 is not connected
Dec 27 23:41:03 ucs22 crmd: [4996]: ERROR: do_exit: Could not recover from
internal error
Dec 27 23:41:03 ucs22 cib: [4992]: WARN: send_via_callback_channel: Delivery
of reply to client 4996/78d426cb-9410-4d60-99fd-42fa190683c7 failed
Dec 27 23:41:03 ucs22 crmd: [4996]: WARN: do_exit: Inhibiting respawn by
Heartbeat
Dec 27 23:41:03 ucs22 cib: [4992]: WARN: do_local_notify: A-Sync reply to
crmd failed: reply failed
Dec 27 23:41:03 ucs22 crmd: [4996]: info: free_mem: Dropping
I_RELEASE_SUCCESS: [ state=S_TERMINATE cause=C_FSA_INTERNAL
origin=do_dc_release ]
Dec 27 23:41:03 ucs22 crmd: [4996]: info: free_mem: Dropping I_TERMINATE: [
state=S_TERMINATE cause=C_FSA_INTERNAL origin=do_stop ]
Dec 27 23:41:03 ucs22 crmd: [4996]: info: do_exit: [crmd] stopped (100)
Dec 27 23:41:04 ucs22 kernel: device eth2 entered promiscuous mode
Dec 27 23:41:05 ucs22 kernel: md: stopping all md devices.
Dec 27 23:41:06 ucs22 kernel: Synchronizing SCSI cache for disk sdb:
Dec 27 23:41:06 ucs22 kernel: Synchronizing SCSI cache for disk sda:
Dec 27 23:41:06 ucs22 kernel: ACPI: PCI interrupt for device 0000:12:00.1
disabled
Dec 27 23:41:06 ucs22 kernel: ACPI: PCI interrupt for device 0000:12:00.0
disabled
Dec 27 23:41:06 ucs22 kernel: ACPI: PCI interrupt for device 0000:08:00.1
disabled
Dec 27 23:41:06 ucs22 kernel: ACPI: PCI interrupt for device 0000:08:00.0
disabled
Dec 27 23:41:06 ucs22 kernel: usb 8-1: new full speed USB device using
uhci_hcd and address 2
Dec 27 23:41:06 ucs22 kernel: ACPI: PCI interrupt for device 0000:05:00.1
disabled
Dec 27 23:41:06 ucs22 kernel: usb 8-1: not running at top speed; connect to
a high speed hub
Dec 27 23:41:06 ucs22 kernel: usb 8-1: configuration #1 chosen from 1 choice
Dec 27 23:41:06 ucs22 kernel: hub 8-1:1.0: USB hub found
Dec 27 23:41:06 ucs22 kernel: hub 8-1:1.0: 4 ports detected
Dec 27 23:41:06 ucs22 kernel: ACPI: PCI interrupt for device 0000:05:00.0
disabled
Dec 27 23:47:35 ucs22 syslogd 1.4.1: restart.
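
Since the mail client wrapped the log lines above, the ERROR entries are
hard to pick out by eye. The following is only a minimal sketch for
re-joining the wrapped lines and grouping the ERROR messages by daemon.
It is not part of heartbeat or pacemaker; the function names
join_wrapped and errors_by_daemon are made up for illustration, and the
regular expressions assume the exact "Mon DD HH:MM:SS host daemon:
[pid]: LEVEL: message" layout seen in this log.

```python
import re

# Assumed start of a syslog entry, e.g. "Dec 27 23:40:57 ucs22 ".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \S+ ")

# Assumed layout of a heartbeat/pacemaker daemon entry:
# "Dec 27 23:40:57 ucs22 cib: [4992]: ERROR: message"
ENTRY = re.compile(
    r"^(?P<time>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<daemon>[^:\s]+): \[(?P<pid>\d+)\]: (?P<level>[A-Za-z]+): (?P<msg>.*)$"
)


def join_wrapped(text):
    """Re-join entries that were wrapped onto continuation lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from wrapping
        if ENTRY_START.match(line) or not entries:
            entries.append(line)        # a new timestamped entry
        else:
            entries[-1] += " " + line   # continuation of the previous entry
    return entries


def errors_by_daemon(text):
    """Map each daemon name to its ERROR messages, in log order."""
    result = {}
    for entry in join_wrapped(text):
        m = ENTRY.match(entry)
        # Kernel and syslogd lines carry no "[pid]: LEVEL:" part and are skipped.
        if m and m.group("level") == "ERROR":
            result.setdefault(m.group("daemon"), []).append(m.group("msg"))
    return result
```

Run over the log above, this should make the first cib error (the
ownership complaint about /var/lib/heartbeat/crm/cib.xml) and the
schema validation errors easier to see ahead of the later crmd
shutdown messages.
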
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
