I sent in a mail the other day and got what was happening completely
wrong. After more testing last night appear to have found the problem.
With a primary/secondary + secondary/primary drbd setup and nfs + IP
for each primary running on two nodes I've run into the following
problem.
If one node gets rebooted the failover works fine. If I just shut down
heartbeat it fails to unmount the filesystem and so drbd can't be
demoted.
I've attached the ha-debug log for the period where the failure
occurred. I've included the full section in case someone can spot
something related to the error I'm getting with the Filesystem.
Since NFS seems to be shutting down successfully I'm not sure how the
partition can be busy. The only thing I can think of is that the other
admin has got atop and zabbix monitoring, any chance those could be
keeping the drive from being unmounted?
I was also wondering about this section of the error " 789: ==:
unexpected operator". Does that refer to a specific line? And in which
script? I'd like to figure out what unexpected operator was received.
Thanks for any help anyone can provide.
Guy
Specific error:
lrmd[10472]: 2008/05/12_20:57:40 info: RA output:
(Filesystem_2:stop:stderr) [: 789: ==: unexpected operator
Filesystem[10359]: 2008/05/12_20:57:40 INFO: Trying to unmount
/virtual/store1
Filesystem[10359]: 2008/05/12_20:57:41 ERROR: Couldn't unmount
/virtual/store1; trying cleanup with SIGTERM
lrmd[10472]: 2008/05/12_20:57:41 info: RA output:
(Filesystem_2:stop:stderr) umount: /virtual/store1: device is busy
umount: /virtual/store1: device is busy
Full ha-debug log section:
heartbeat[10449]: 2008/05/12_20:57:36 debug: Process 10449 processing SIGTERM
heartbeat[10449]: 2008/05/12_20:57:36 debug: hb_initiate_shutdown() called.
heartbeat[10449]: 2008/05/12_20:57:36 debug: hb_initiate_shutdown():
calling hb_mcp_final_shutdown()
heartbeat[10449]: 2008/05/12_20:57:36 debug: hb_mcp_final_shutdown() phase 0
heartbeat[10449]: 2008/05/12_20:57:36 debug: Shutting down client
/usr/lib/heartbeat/mgmtd -v
heartbeat[10449]: 2008/05/12_20:57:36 info: killing
/usr/lib/heartbeat/mgmtd -v process group 10476 with signal 15
mgmtd[10476]: 2008/05/12_20:57:36 info: mgmtd is shutting down
mgmtd[10476]: 2008/05/12_20:57:36 debug: [mgmtd] stopped
heartbeat[10449]: 2008/05/12_20:57:36 debug:
G_remove_client(pid=10476, reason='died' gsource=0x717b28) {
heartbeat[10449]: 2008/05/12_20:57:36 debug: api_remove_client_int:
removing pid [10476] reason: died
heartbeat[10449]: 2008/05/12_20:57:36 debug: api_send_client: client 10476 died
heartbeat[10449]: 2008/05/12_20:57:36 debug: }/*G_remove_client;*/
heartbeat[10449]: 2008/05/12_20:57:36 debug: Shutting down client
/usr/lib/heartbeat/crmd
heartbeat[10449]: 2008/05/12_20:57:36 info: killing
/usr/lib/heartbeat/crmd process group 10475 with signal 15
crmd[10475]: 2008/05/12_20:57:36 info: crm_shutdown: Requesting shutdown
crmd[10475]: 2008/05/12_20:57:36 debug: register_fsa_input_adv:
crm_shutdown appended FSA input 10 (I_SHUTDOWN) (cause=C_SHUTDOWN)
without data
crmd[10475]: 2008/05/12_20:57:36 debug: crm_timer_start: Started
Shutdown Escalation (I_STOP:1200000ms), src=11
crmd[10475]: 2008/05/12_20:57:36 debug: s_crmd_fsa: Processing
I_SHUTDOWN: [ state=S_NOT_DC cause=C_SHUTDOWN origin=crm_shutdown ]
crmd[10475]: 2008/05/12_20:57:36 debug: do_fsa_action: actions:trace:
// A_SHUTDOWN_REQ
crmd[10475]: 2008/05/12_20:57:36 info: do_shutdown_req: Sending
shutdown request to DC: polar.aluminati.org
lrmd[10472]: 2008/05/12_20:57:36 debug: on_receive_cmd: the IPC to
client [pid:10476] disconnected.
lrmd[10472]: 2008/05/12_20:57:36 debug: unregister_client: client
mgmtd [pid:10476] is unregistered
crmd[10475]: 2008/05/12_20:57:37 debug: cancel_op: Cancelling op 63
for IPaddr_10_0_7_183 (IPaddr_10_0_7_183:63)
lrmd[10472]: 2008/05/12_20:57:37 debug: cancel_op: operation
monitor[63] on ocf::IPaddr::IPaddr_10_0_7_183 for client 10475, its
parameters: CRM_meta_interval=[5000] ip=[192.168.0.3]
CRM_meta_op_target_rc=[7] cidr_netmask=[24]
CRM_meta_id=[IPaddr_10_0_7_183_mon] CRM_meta_timeout=[5000]
crm_feature_set=[1.0.9] CRM_meta_name=[monitor] nic=[eth0] cancelled
crmd[10475]: 2008/05/12_20:57:37 info: do_lrm_rsc_op: Performing
op=IPaddr_10_0_7_183_stop_0
key=88:13:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:37 debug: on_msg_perform_op: add an
operation operation stop[64] on ocf::IPaddr::IPaddr_10_0_7_183 for
client 10475, its parameters: CRM_meta_timeout=[15000]
crm_feature_set=[1.0.9] to the operation list.
crmd[10475]: 2008/05/12_20:57:37 debug: do_lrm_rsc_op: Recording
pending op: 64 - IPaddr_10_0_7_183_stop_0 IPaddr_10_0_7_183:64
lrmd[10472]: 2008/05/12_20:57:37 info: RA output:
(IPaddr_10_0_7_183:stop:stdout) In IP Stop
lrmd[10472]: 2008/05/12_20:57:38 info: RA output:
(IPaddr_10_0_7_183:stop:stderr) SIOCDELRT: No such process
IPaddr[10162]: 2008/05/12_20:57:38 INFO: ifconfig eth0:0 down
crmd[10475]: 2008/05/12_20:57:38 debug: cancel_op: Cancelling op 23
for pingd-child:1 (pingd-child:1:23)
lrmd[10472]: 2008/05/12_20:57:38 info: Exiting IPaddr_10_0_7_183:stop
process 10162 returned rc 0.
lrmd[10472]: 2008/05/12_20:57:38 debug: cancel_op: operation
monitor[23] on ocf::pingd::pingd-child:1 for client 10475, its
parameters: multiplier=[100] CRM_meta_interval=[20000]
CRM_meta_prereq=[nothing] CRM_meta_op_target_rc=[7] dampen=[5s]
host=[10.0.7.180] CRM_meta_id=[pingd-child-monitor] name=[pingd]
CRM_meta_timeout=[60000] crm_feature_set=[1.0.9]
CRM_meta_clone_max=[2] CRM_meta_name=[monitor]
CRM_meta_globally_unique=[false] CRM_meta_clone=[1] cancelled
crmd[10475]: 2008/05/12_20:57:38 info: do_lrm_rsc_op: Performing
op=pingd-child:1_stop_0
key=16:13:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:38 debug: on_msg_perform_op: add an
operation operation stop[65] on ocf::pingd::pingd-child:1 for client
10475, its parameters: CRM_meta_timeout=[15000]
crm_feature_set=[1.0.9] CRM_meta_clone_max=[2]
CRM_meta_globally_unique=[false] CRM_meta_clone=[1] to the operation
list.
heartbeat[10449]: 2008/05/12_20:57:38 debug: EOF from client pid 10937
heartbeat[10449]: 2008/05/12_20:57:38 debug: Client pid 10937 died (input)
heartbeat[10449]: 2008/05/12_20:57:38 debug:
G_remove_client(pid=10937, reason='died' gsource=0x74d1c8) {
heartbeat[10449]: 2008/05/12_20:57:38 debug: api_remove_client_int:
removing pid [10937] reason: died
heartbeat[10449]: 2008/05/12_20:57:38 debug: api_send_client: client 10937 died
heartbeat[10449]: 2008/05/12_20:57:38 debug: }/*G_remove_client;*/
crmd[10475]: 2008/05/12_20:57:38 debug: do_lrm_rsc_op: Recording
pending op: 65 - pingd-child:1_stop_0 pingd-child:1:65
lrmd[10472]: 2008/05/12_20:57:38 info: Exiting pingd-child:1:stop
process 10183 returned rc 0.
crmd[10475]: 2008/05/12_20:57:38 info: do_lrm_rsc_op: Performing
op=drbd0:1_notify_0 key=124:13:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:38 debug: on_msg_perform_op: add an
operation operation notify[66] on ocf::drbd::drbd0:1 for client 10475,
its parameters: CRM_meta_notify_stop_resource=[drbd0:1 ]
CRM_meta_notify_operation=[demote]
CRM_meta_notify_stop_uname=[kodiak.aluminati.org ]
CRM_meta_notify_promote_resource=[drbd0:0 ] drbd_resource=[r0]
CRM_meta_notify_demote_uname=[kodiak.aluminati.org ]
CRM_meta_master_max=[1] CRM_meta_timeout=[15000]
CRM_meta_stateful=[true] crm_feature_set=[1.0.9]
CRM_meta_notify_demote_resource=[drbd0:1 ] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1 to the operation list.
drbd[10190]: 2008/05/12_20:57:38 DEBUG: r0 notify: pre for demote -
counts: active 0 - starting 0 - stopping 1
crmd[10475]: 2008/05/12_20:57:38 debug: do_lrm_rsc_op: Recording
pending op: 66 - drbd0:1_notify_0 drbd0:1:66
lrmd[10472]: 2008/05/12_20:57:38 info: Exiting drbd0:1:notify process
10190 returned rc 0.
crmd[10475]: 2008/05/12_20:57:38 info: do_lrm_rsc_op: Performing
op=drbd1:1_notify_0 key=134:13:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:38 debug: on_msg_perform_op: add an
operation operation notify[67] on ocf::drbd::drbd1:1 for client 10475,
its parameters: CRM_meta_notify_stop_resource=[drbd1:1 ]
CRM_meta_notify_operation=[stop]
CRM_meta_notify_stop_uname=[kodiak.aluminati.org ] drbd_resource=[r1]
CRM_meta_notify_master_uname=[polar.aluminati.org ]
CRM_meta_notify_master_resource=[drbd1:0 ] CRM_meta_master_max=[1]
CRM_meta_timeout=[15000] CRM_meta_stateful=[true]
crm_feature_set=[1.0.9] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1] CRM_meta_notify_confirm=[yes]
CRM_meta_notify to the operation list.
drbd[10203]: 2008/05/12_20:57:38 DEBUG: r1 notify: pre for stop -
counts: active 0 - starting 0 - stopping 1
crmd[10475]: 2008/05/12_20:57:38 debug: do_lrm_rsc_op: Recording
pending op: 67 - drbd1:1_notify_0 drbd1:1:67
lrmd[10472]: 2008/05/12_20:57:38 info: Exiting drbd1:1:notify process
10203 returned rc 0.
crmd[10475]: 2008/05/12_20:57:38 info: process_lrm_event: LRM
operation IPaddr_10_0_7_183_monitor_5000 (call=63, rc=-2) Cancelled
crmd[10475]: 2008/05/12_20:57:38 debug: delete_op_entry: async:
Sending delete op for IPaddr_10_0_7_183_monitor_5000 (call=63)
crmd[10475]: 2008/05/12_20:57:38 debug: process_lrm_event: Op
IPaddr_10_0_7_183_monitor_5000 (call=63): Confirmed
crmd[10475]: 2008/05/12_20:57:38 info: process_lrm_event: LRM
operation IPaddr_10_0_7_183_stop_0 (call=64, rc=0) complete
crmd[10475]: 2008/05/12_20:57:38 debug: do_update_resource: Sent
resource state update message: 84
crmd[10475]: 2008/05/12_20:57:38 debug: process_lrm_event: Op
IPaddr_10_0_7_183_stop_0 (call=64): Confirmed
crmd[10475]: 2008/05/12_20:57:38 info: process_lrm_event: LRM
operation pingd-child:1_monitor_20000 (call=23, rc=-2) Cancelled
crmd[10475]: 2008/05/12_20:57:38 debug: delete_op_entry: async:
Sending delete op for pingd-child:1_monitor_20000 (call=23)
crmd[10475]: 2008/05/12_20:57:38 debug: process_lrm_event: Op
pingd-child:1_monitor_20000 (call=23): Confirmed
crmd[10475]: 2008/05/12_20:57:38 info: process_lrm_event: LRM
operation pingd-child:1_stop_0 (call=65, rc=0) complete
crmd[10475]: 2008/05/12_20:57:38 debug: do_update_resource: Sent
resource state update message: 86
crmd[10475]: 2008/05/12_20:57:38 debug: process_lrm_event: Op
pingd-child:1_stop_0 (call=65): Confirmed
crmd[10475]: 2008/05/12_20:57:38 info: process_lrm_event: LRM
operation drbd0:1_notify_0 (call=66, rc=0) complete
crmd[10475]: 2008/05/12_20:57:38 debug: do_update_resource: Sent
resource state update message: 87
crmd[10475]: 2008/05/12_20:57:38 debug: process_lrm_event: Op
drbd0:1_notify_0 (call=66): Confirmed
crmd[10475]: 2008/05/12_20:57:38 info: process_lrm_event: LRM
operation drbd1:1_notify_0 (call=67, rc=0) complete
crmd[10475]: 2008/05/12_20:57:38 debug: do_update_resource: Sent
resource state update message: 88
crmd[10475]: 2008/05/12_20:57:38 debug: process_lrm_event: Op
drbd1:1_notify_0 (call=67): Confirmed
crmd[10475]: 2008/05/12_20:57:39 debug: cancel_op: Cancelling op 61
for nfs-kernel-server_5 (nfs-kernel-server_5:61)
lrmd[10472]: 2008/05/12_20:57:39 debug: cancel_op: operation
monitor[61] on lsb::nfs-kernel-server::nfs-kernel-server_5 for client
10475, its parameters: CRM_meta_interval=[120000]
CRM_meta_op_target_rc=[7] CRM_meta_id=[nfs-kernel-server_5_mon]
CRM_meta_timeout=[60000] crm_feature_set=[1.0.9]
CRM_meta_name=[monitor] cancelled
crmd[10475]: 2008/05/12_20:57:39 info: do_lrm_rsc_op: Performing
op=nfs-kernel-server_5_stop_0
key=85:13:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:39 debug: on_msg_perform_op: add an
operation operation stop[68] on
lsb::nfs-kernel-server::nfs-kernel-server_5 for client 10475, its
parameters: CRM_meta_timeout=[15000] crm_feature_set=[1.0.9] to the
operation list.
lrmd[10216]: 2008/05/12_20:57:39 WARN: For LSB init script, no
additional parameters are needed.
crmd[10475]: 2008/05/12_20:57:39 debug: do_lrm_rsc_op: Recording
pending op: 68 - nfs-kernel-server_5_stop_0 nfs-kernel-server_5:68
lrmd[10472]: 2008/05/12_20:57:39 info: RA output:
(nfs-kernel-server_5:stop:stdout) * Stopping NFS kernel daemon
...done.
* Unexporting directories for NFS kernel daemon...
crmd[10475]: 2008/05/12_20:57:39 info: do_lrm_rsc_op: Performing
op=drbd1:1_stop_0 key=54:13:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:39 info: Exiting
nfs-kernel-server_5:stop process 10216 returned rc 0.
lrmd[10472]: 2008/05/12_20:57:39 info: RA output:
(nfs-kernel-server_5:stop:stdout) ...done.
lrmd[10472]: 2008/05/12_20:57:39 debug: on_msg_perform_op: add an
operation operation stop[69] on ocf::drbd::drbd1:1 for client 10475,
its parameters: CRM_meta_notify_stop_resource=[drbd1:1 ]
CRM_meta_notify_stop_uname=[kodiak.aluminati.org ]
CRM_meta_notify_master_uname=[polar.aluminati.org ]
CRM_meta_notify_master_resource=[drbd1:0 ] CRM_meta_master_max=[1]
CRM_meta_timeout=[15000] crm_feature_set=[1.0.9]
CRM_meta_clone_max=[2] CRM_meta_master_node_max=[1]
CRM_meta_globally_unique=[false] CRM_meta_clone=[1]
CRM_meta_clone_node_max=[1] to the operation list.
drbd[10226]: 2008/05/12_20:57:39 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf state r1
crmd[10475]: 2008/05/12_20:57:39 debug: do_lrm_rsc_op: Recording
pending op: 69 - drbd1:1_stop_0 drbd1:1:69
crmd[10475]: 2008/05/12_20:57:39 debug: cib_rsc_callback: Resource
update 84 complete
crmd[10475]: 2008/05/12_20:57:39 debug: cib_rsc_callback: Resource
update 86 complete
crmd[10475]: 2008/05/12_20:57:39 debug: cib_rsc_callback: Resource
update 87 complete
crmd[10475]: 2008/05/12_20:57:39 debug: cib_rsc_callback: Resource
update 88 complete
crmd[10475]: 2008/05/12_20:57:39 info: process_lrm_event: LRM
operation nfs-kernel-server_5_monitor_120000 (call=61, rc=-2)
Cancelled
crmd[10475]: 2008/05/12_20:57:39 debug: delete_op_entry: async:
Sending delete op for nfs-kernel-server_5_monitor_120000 (call=61)
crmd[10475]: 2008/05/12_20:57:39 debug: process_lrm_event: Op
nfs-kernel-server_5_monitor_120000 (call=61): Confirmed
crmd[10475]: 2008/05/12_20:57:39 info: process_lrm_event: LRM
operation nfs-kernel-server_5_stop_0 (call=68, rc=0) complete
crmd[10475]: 2008/05/12_20:57:39 debug: do_update_resource: Sent
resource state update message: 90
crmd[10475]: 2008/05/12_20:57:39 debug: process_lrm_event: Op
nfs-kernel-server_5_stop_0 (call=68): Confirmed
drbd[10226]: 2008/05/12_20:57:39 DEBUG: r1: Exit code 0
drbd[10226]: 2008/05/12_20:57:39 DEBUG: r1: Command output: Secondary/Primary
drbd[10226]: 2008/05/12_20:57:39 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf cstate r1
drbd[10226]: 2008/05/12_20:57:39 DEBUG: r1: Exit code 0
drbd[10226]: 2008/05/12_20:57:39 DEBUG: r1: Command output: Connected
drbd[10226]: 2008/05/12_20:57:39 DEBUG: r1 status:
Secondary/Primary Secondary Primary Connected
drbd[10226]: 2008/05/12_20:57:39 DEBUG: r1: Calling /usr/sbin/crm_master -D
crm_master[10278]: 2008/05/12_20:57:39 debug: crm_set_env_options:
HA_use_logd = (null)
crm_master[10278]: 2008/05/12_20:57:39 debug: crm_set_env_options:
HA_conn_logd_time = 60
crm_master[10278]: 2008/05/12_20:57:39 info: Invoked: /usr/sbin/crm_master -D
crmd[10475]: 2008/05/12_20:57:39 debug: cancel_op: Cancelling op 59
for nfs-common_4 (nfs-common_4:59)
lrmd[10472]: 2008/05/12_20:57:39 debug: cancel_op: operation
monitor[59] on lsb::nfs-common::nfs-common_4 for client 10475, its
parameters: CRM_meta_interval=[120000] CRM_meta_op_target_rc=[7]
CRM_meta_id=[nfs-common_4_mon] CRM_meta_timeout=[60000]
crm_feature_set=[1.0.9] CRM_meta_name=[monitor] cancelled
crmd[10475]: 2008/05/12_20:57:39 info: do_lrm_rsc_op: Performing
op=nfs-common_4_stop_0 key=82:13:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:39 debug: on_msg_perform_op: add an
operation operation stop[70] on lsb::nfs-common::nfs-common_4 for
client 10475, its parameters: CRM_meta_timeout=[15000]
crm_feature_set=[1.0.9] to the operation list.
lrmd[10279]: 2008/05/12_20:57:39 WARN: For LSB init script, no
additional parameters are needed.
crmd[10475]: 2008/05/12_20:57:39 debug: do_lrm_rsc_op: Recording
pending op: 70 - nfs-common_4_stop_0 nfs-common_4:70
crmd[10475]: 2008/05/12_20:57:39 debug: cib_rsc_callback: Resource
update 90 complete
crmd[10475]: 2008/05/12_20:57:39 info: process_lrm_event: LRM
operation nfs-common_4_monitor_120000 (call=59, rc=-2) Cancelled
crmd[10475]: 2008/05/12_20:57:39 debug: delete_op_entry: async:
Sending delete op for nfs-common_4_monitor_120000 (call=59)
crmd[10475]: 2008/05/12_20:57:39 debug: process_lrm_event: Op
nfs-common_4_monitor_120000 (call=59): Confirmed
lrmd[10472]: 2008/05/12_20:57:39 info: RA output:
(nfs-common_4:stop:stdout) * Stopping NFS common utilities
lrmd[10472]: 2008/05/12_20:57:40 info: Exiting nfs-common_4:stop
process 10279 returned rc 0.
lrmd[10472]: 2008/05/12_20:57:40 info: RA output:
(nfs-common_4:stop:stdout) ...done.
crmd[10475]: 2008/05/12_20:57:40 info: process_lrm_event: LRM
operation nfs-common_4_stop_0 (call=70, rc=0) complete
crmd[10475]: 2008/05/12_20:57:40 debug: do_update_resource: Sent
resource state update message: 92
crmd[10475]: 2008/05/12_20:57:40 debug: process_lrm_event: Op
nfs-common_4_stop_0 (call=70): Confirmed
lrmd[10472]: 2008/05/12_20:57:40 info: RA output:
(drbd1:1:stop:stderr) local: 586: 2008/05/12_20:57:39: bad variable
name
drbd[10226]: 2008/05/12_20:57:40 ERROR: r1: Called /usr/sbin/crm_master -D
drbd[10226]: 2008/05/12_20:57:40 ERROR: r1: Exit code 2
drbd[10226]: 2008/05/12_20:57:40 ERROR: r1: Command output:
crm_master[10278]:
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf down r1
lrmd[10472]: 2008/05/12_20:57:40 info: RA output:
(drbd1:1:stop:stdout) crm_master[10278]:
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Exit code 0
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Command output:
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1 stop: drbdadm down succeeded.
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf state r1
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Exit code 0
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Command output: Unconfigured
lrmd[10472]: 2008/05/12_20:57:40 info: RA output: (drbd1:1:stop:stdout)
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf cstate r1
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Exit code 0
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1: Command output: Unconfigured
drbd[10226]: 2008/05/12_20:57:40 DEBUG: r1 status: Unconfigured Not
configured Not configured Unconfigured
lrmd[10472]: 2008/05/12_20:57:40 info: Exiting drbd1:1:stop process
10226 returned rc 0.
crmd[10475]: 2008/05/12_20:57:40 info: process_lrm_event: LRM
operation drbd1:1_stop_0 (call=69, rc=0) complete
crmd[10475]: 2008/05/12_20:57:40 debug: do_update_resource: Sent
resource state update message: 93
crmd[10475]: 2008/05/12_20:57:40 debug: process_lrm_event: Op
drbd1:1_stop_0 (call=69): Confirmed
crmd[10475]: 2008/05/12_20:57:40 debug: cancel_op: Cancelling op 57
for Filesystem_2 (Filesystem_2:57)
lrmd[10472]: 2008/05/12_20:57:40 debug: cancel_op: operation
monitor[57] on ocf::Filesystem::Filesystem_2 for client 10475, its
parameters: directory=[/virtual/store1] fstype=[reiserfs]
CRM_meta_interval=[120000] CRM_meta_op_target_rc=[7]
device=[/dev/drbd0] CRM_meta_id=[Filesystem_2_mon]
CRM_meta_timeout=[60000] crm_feature_set=[1.0.9]
CRM_meta_name=[monitor] cancelled
crmd[10475]: 2008/05/12_20:57:40 info: do_lrm_rsc_op: Performing
op=Filesystem_2_stop_0 key=79:13:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:40 debug: on_msg_perform_op: add an
operation operation stop[71] on ocf::Filesystem::Filesystem_2 for
client 10475, its parameters: CRM_meta_timeout=[15000]
crm_feature_set=[1.0.9] to the operation list.
crmd[10475]: 2008/05/12_20:57:40 debug: do_lrm_rsc_op: Recording
pending op: 71 - Filesystem_2_stop_0 Filesystem_2:71
crmd[10475]: 2008/05/12_20:57:40 debug: cib_rsc_callback: Resource
update 92 complete
crmd[10475]: 2008/05/12_20:57:40 info: process_lrm_event: LRM
operation Filesystem_2_monitor_120000 (call=57, rc=-2) Cancelled
crmd[10475]: 2008/05/12_20:57:40 debug: delete_op_entry: async:
Sending delete op for Filesystem_2_monitor_120000 (call=57)
crmd[10475]: 2008/05/12_20:57:40 debug: process_lrm_event: Op
Filesystem_2_monitor_120000 (call=57): Confirmed
Filesystem[10359]: 2008/05/12_20:57:40 INFO: Running stop for
/dev/drbd0 on /virtual/store1
lrmd[10472]: 2008/05/12_20:57:40 info: RA output:
(Filesystem_2:stop:stderr) [: 789: ==: unexpected operator
Filesystem[10359]: 2008/05/12_20:57:40 INFO: Trying to unmount
/virtual/store1
Filesystem[10359]: 2008/05/12_20:57:41 ERROR: Couldn't unmount
/virtual/store1; trying cleanup with SIGTERM
lrmd[10472]: 2008/05/12_20:57:41 info: RA output:
(Filesystem_2:stop:stderr) umount: /virtual/store1: device is busy
umount: /virtual/store1: device is busy
Filesystem[10359]: 2008/05/12_20:57:41 INFO: No processes on
/virtual/store1 were signalled
crmd[10475]: 2008/05/12_20:57:41 debug: cib_rsc_callback: Resource
update 93 complete
lrmd[10472]: 2008/05/12_20:57:42 info: RA output:
(Filesystem_2:stop:stderr) umount: /virtual/store1: device is busy
umount: /virtual/store1: device is busy
Filesystem[10359]: 2008/05/12_20:57:42 ERROR: Couldn't unmount
/virtual/store1; trying cleanup with SIGTERM
Filesystem[10359]: 2008/05/12_20:57:42 INFO: No processes on
/virtual/store1 were signalled
lrmd[10472]: 2008/05/12_20:57:43 info: RA output:
(Filesystem_2:stop:stderr) umount: /virtual/store1: device is busy
umount: /virtual/store1: device is busy
Filesystem[10359]: 2008/05/12_20:57:43 ERROR: Couldn't unmount
/virtual/store1; trying cleanup with SIGTERM
Filesystem[10359]: 2008/05/12_20:57:43 INFO: No processes on
/virtual/store1 were signalled
lrmd[10472]: 2008/05/12_20:57:44 info: RA output:
(Filesystem_2:stop:stderr) umount: /virtual/store1: device is busy
umount: /virtual/store1: device is busy
Filesystem[10359]: 2008/05/12_20:57:44 ERROR: Couldn't unmount
/virtual/store1; trying cleanup with SIGKILL
Filesystem[10359]: 2008/05/12_20:57:44 INFO: No processes on
/virtual/store1 were signalled
lrmd[10472]: 2008/05/12_20:57:45 info: RA output:
(Filesystem_2:stop:stderr) umount: /virtual/store1: device is busy
umount: /virtual/store1: device is busy
Filesystem[10359]: 2008/05/12_20:57:45 ERROR: Couldn't unmount
/virtual/store1; trying cleanup with SIGKILL
Filesystem[10359]: 2008/05/12_20:57:45 INFO: No processes on
/virtual/store1 were signalled
lrmd[10472]: 2008/05/12_20:57:46 info: RA output:
(Filesystem_2:stop:stderr) umount: /virtual/store1: device is busy
umount: /virtual/store1: device is busy
Filesystem[10359]: 2008/05/12_20:57:46 ERROR: Couldn't unmount
/virtual/store1; trying cleanup with SIGKILL
Filesystem[10359]: 2008/05/12_20:57:46 INFO: No processes on
/virtual/store1 were signalled
Filesystem[10359]: 2008/05/12_20:57:47 ERROR: Couldn't unmount
/virtual/store1, giving up!
lrmd[10472]: 2008/05/12_20:57:47 WARN: Exiting Filesystem_2:stop
process 10359 returned rc 1.
crmd[10475]: 2008/05/12_20:57:47 ERROR: process_lrm_event: LRM
operation Filesystem_2_stop_0 (call=71, rc=1) Error unknown error
crmd[10475]: 2008/05/12_20:57:47 debug: do_update_resource: Sent
resource state update message: 95
crmd[10475]: 2008/05/12_20:57:47 debug: process_lrm_event: Op
Filesystem_2_stop_0 (call=71): Confirmed
crmd[10475]: 2008/05/12_20:57:48 info: do_lrm_rsc_op: Performing
op=pingd-child:0_monitor_0
key=9:14:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:48 debug: on_msg_perform_op: add an
operation operation monitor[72] on ocf::pingd::pingd-child:0 for
client 10475, its parameters: multiplier=[100]
CRM_meta_op_target_rc=[7] dampen=[5s] host=[10.0.7.180] name=[pingd]
CRM_meta_timeout=[15000] crm_feature_set=[1.0.9]
CRM_meta_clone_max=[2] CRM_meta_globally_unique=[false]
CRM_meta_clone=[0] to the operation list.
crmd[10475]: 2008/05/12_20:57:48 debug: do_lrm_rsc_op: Recording
pending op: 72 - pingd-child:0_monitor_0 pingd-child:0:72
lrmd[10472]: 2008/05/12_20:57:48 WARN: Exiting pingd-child:0:monitor
process 10526 returned rc 7.
crmd[10475]: 2008/05/12_20:57:48 info: do_lrm_rsc_op: Performing
op=drbd1:0_monitor_0 key=10:14:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:49 debug: on_msg_perform_op: add an
operation operation monitor[73] on ocf::drbd::drbd1:0 for client
10475, its parameters: CRM_meta_role=[Master]
CRM_meta_op_target_rc=[7] drbd_resource=[r1] CRM_meta_master_max=[1]
CRM_meta_stateful=[true] CRM_meta_timeout=[15000]
crm_feature_set=[1.0.9] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1] CRM_meta_globally_unique=[false]
CRM_meta_clone=[0] CRM_meta_clone_node_max=[1] to the operation list.
drbd[10531]: 2008/05/12_20:57:49 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf state r1
crmd[10475]: 2008/05/12_20:57:49 debug: do_lrm_rsc_op: Recording
pending op: 73 - drbd1:0_monitor_0 drbd1:0:73
drbd[10531]: 2008/05/12_20:57:49 DEBUG: r1: Exit code 0
drbd[10531]: 2008/05/12_20:57:49 DEBUG: r1: Command output: Unconfigured
drbd[10531]: 2008/05/12_20:57:49 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf cstate r1
drbd[10531]: 2008/05/12_20:57:49 DEBUG: r1: Exit code 0
drbd[10531]: 2008/05/12_20:57:49 DEBUG: r1: Command output: Unconfigured
drbd[10531]: 2008/05/12_20:57:49 DEBUG: r1 status: Unconfigured Not
configured Not configured Unconfigured
drbd[10531]: 2008/05/12_20:57:49 DEBUG: r1 monitor: resource not configured
crmd[10475]: 2008/05/12_20:57:49 info: do_lrm_rsc_op: Performing
op=drbd0:1_notify_0 key=120:14:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:49 WARN: Exiting drbd1:0:monitor process
10531 returned rc 7.
lrmd[10472]: 2008/05/12_20:57:49 debug: on_msg_perform_op: add an
operation operation notify[74] on ocf::drbd::drbd0:1 for client 10475,
its parameters: CRM_meta_notify_stop_resource=[drbd0:1 ]
CRM_meta_notify_operation=[demote]
CRM_meta_notify_stop_uname=[kodiak.aluminati.org ]
CRM_meta_notify_promote_resource=[drbd0:0 ] drbd_resource=[r0]
CRM_meta_notify_demote_uname=[kodiak.aluminati.org ]
CRM_meta_master_max=[1] CRM_meta_timeout=[15000]
CRM_meta_stateful=[true] crm_feature_set=[1.0.9]
CRM_meta_notify_demote_resource=[drbd0:1 ] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1 to the operation list.
drbd[10583]: 2008/05/12_20:57:49 DEBUG: r0 notify: pre for demote -
counts: active 0 - starting 0 - stopping 1
crmd[10475]: 2008/05/12_20:57:49 debug: do_lrm_rsc_op: Recording
pending op: 74 - drbd0:1_notify_0 drbd0:1:74
lrmd[10472]: 2008/05/12_20:57:49 info: Exiting drbd0:1:notify process
10583 returned rc 0.
crmd[10475]: 2008/05/12_20:57:49 debug: cib_rsc_callback: Resource
update 95 complete
crmd[10475]: 2008/05/12_20:57:49 info: process_lrm_event: LRM
operation pingd-child:0_monitor_0 (call=72, rc=7) complete
crmd[10475]: 2008/05/12_20:57:49 debug: do_update_resource: Sent
resource state update message: 96
crmd[10475]: 2008/05/12_20:57:49 debug: process_lrm_event: Op
pingd-child:0_monitor_0 (call=72): Confirmed
crmd[10475]: 2008/05/12_20:57:49 info: process_lrm_event: LRM
operation drbd1:0_monitor_0 (call=73, rc=7) complete
crmd[10475]: 2008/05/12_20:57:49 debug: do_update_resource: Sent
resource state update message: 97
crmd[10475]: 2008/05/12_20:57:49 debug: process_lrm_event: Op
drbd1:0_monitor_0 (call=73): Confirmed
crmd[10475]: 2008/05/12_20:57:49 info: process_lrm_event: LRM
operation drbd0:1_notify_0 (call=74, rc=0) complete
crmd[10475]: 2008/05/12_20:57:49 debug: do_update_resource: Sent
resource state update message: 98
crmd[10475]: 2008/05/12_20:57:49 debug: process_lrm_event: Op
drbd0:1_notify_0 (call=74): Confirmed
crmd[10475]: 2008/05/12_20:57:50 info: do_lrm_rsc_op: Performing
op=drbd0:1_demote_0 key=20:14:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:50 debug: on_msg_perform_op: add an
operation operation demote[75] on ocf::drbd::drbd0:1 for client 10475,
its parameters: CRM_meta_notify_stop_resource=[drbd0:1 ]
CRM_meta_notify_stop_uname=[kodiak.aluminati.org ]
CRM_meta_notify_promote_resource=[drbd0:0 ] drbd_resource=[r0]
CRM_meta_notify_demote_uname=[kodiak.aluminati.org ]
CRM_meta_master_max=[1] CRM_meta_stateful=[true]
CRM_meta_timeout=[15000] crm_feature_set=[1.0.9]
CRM_meta_notify_demote_resource=[drbd0:1 ] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1] CRM_meta_notify_promote_uname=[po to the
operation list.
drbd[10596]: 2008/05/12_20:57:50 DEBUG: r0: Calling drbdadm -c
/etc/drbd.conf secondary r0
crmd[10475]: 2008/05/12_20:57:50 debug: do_lrm_rsc_op: Recording
pending op: 75 - drbd0:1_demote_0 drbd0:1:75
crmd[10475]: 2008/05/12_20:57:50 debug: cib_rsc_callback: Resource
update 96 complete
crmd[10475]: 2008/05/12_20:57:50 debug: cib_rsc_callback: Resource
update 97 complete
crmd[10475]: 2008/05/12_20:57:50 debug: cib_rsc_callback: Resource
update 98 complete
drbd[10596]: 2008/05/12_20:57:50 ERROR: r0: Called drbdadm -c
/etc/drbd.conf secondary r0
drbd[10596]: 2008/05/12_20:57:50 ERROR: r0: Exit code 2
drbd[10596]: 2008/05/12_20:57:50 ERROR: r0: Command output: State
drbd[10596]: 2008/05/12_20:57:50 ERROR: r0 demote: Failed with exit code 2.
lrmd[10472]: 2008/05/12_20:57:50 WARN: Exiting drbd0:1:demote process
10596 returned rc 1.
lrmd[10472]: 2008/05/12_20:57:50 info: RA output: (drbd0:1:demote:stdout) State
lrmd[10472]: 2008/05/12_20:57:50 info: RA output:
(drbd0:1:demote:stderr) local: 586: failed:: bad variable name
crmd[10475]: 2008/05/12_20:57:50 ERROR: process_lrm_event: LRM
operation drbd0:1_demote_0 (call=75, rc=1) Error unknown error
crmd[10475]: 2008/05/12_20:57:50 debug: do_update_resource: Sent
resource state update message: 101
crmd[10475]: 2008/05/12_20:57:50 debug: process_lrm_event: Op
drbd0:1_demote_0 (call=75): Confirmed
crmd[10475]: 2008/05/12_20:57:50 info: do_lrm_rsc_op: Performing
op=pingd-child:0_monitor_0
key=10:15:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:50 debug: on_msg_perform_op: add an
operation operation monitor[76] on ocf::pingd::pingd-child:0 for
client 10475, its parameters: multiplier=[100]
CRM_meta_op_target_rc=[7] dampen=[5s] host=[10.0.7.180] name=[pingd]
CRM_meta_timeout=[15000] crm_feature_set=[1.0.9]
CRM_meta_clone_max=[2] CRM_meta_globally_unique=[false]
CRM_meta_clone=[0] to the operation list.
crmd[10475]: 2008/05/12_20:57:50 debug: do_lrm_rsc_op: Recording
pending op: 76 - pingd-child:0_monitor_0 pingd-child:0:76
lrmd[10472]: 2008/05/12_20:57:50 WARN: Exiting pingd-child:0:monitor
process 10634 returned rc 7.
crmd[10475]: 2008/05/12_20:57:50 info: do_lrm_rsc_op: Performing
op=drbd1:0_monitor_0 key=11:15:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:51 debug: on_msg_perform_op: add an
operation operation monitor[77] on ocf::drbd::drbd1:0 for client
10475, its parameters: CRM_meta_role=[Master]
CRM_meta_op_target_rc=[7] drbd_resource=[r1] CRM_meta_master_max=[1]
CRM_meta_stateful=[true] CRM_meta_timeout=[15000]
crm_feature_set=[1.0.9] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1] CRM_meta_globally_unique=[false]
CRM_meta_clone=[0] CRM_meta_clone_node_max=[1] to the operation list.
drbd[10639]: 2008/05/12_20:57:51 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf state r1
crmd[10475]: 2008/05/12_20:57:51 debug: do_lrm_rsc_op: Recording
pending op: 77 - drbd1:0_monitor_0 drbd1:0:77
crmd[10475]: 2008/05/12_20:57:51 debug: cib_rsc_callback: Resource
update 101 complete
crmd[10475]: 2008/05/12_20:57:51 info: process_lrm_event: LRM
operation pingd-child:0_monitor_0 (call=76, rc=7) complete
drbd[10639]: 2008/05/12_20:57:51 DEBUG: r1: Exit code 0
drbd[10639]: 2008/05/12_20:57:51 DEBUG: r1: Command output: Unconfigured
drbd[10639]: 2008/05/12_20:57:51 DEBUG: r1: Calling drbdadm -c
/etc/drbd.conf cstate r1
drbd[10639]: 2008/05/12_20:57:51 DEBUG: r1: Exit code 0
drbd[10639]: 2008/05/12_20:57:51 DEBUG: r1: Command output: Unconfigured
drbd[10639]: 2008/05/12_20:57:51 DEBUG: r1 status: Unconfigured Not
configured Not configured Unconfigured
drbd[10639]: 2008/05/12_20:57:51 DEBUG: r1 monitor: resource not configured
lrmd[10472]: 2008/05/12_20:57:51 WARN: Exiting drbd1:0:monitor process
10639 returned rc 7.
crmd[10475]: 2008/05/12_20:57:51 debug: do_update_resource: Sent
resource state update message: 102
crmd[10475]: 2008/05/12_20:57:51 debug: process_lrm_event: Op
pingd-child:0_monitor_0 (call=76): Confirmed
crmd[10475]: 2008/05/12_20:57:51 info: process_lrm_event: LRM
operation drbd1:0_monitor_0 (call=77, rc=7) complete
crmd[10475]: 2008/05/12_20:57:51 debug: do_update_resource: Sent
resource state update message: 103
crmd[10475]: 2008/05/12_20:57:51 debug: process_lrm_event: Op
drbd1:0_monitor_0 (call=77): Confirmed
crmd[10475]: 2008/05/12_20:57:51 info: do_lrm_rsc_op: Performing
op=drbd0:1_notify_0 key=120:15:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:51 debug: on_msg_perform_op: add an
operation operation notify[78] on ocf::drbd::drbd0:1 for client 10475,
its parameters: CRM_meta_notify_stop_resource=[drbd0:1 ]
CRM_meta_notify_operation=[demote]
CRM_meta_notify_stop_uname=[kodiak.aluminati.org ]
CRM_meta_notify_promote_resource=[drbd0:0 ] drbd_resource=[r0]
CRM_meta_notify_demote_uname=[kodiak.aluminati.org ]
CRM_meta_master_max=[1] CRM_meta_timeout=[15000]
CRM_meta_stateful=[true] crm_feature_set=[1.0.9]
CRM_meta_notify_demote_resource=[drbd0:1 ] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1 to the operation list.
drbd[10691]: 2008/05/12_20:57:51 DEBUG: r0 notify: pre for demote -
counts: active 0 - starting 0 - stopping 1
crmd[10475]: 2008/05/12_20:57:51 debug: do_lrm_rsc_op: Recording
pending op: 78 - drbd0:1_notify_0 drbd0:1:78
lrmd[10472]: 2008/05/12_20:57:51 info: Exiting drbd0:1:notify process
10691 returned rc 0.
crmd[10475]: 2008/05/12_20:57:51 info: process_lrm_event: LRM
operation drbd0:1_notify_0 (call=78, rc=0) complete
crmd[10475]: 2008/05/12_20:57:51 debug: do_update_resource: Sent
resource state update message: 104
crmd[10475]: 2008/05/12_20:57:51 debug: process_lrm_event: Op
drbd0:1_notify_0 (call=78): Confirmed
crmd[10475]: 2008/05/12_20:57:51 debug: cib_rsc_callback: Resource
update 102 complete
crmd[10475]: 2008/05/12_20:57:51 debug: cib_rsc_callback: Resource
update 103 complete
crmd[10475]: 2008/05/12_20:57:52 info: do_lrm_rsc_op: Performing
op=drbd0:1_demote_0 key=21:15:b5f0c4ea-1dd2-4686-b2ee-30daefa07fd5)
lrmd[10472]: 2008/05/12_20:57:52 debug: on_msg_perform_op: add an
operation operation demote[79] on ocf::drbd::drbd0:1 for client 10475,
its parameters: CRM_meta_notify_stop_resource=[drbd0:1 ]
CRM_meta_notify_stop_uname=[kodiak.aluminati.org ]
CRM_meta_notify_promote_resource=[drbd0:0 ] drbd_resource=[r0]
CRM_meta_notify_demote_uname=[kodiak.aluminati.org ]
CRM_meta_master_max=[1] CRM_meta_stateful=[true]
CRM_meta_timeout=[15000] crm_feature_set=[1.0.9]
CRM_meta_notify_demote_resource=[drbd0:1 ] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1] CRM_meta_notify_promote_uname=[po to the
operation list.
drbd[10704]: 2008/05/12_20:57:52 DEBUG: r0: Calling drbdadm -c
/etc/drbd.conf secondary r0
crmd[10475]: 2008/05/12_20:57:52 debug: do_lrm_rsc_op: Recording
pending op: 79 - drbd0:1_demote_0 drbd0:1:79
crmd[10475]: 2008/05/12_20:57:52 debug: cib_rsc_callback: Resource
update 104 complete
drbd[10704]: 2008/05/12_20:57:52 ERROR: r0: Called drbdadm -c
/etc/drbd.conf secondary r0
drbd[10704]: 2008/05/12_20:57:52 ERROR: r0: Exit code 2
drbd[10704]: 2008/05/12_20:57:52 ERROR: r0: Command output: State
drbd[10704]: 2008/05/12_20:57:52 ERROR: r0 demote: Failed with exit code 2.
lrmd[10472]: 2008/05/12_20:57:52 WARN: Exiting drbd0:1:demote process
10704 returned rc 1.
lrmd[10472]: 2008/05/12_20:57:52 info: RA output: (drbd0:1:demote:stdout) State
lrmd[10472]: 2008/05/12_20:57:52 info: RA output:
(drbd0:1:demote:stderr) local: 586: failed:: bad variable name
crmd[10475]: 2008/05/12_20:57:52 ERROR: process_lrm_event: LRM
operation drbd0:1_demote_0 (call=79, rc=1) Error unknown error
crmd[10475]: 2008/05/12_20:57:52 debug: do_update_resource: Sent
resource state update message: 107
crmd[10475]: 2008/05/12_20:57:52 debug: process_lrm_event: Op
drbd0:1_demote_0 (call=79): Confirmed
--
Don't just do something...sit there!
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems