Hi,
I have managed to get this setup working in a virtual environment, but I am
running into problems when running on physical hardware. My current setup is
the latest Debian Linux with the approved packages for DRBD and Heartbeat.
This is my Heartbeat status while everything is running:
root@HPSATA01:~# crm_mon -1
============
Last updated: Tue Apr 2 11:14:32 2013
Stack: Heartbeat
Current DC: hpsata02 (9e01243d-9232-437e-81ad-e00aabffef8f) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, unknown expected votes
2 Resources configured.
============
Online: [ hpsata01 hpsata02 ]
Master/Slave Set: ms_g_drbd
Masters: [ hpsata01 ]
Slaves: [ hpsata02 ]
Resource Group: g_services
MetaFS (ocf::heartbeat:Filesystem): Started hpsata01
lvmdata (ocf::heartbeat:LVM): Started hpsata01
ClusterIP (ocf::heartbeat:IPaddr2): Started hpsata01
iscsi (lsb:iscsitarget): Started hpsata01
Failed actions:
iscsi_monitor_0 (node=hpsata01, call=7, rc=1, status=complete): unknown error
iscsi_monitor_0 (node=hpsata02, call=105, rc=1, status=complete): unknown error
root@HPSATA01:~#
And my DRBD status:
root@HPSATA01:~# service drbd status
drbd driver loaded OK; device status:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
m:res cs ro ds p mounted fstype
0:meta Connected Primary/Secondary UpToDate/UpToDate C /meta ext3
1:data Connected Primary/Secondary UpToDate/UpToDate C
root@HPSATA01:~#
So far, so good. The problem arises when I put the primary node into standby;
the same thing happens when I reboot it, power it off, or pull the plug.
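For reference, this is how I am putting the node into standby and bringing it
back; I am using the crm shell that ships with Pacemaker, with the node names
from the status output above:

```shell
# Put the primary into standby; Pacemaker should demote its DRBD
# resources and move the g_services group to the other node.
crm node standby hpsata01

# Bring it back online afterwards.
crm node online hpsata01
```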
root@HPSATA01:~# crm_mon -1
============
Last updated: Tue Apr 2 12:54:56 2013
Stack: Heartbeat
Current DC: hpsata02 (9e01243d-9232-437e-81ad-e00aabffef8f) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, unknown expected votes
2 Resources configured.
============
Node hpsata01 (d60c3fcb-55e4-44a1-9e7a-2b03efb501e8): standby
Online: [ hpsata02 ]
Master/Slave Set: ms_g_drbd
Slaves: [ hpsata02 ]
Stopped: [ g_drbd:0 ]
This is the DRBD status on node 2:
root@HPSATA02:~# service drbd status
drbd driver loaded OK; device status:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
m:res cs ro ds p mounted fstype
0:meta WFConnection Secondary/Unknown Outdated/DUnknown C
1:data WFConnection Secondary/Unknown Outdated/DUnknown C
root@HPSATA02:~#
Here is the relevant section of my ha-debug file:
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: unpack_status: Node hpsata01 is in standby-mode
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: determine_online_status: Node hpsata01 is standby
Apr 02 12:53:33 HPSATA02 pengine: [1892]: WARN: unpack_rsc_op: Processing failed op iscsi_monitor_0 on hpsata01: unknown error (1)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: determine_online_status: Node hpsata02 is online
Apr 02 12:53:33 HPSATA02 pengine: [1892]: WARN: unpack_rsc_op: Processing failed op iscsi_monitor_0 on hpsata02: unknown error (1)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: clone_print: Master/Slave Set: ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: short_print: Slaves: [ hpsata02 ]
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: short_print: Stopped: [ g_drbd:0 ]
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: group_print: Resource Group: g_services
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print: MetaFS (ocf::heartbeat:Filesystem): Stopped
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print: lvmdata (ocf::heartbeat:LVM): Stopped
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print: ClusterIP (ocf::heartbeat:IPaddr2): Stopped
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print: iscsi (lsb:iscsitarget): Stopped
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_merge_weights: drbd_meta:0: Rolling back scores from drbd_data:0
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_color: Resource drbd_meta:0 cannot run anywhere
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_color: Resource drbd_data:0 cannot run anywhere
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: Promoting g_drbd:1 (Slave hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: ms_g_drbd: Promoted 1 instances of a possible 1 to master
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: Promoting g_drbd:1 (Slave hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: ms_g_drbd: Promoted 1 instances of a possible 1 to master
Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: RecurringOp: Start recurring monitor (30s) for ClusterIP on hpsata02
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Leave resource drbd_meta:0 (Stopped)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Leave resource drbd_data:0 (Stopped)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Promote drbd_meta:1 (Slave -> Master hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Promote drbd_data:1 (Slave -> Master hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start MetaFS (hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start lvmdata (hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start ClusterIP (hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start iscsi (hpsata02)
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: unpack_graph: Unpacked transition 51: 21 actions in 21 synapses
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_te_invoke: Processing graph 51 (ref=pe_calc-dc-1364928813-381) derived from /var/lib/pengine/pe-input-1669.bz2
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 43 fired and confirmed
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 84: notify drbd_meta:1_pre_notify_promote_0 on hpsata02 (local)
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=84:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_notify_0 )
Apr 02 12:53:33 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:143: notify
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 86: notify drbd_data:1_pre_notify_promote_0 on hpsata02 (local)
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=86:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_notify_0 )
Apr 02 12:53:33 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:144: notify
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: process_pe_message: Transition 51: PEngine Input stored in: /var/lib/pengine/pe-input-1669.bz2
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_notify_0 (call=143, rc=0, cib-update=254, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_notify_0 (call=144, rc=0, cib-update=255, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_meta:1_pre_notify_promote_0 (84) confirmed on hpsata02 (rc=0)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_data:1_pre_notify_promote_0 (86) confirmed on hpsata02 (rc=0)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 44 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 41 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 27 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 16: promote drbd_meta:1_promote_0 on hpsata02 (local)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=16:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_promote_0 )
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:145: promote
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:promote:stdout)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_promote_0 (call=145, rc=0, cib-update=256, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_meta:1_promote_0 (16) confirmed on hpsata02 (rc=0)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 20: promote drbd_data:1_promote_0 on hpsata02 (local)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=20:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_promote_0 )
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:146: promote
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:promote:stdout)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_promote_0 (call=146, rc=0, cib-update=257, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_data:1_promote_0 (20) confirmed on hpsata02 (rc=0)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 28 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 42 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 45 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 85: notify drbd_meta:1_post_notify_promote_0 on hpsata02 (local)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=85:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_notify_0 )
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:147: notify
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 87: notify drbd_data:1_post_notify_promote_0 on hpsata02 (local)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=87:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_notify_0 )
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:148: notify
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout)
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout)
drbd[2542]: 2013/04/02_12:53:34 ERROR: meta: Called drbdadm -c /etc/drbd.conf outdate meta
drbd[2543]: 2013/04/02_12:53:34 ERROR: data: Called drbdadm -c /etc/drbd.conf outdate data
drbd[2542]: 2013/04/02_12:53:34 ERROR: meta: Exit code 17
drbd[2543]: 2013/04/02_12:53:34 ERROR: data: Exit code 17
drbd[2542]: 2013/04/02_12:53:34 ERROR: meta: Command output:
drbd[2543]: 2013/04/02_12:53:34 ERROR: data: Command output:
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout)
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout)
Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_trigger_update: Sending flush op to all hosts for: master-drbd_meta:1 (-INFINITY)
Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_perform_update: Sent update 137: master-drbd_meta:1=-INFINITY
Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_trigger_update: Sending flush op to all hosts for: master-drbd_data:1 (-INFINITY)
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout)
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_notify_0 (call=147, rc=0, cib-update=258, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_notify_0 (call=148, rc=0, cib-update=259, confirmed=true) ok
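The lines that stand out to me are the outdate calls failing with exit code 17.
From the log, the resource agent appears to be running the equivalent of the
following on node 2 (a sketch reconstructed from the messages above, not
something I invoke myself):

```shell
# What the drbd resource agent logged just before the master scores
# were set to -INFINITY; both calls exited with code 17.
drbdadm -c /etc/drbd.conf outdate meta
drbdadm -c /etc/drbd.conf outdate data
```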
Once I bring node 1 back up, everything works fine again.
If anyone has a clue as to what is going on, I would appreciate the help.
Thanks,
Jared.
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user