Hi,

I have managed to get this setup working in a virtual environment, but I am
running into problems when running on physical hardware. My current setup is
the latest Debian Linux with the approved packages for DRBD and Heartbeat.

This is my Heartbeat status while everything is running:

root@HPSATA01:~# crm_mon -1
============
Last updated: Tue Apr  2 11:14:32 2013
Stack: Heartbeat
Current DC: hpsata02 (9e01243d-9232-437e-81ad-e00aabffef8f) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ hpsata01 hpsata02 ]

Master/Slave Set: ms_g_drbd
     Masters: [ hpsata01 ]
     Slaves: [ hpsata02 ]
Resource Group: g_services
     MetaFS     (ocf::heartbeat:Filesystem):    Started hpsata01
     lvmdata    (ocf::heartbeat:LVM):   Started hpsata01
     ClusterIP  (ocf::heartbeat:IPaddr2):       Started hpsata01
     iscsi      (lsb:iscsitarget):      Started hpsata01

Failed actions:
    iscsi_monitor_0 (node=hpsata01, call=7, rc=1, status=complete): unknown error
    iscsi_monitor_0 (node=hpsata02, call=105, rc=1, status=complete): unknown error
root@HPSATA01:~#

And my DRBD status:

root@HPSATA01:~# service drbd status
drbd driver loaded OK; device status:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
m:res   cs         ro                 ds                 p  mounted  fstype
0:meta  Connected  Primary/Secondary  UpToDate/UpToDate  C  /meta    ext3
1:data  Connected  Primary/Secondary  UpToDate/UpToDate  C
root@HPSATA01:~#

So far so good. The problem arises when I put the primary node into standby.
The same thing happens when I reboot, power off, or pull the plug.
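For clarity, this is how I trigger the failover test (standard crm shell commands from pacemaker 1.0; the node name is of course mine):

```shell
# Put the current primary into standby; the cluster should demote
# DRBD there and promote the other node.
crm node standby hpsata01

# Then I check the result on the survivor with:
#   crm_mon -1
#   service drbd status

# Afterwards, bring the node back.
crm node online hpsata01
```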

root@HPSATA01:~# crm_mon -1
============
Last updated: Tue Apr  2 12:54:56 2013
Stack: Heartbeat
Current DC: hpsata02 (9e01243d-9232-437e-81ad-e00aabffef8f) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Node hpsata01 (d60c3fcb-55e4-44a1-9e7a-2b03efb501e8): standby
Online: [ hpsata02 ]

Master/Slave Set: ms_g_drbd
     Slaves: [ hpsata02 ]
     Stopped: [ g_drbd:0 ]

On node 2, this is the DRBD status:

root@HPSATA02:~# service drbd status
drbd driver loaded OK; device status:
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
m:res   cs            ro                 ds                 p  mounted  fstype
0:meta  WFConnection  Secondary/Unknown  Outdated/DUnknown  C
1:data  WFConnection  Secondary/Unknown  Outdated/DUnknown  C
root@HPSATA02:~#
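If I read that status right, the "Outdated/DUnknown" disk state is the real problem: as far as I understand, DRBD refuses to promote a disk it considers Outdated, which would explain the failed promotion below. A trivial way to spot this from a script (this just parses the status text pasted above; nothing DRBD-specific is required to run it):

```shell
# Minimal check for an Outdated local disk in 'service drbd status'
# style output. The sample line is copied from my node 2 output above.
status='1:data  WFConnection  Secondary/Unknown  Outdated/DUnknown  C'

# Field 4 is the local/peer disk-state pair, e.g. Outdated/DUnknown.
ds=$(echo "$status" | awk '{print $4}')

# Strip everything from the first '/' to keep only the local state.
local_ds=${ds%%/*}

if [ "$local_ds" = "Outdated" ]; then
    echo "local disk is Outdated - promotion will be refused"
fi
```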

Here is some info from my ha-debug file:

Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: unpack_status: Node hpsata01 is in standby-mode
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: determine_online_status: Node hpsata01 is standby
Apr 02 12:53:33 HPSATA02 pengine: [1892]: WARN: unpack_rsc_op: Processing failed op iscsi_monitor_0 on hpsata01: unknown error (1)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: determine_online_status: Node hpsata02 is online
Apr 02 12:53:33 HPSATA02 pengine: [1892]: WARN: unpack_rsc_op: Processing failed op iscsi_monitor_0 on hpsata02: unknown error (1)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: clone_print:  Master/Slave Set: ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: short_print:      Slaves: [ hpsata02 ]
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: short_print:      Stopped: [ g_drbd:0 ]
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: group_print:  Resource Group: g_services
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print:      MetaFS (ocf::heartbeat:Filesystem): Stopped
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print:      lvmdata (ocf::heartbeat:LVM): Stopped
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print:      ClusterIP (ocf::heartbeat:IPaddr2): Stopped
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: native_print:      iscsi (lsb:iscsitarget): Stopped
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_merge_weights: drbd_meta:0: Rolling back scores from drbd_data:0
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_color: Resource drbd_meta:0 cannot run anywhere
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: native_color: Resource drbd_data:0 cannot run anywhere
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: Promoting g_drbd:1 (Slave hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: ms_g_drbd: Promoted 1 instances of a possible 1 to master
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: Promoting g_drbd:1 (Slave hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: master_color: ms_g_drbd: Promoted 1 instances of a possible 1 to master
Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: ERROR: create_notification_boundaries: Creating boundaries for ms_g_drbd
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: RecurringOp:  Start recurring monitor (30s) for ClusterIP on hpsata02
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Leave resource drbd_meta:0 (Stopped)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Leave resource drbd_data:0 (Stopped)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Promote drbd_meta:1 (Slave -> Master hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Promote drbd_data:1 (Slave -> Master hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start MetaFS (hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start lvmdata (hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start ClusterIP (hpsata02)
Apr 02 12:53:33 HPSATA02 pengine: [1892]: notice: LogActions: Start iscsi (hpsata02)
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: unpack_graph: Unpacked transition 51: 21 actions in 21 synapses
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_te_invoke: Processing graph 51 (ref=pe_calc-dc-1364928813-381) derived from /var/lib/pengine/pe-input-1669.bz2
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 43 fired and confirmed
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 84: notify drbd_meta:1_pre_notify_promote_0 on hpsata02 (local)
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=84:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_notify_0 )
Apr 02 12:53:33 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:143: notify
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 86: notify drbd_data:1_pre_notify_promote_0 on hpsata02 (local)
Apr 02 12:53:33 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=86:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_notify_0 )
Apr 02 12:53:33 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:144: notify
Apr 02 12:53:33 HPSATA02 pengine: [1892]: info: process_pe_message: Transition 51: PEngine Input stored in: /var/lib/pengine/pe-input-1669.bz2
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_notify_0 (call=143, rc=0, cib-update=254, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_notify_0 (call=144, rc=0, cib-update=255, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_meta:1_pre_notify_promote_0 (84) confirmed on hpsata02 (rc=0)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_data:1_pre_notify_promote_0 (86) confirmed on hpsata02 (rc=0)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 44 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 41 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 27 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 16: promote drbd_meta:1_promote_0 on hpsata02 (local)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=16:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_promote_0 )
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:145: promote
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:promote:stdout)

Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_promote_0 (call=145, rc=0, cib-update=256, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_meta:1_promote_0 (16) confirmed on hpsata02 (rc=0)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 20: promote drbd_data:1_promote_0 on hpsata02 (local)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=20:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_promote_0 )
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:146: promote
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:promote:stdout)

Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_promote_0 (call=146, rc=0, cib-update=257, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: match_graph_event: Action drbd_data:1_promote_0 (20) confirmed on hpsata02 (rc=0)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 28 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 42 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_pseudo_action: Pseudo action 45 fired and confirmed
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 85: notify drbd_meta:1_post_notify_promote_0 on hpsata02 (local)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=85:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_meta:1_notify_0 )
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_meta:1:147: notify
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: te_rsc_command: Initiating action 87: notify drbd_data:1_post_notify_promote_0 on hpsata02 (local)
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: do_lrm_rsc_op: Performing key=87:51:0:d21de83f-ed1b-428e-845e-937210405b1c op=drbd_data:1_notify_0 )
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: rsc:drbd_data:1:148: notify
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout)

Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout)

drbd[2542]:   2013/04/02_12:53:34 ERROR: meta: Called drbdadm -c /etc/drbd.conf outdate meta
drbd[2543]:   2013/04/02_12:53:34 ERROR: data: Called drbdadm -c /etc/drbd.conf outdate data
drbd[2542]:   2013/04/02_12:53:34 ERROR: meta: Exit code 17
drbd[2543]:   2013/04/02_12:53:34 ERROR: data: Exit code 17
drbd[2542]:   2013/04/02_12:53:34 ERROR: meta: Command output:
drbd[2543]:   2013/04/02_12:53:34 ERROR: data: Command output:
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout)

Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout)

Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_trigger_update: Sending flush op to all hosts for: master-drbd_meta:1 (-INFINITY)
Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_perform_update: Sent update 137: master-drbd_meta:1=-INFINITY
Apr 02 12:53:34 HPSATA02 attrd: [1885]: info: attrd_trigger_update: Sending flush op to all hosts for: master-drbd_data:1 (-INFINITY)
Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_meta:1:notify:stdout)

Apr 02 12:53:34 HPSATA02 lrmd: [1883]: info: RA output: (drbd_data:1:notify:stdout)

Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_meta:1_notify_0 (call=147, rc=0, cib-update=258, confirmed=true) ok
Apr 02 12:53:34 HPSATA02 crmd: [1886]: info: process_lrm_event: LRM operation drbd_data:1_notify_0 (call=148, rc=0, cib-update=259, confirmed=true) ok
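Reading those errors, the resource agent is calling `drbdadm outdate` and getting exit code 17 back, after which the master scores go to -INFINITY. I suspect this is tied to the fencing/outdate handling in my drbd.conf. For comparison, the fencing stanza the DRBD documentation recommends for Pacemaker-managed resources looks roughly like the following (a sketch of the documented pattern, not a copy of my actual config):

```
resource data {
  disk {
    fencing resource-only;
  }
  handlers {
    # these scripts ship with the DRBD 8.3 utils, as far as I know
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

If anyone can say whether exit code 17 from `drbdadm outdate` is expected with that kind of setup, that alone would help me narrow this down.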

Once I bring node 1 back up, everything works fine again.

If anyone has a clue as to what is going on, I would appreciate the help.

Thanks,

Jared.
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
