Dan:

I attempted to reconnect r1 and then captured the output from the kernel ring 
buffer:

[ 7038.868118] d-con r1: conn( StandAlone -> Unconnected ) 
[ 7038.868170] d-con r1: Starting receiver thread (from drbd_w_r1 [5058])
[ 7038.868238] d-con r1: receiver (re)started
[ 7038.868269] d-con r1: conn( Unconnected -> WFConnection ) 
[ 7039.367612] d-con r1: Handshake successful: Agreed network protocol version 100
[ 7039.367818] d-con r1: conn( WFConnection -> WFReportParams ) 
[ 7039.367851] d-con r1: Starting asender thread (from drbd_r_r1 [20387])
[ 7039.376119] block drbd1000: drbd_sync_handshake:
[ 7039.376127] block drbd1000: self E624A5F197121810:701073FA8F926E0E:B8DFF1CE13CF5A28:B8DEF1CE13CF5A29 bits:0 flags:0
[ 7039.376133] block drbd1000: peer 221AECCC2A594D3B:701073FA8F926E0E:B8DFF1CE13CF5A29:B8DEF1CE13CF5A29 bits:0 flags:0
[ 7039.376139] block drbd1000: uuid_compare()=100 by rule 90
[ 7039.376146] block drbd1000: helper command: /sbin/drbdadm initial-split-brain minor-1000
[ 7039.378519] block drbd1000: helper command: /sbin/drbdadm initial-split-brain minor-1000 exit code 0 (0x0)
[ 7039.378544] block drbd1000: Split-Brain detected but unresolved, dropping connection!
[ 7039.378551] block drbd1000: helper command: /sbin/drbdadm split-brain minor-1000
[ 7039.381167] block drbd1000: helper command: /sbin/drbdadm split-brain minor-1000 exit code 0 (0x0)
[ 7039.381228] d-con r1: conn( WFReportParams -> Disconnecting ) 
[ 7039.381237] d-con r1: error receiving ReportState, e: -5 l: 0!
[ 7039.381274] d-con r1: asender terminated
[ 7039.381283] d-con r1: Terminating asender thread
[ 7039.381539] d-con r1: Connection closed
[ 7039.381567] d-con r1: conn( Disconnecting -> StandAlone ) 
[ 7039.381573] d-con r1: receiver terminated
[ 7039.381577] d-con r1: Terminating receiver thread


I have no idea what it means, however.
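
One way to make output like the above easier to read is to reduce it to the connection-state transitions and check whether the split-brain message appears. The following is only a minimal sketch in Python, assuming the log lines follow the `conn( A -> B )` pattern shown above; the function name `summarize` and the `sample` text (copied from the capture, with wrapped lines joined) are illustrative, not part of any DRBD tool.

```python
import re

# Matches the state-change lines seen in the capture, e.g.
# "[ 7038.868118] d-con r1: conn( StandAlone -> Unconnected )"
CONN_RE = re.compile(r"d-con (\S+): conn\( (\w+) -> (\w+) \)")


def summarize(log_text):
    """Return (transitions, split_brain) for a chunk of dmesg output.

    transitions is a list of (resource, old_state, new_state) tuples;
    split_brain is True if the split-brain message appears anywhere.
    """
    transitions = [m.groups() for m in CONN_RE.finditer(log_text)]
    split_brain = "Split-Brain detected" in log_text
    return transitions, split_brain


# Sample lines taken from the capture above (hypothetical variable name).
sample = """\
[ 7038.868118] d-con r1: conn( StandAlone -> Unconnected )
[ 7038.868269] d-con r1: conn( Unconnected -> WFConnection )
[ 7039.367818] d-con r1: conn( WFConnection -> WFReportParams )
[ 7039.378544] block drbd1000: Split-Brain detected but unresolved, dropping connection!
[ 7039.381228] d-con r1: conn( WFReportParams -> Disconnecting )
[ 7039.381567] d-con r1: conn( Disconnecting -> StandAlone )
"""

transitions, split_brain = summarize(sample)
for resource, old, new in transitions:
    print(f"{resource}: {old} -> {new}")
print("split brain reported:", split_brain)
```

In practice the text could come from something like `dmesg | grep drbd` saved to a file and passed to `summarize`; the sketch only condenses what the kernel already logged and does not attempt any recovery.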

Eric Pretorious
Truckee, CA




>________________________________
> From: Dan Barker <[email protected]>
>To: "[email protected]" <[email protected]> 
>Sent: Monday, January 21, 2013 6:40 AM
>Subject: Re: [DRBD-user] Diagnosing a Failed Resource
> 
>
> 
>The errors in connecting are logged. If you can’t find them, attempt to 
>connect a resource (drbdadm connect r1, for example) to create the errors 
>again, and then look at the logs for the reason the connection was not 
>established. The “status” will continue to show waiting for connection (WFC) 
>but there will be a reason in the log files. If the logs are unclear, post the 
>relevant portions back here and we’ll help.
> 
>Try something like ‘dmesg | grep drbd’. You may want to check the logs on both 
>DRBD servers. You can run the connect command on either.
> 
>hth
> 
>Dan
> 
>From:[email protected] 
>[mailto:[email protected]] On Behalf Of Eric
>Sent: Monday, January 21, 2013 1:24 AM
>To: [email protected]
>Subject: [DRBD-user] Diagnosing a Failed Resource
> 
>I've configured corosync+pacemaker to manage a simple two-resource DRBD 
>cluster:
> 
>> san1:~ # crm configure show | cat -
>> node san1 \
>>     attributes standby="off"
>> node san2 \
>>     attributes standby="off"
>> primitive p_DRBD-r0 ocf:linbit:drbd \
>>     params drbd_resource="r0" \
>>     op monitor interval="60s"
>> primitive p_DRBD-r1 ocf:linbit:drbd \
>>     params drbd_resource="r1" \
>>     op monitor interval="60s"
>> primitive p_IP-1_253 ocf:heartbeat:IPaddr2 \
>>     params ip="192.168.1.253" cidr_netmask="24" \
>>     op monitor interval="30s"
>> primitive p_IP-1_254 ocf:heartbeat:IPaddr2 \
>>     params ip="192.168.1.254" cidr_netmask="24" \
>>     op monitor interval="30s"
>> primitive p_iSCSI-san1 ocf:heartbeat:iSCSITarget \
>>     params iqn="iqn.2012-11.com.example.san1:sda" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san1_0 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-11.com.example.san1:sda" lun="0" path="/dev/drbd0" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san1_1 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-11.com.example.san1:sda" lun="1" path="/dev/drbd1" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san1_2 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-11.com.example.san1:sda" lun="2" path="/dev/drbd2" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san1_3 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-11.com.example.san1:sda" lun="3" path="/dev/drbd3" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san2 ocf:heartbeat:iSCSITarget \
>>     params iqn="iqn.2012-11.com.example.san2:sda" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san2_0 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-11.com.example.san2:sda" lun="0" path="/dev/drbd1000" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san2_1 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-11.com.example.san2:sda" lun="1" path="/dev/drbd1001" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san2_2 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-11.com.example.san2:sda" lun="2" path="/dev/drbd1002" \
>>     op monitor interval="10s"
>> primitive p_iSCSI-san2_3 ocf:heartbeat:iSCSILogicalUnit \
>>     params target_iqn="iqn.2012-11.com.example.san2:sda" lun="3" path="/dev/drbd1003" \
>>     op monitor interval="10s"
>> group g_iSCSI-san1 p_iSCSI-san1 p_iSCSI-san1_0 p_iSCSI-san1_1 p_iSCSI-san1_2 p_iSCSI-san1_3 p_IP-1_254
>> group g_iSCSI-san2 p_iSCSI-san2 p_iSCSI-san2_0 p_iSCSI-san2_1 p_iSCSI-san2_2 p_iSCSI-san2_3 p_IP-1_253
>> ms ms_DRBD-r0 p_DRBD-r0 \
>>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> ms ms_DRBD-r1 p_DRBD-r1 \
>>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> location l_iSCSI-san1_and_DRBD-r0 p_IP-1_254 10240: san1
>> location l_iSCSI-san2_and_DRBD-r1 p_IP-1_253 10240: san2
>> colocation c_iSCSI_with_DRBD-r0 inf: g_iSCSI-san1 ms_DRBD-r0:Master
>> colocation c_iSCSI_with_DRBD-r1 inf: g_iSCSI-san2 ms_DRBD-r1:Master
>> order o_DRBD-r0_before_iSCSI-san1 inf: ms_DRBD-r0:promote g_iSCSI-san1:start
>> order o_DRBD-r1_before_iSCSI-san2 inf: ms_DRBD-r1:promote g_iSCSI-san2:start
>> property $id="cib-bootstrap-options" \
>>     dc-version="1.1.7-77eeb099a504ceda05d648ed161ef8b1582c7daf" \
>>     cluster-infrastructure="openais" \
>>     expected-quorum-votes="2" \
>>     stonith-enabled="false" \
>>     no-quorum-policy="ignore"
> 
>The cluster appears to be functioning correctly:
> 
>> san1:~ # crm_mon -1
>> ============
>> Last updated: Sun Jan 20 22:20:17 2013
>> Last change: Sun Jan 20 21:59:15 2013 by root via crm_attribute on san1
>> Stack: openais
>> Current DC: san1 - partition with quorum
>> Version: 1.1.7-77eeb099a504ceda05d648ed161ef8b1582c7daf
>> 2 Nodes configured, 2 expected votes
>> 16 Resources configured.
>> ============
>> 
>> Online: [ san1 san2 ]
>> 
>>  Master/Slave Set: ms_DRBD-r0 [p_DRBD-r0]
>>      Masters: [ san1 ]
>>      Slaves: [ san2 ]
>>  Resource Group: g_iSCSI-san1
>>      p_iSCSI-san1    (ocf::heartbeat:iSCSITarget):    Started san1
>>      p_iSCSI-san1_0    (ocf::heartbeat:iSCSILogicalUnit):    Started san1
>>      p_iSCSI-san1_1    (ocf::heartbeat:iSCSILogicalUnit):    Started san1
>>      p_iSCSI-san1_2    (ocf::heartbeat:iSCSILogicalUnit):    Started san1
>>      p_iSCSI-san1_3    (ocf::heartbeat:iSCSILogicalUnit):    Started san1
>>      p_IP-1_254    (ocf::heartbeat:IPaddr2):    Started san1
>>  Master/Slave Set: ms_DRBD-r1 [p_DRBD-r1]
>>      Masters: [ san2 ]
>>      Slaves: [ san1 ]
>>  Resource Group: g_iSCSI-san2
>>      p_iSCSI-san2    (ocf::heartbeat:iSCSITarget):    Started san2
>>      p_iSCSI-san2_0    (ocf::heartbeat:iSCSILogicalUnit):    Started san2
>>      p_iSCSI-san2_1    (ocf::heartbeat:iSCSILogicalUnit):    Started san2
>>      p_iSCSI-san2_2    (ocf::heartbeat:iSCSILogicalUnit):    Started san2
>>      p_iSCSI-san2_3    (ocf::heartbeat:iSCSILogicalUnit):    Started san2
>>      p_IP-1_253    (ocf::heartbeat:IPaddr2):    Started san2
>
>> san2:~ # crm_mon -1
>> ============
>> Last updated: Sun Jan 20 22:20:17 2013
>> Last change: Sun Jan 20 21:59:15 2013 by root via crm_attribute on san1
>> Stack: openais
>> Current DC: san1 - partition with quorum
>> Version: 1.1.7-77eeb099a504ceda05d648ed161ef8b1582c7daf
>> 2 Nodes configured, 2 expected votes
>> 16 Resources configured.
>> ============
>> 
>> Online: [ san1 san2 ]
>> 
>>  Master/Slave Set: ms_DRBD-r0 [p_DRBD-r0]
>>      Masters: [ san1 ]
>>      Slaves: [ san2 ]
>>  Resource Group: g_iSCSI-san1
>>      p_iSCSI-san1    (ocf::heartbeat:iSCSITarget):    Started san1
>>      p_iSCSI-san1_0    (ocf::heartbeat:iSCSILogicalUnit):    Started san1
>>      p_iSCSI-san1_1    (ocf::heartbeat:iSCSILogicalUnit):    Started san1
>>      p_iSCSI-san1_2    (ocf::heartbeat:iSCSILogicalUnit):    Started san1
>>      p_iSCSI-san1_3    (ocf::heartbeat:iSCSILogicalUnit):    Started san1
>>      p_IP-1_254    (ocf::heartbeat:IPaddr2):    Started san1
>>  Master/Slave Set: ms_DRBD-r1 [p_DRBD-r1]
>>      Masters: [ san2 ]
>>      Slaves: [ san1 ]
>>  Resource Group: g_iSCSI-san2
>>      p_iSCSI-san2    (ocf::heartbeat:iSCSITarget):    Started san2
>>      p_iSCSI-san2_0    (ocf::heartbeat:iSCSILogicalUnit):    Started san2
>>      p_iSCSI-san2_1    (ocf::heartbeat:iSCSILogicalUnit):    Started san2
>>      p_iSCSI-san2_2    (ocf::heartbeat:iSCSILogicalUnit):    Started san2
>>      p_iSCSI-san2_3    (ocf::heartbeat:iSCSILogicalUnit):    Started san2
>>      p_IP-1_253    (ocf::heartbeat:IPaddr2):    Started san2
> 
>However, the two DRBD resources do not appear to be communicating:
> 
>> san1:~ # cat /proc/drbd 
>> version: 8.4.1 (api:1/proto:86-100)
>> GIT-hash: 91b4c048c1a0e06777b5f65d312b38d47abaea80 build by phil@fat-tyre, 2011-12-20 12:43:15
>>  0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:3259080
>>  1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>>  2: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>>  3: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 
>> 1000: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
>>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 1001: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
>>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 1002: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
>>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 1003: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
>>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> 
>> san2:~ # cat /proc/drbd 
>> version: 8.4.1 (api:1/proto:86-100)
>> GIT-hash: 91b4c048c1a0e06777b5f65d312b38d47abaea80 build by phil@fat-tyre, 2011-12-20 12:43:15
>>  0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:140
>>  1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>>  2: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>>  3: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 
>> 1000: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 1001: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 1002: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 1003: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
>>     ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> 
>How can I begin to troubleshoot this error?
> 
>Eric Pretorious
>Truckee, CA
>_______________________________________________
>drbd-user mailing list
>[email protected]
>http://lists.linbit.com/mailman/listinfo/drbd-user
>
>
>
