Lars:
In my experience, those broken pipe messages always show up on the master
when the slave produces the digest mismatch messages. From my point of
view this is a bit misleading.
Here is an example:
Master node:
Mar  1 08:43:37 node1 kernel: block drbd4: meta connection shut down by 
peer.
Mar  1 08:43:37 node1 kernel: block drbd4: sock was shut down by peer
Mar  1 08:43:37 node1 kernel: block drbd4: peer( Secondary -> Unknown ) 
conn( Connected -> BrokenPipe ) pdsk( UpToDate -> DUnknown ) 
Mar  1 08:43:37 node1 kernel: block drbd4: short read expecting header on 
sock: r=0
Mar  1 08:43:37 node1 kernel: block drbd4: asender terminated
Mar  1 08:43:37 node1 kernel: block drbd4: Creating new current UUID
Mar  1 08:43:37 node1 kernel: block drbd4: Terminating asender thread
Mar  1 08:43:37 node1 kernel: block drbd4: sock_sendmsg returned -32
Mar  1 08:43:37 node1 kernel: block drbd4: short sent ReportUUIDs size=56 
sent=0
Mar  1 08:43:37 node1 kernel: block drbd4: Connection closed
Mar  1 08:43:37 node1 kernel: block drbd4: helper command: /sbin/drbdadm 
fence-peer minor-4
Mar  1 08:43:38 node1 cibadmin: [17587]: info: Invoked: cibadmin -C -o 
constraints -X <rsc_location rsc="ms_drbd" 
id="drbd-fence-by-handler-ms_drbd">   <rule role="Master" 
score="-INFINITY" id="drbd-fence-by-handler-rule-ms_drbd">     <expression 
attribute="#uname" operation="ne" value="node1" 
id="drbd-fence-by-handler-expr-ms_drbd"/>   </rule> </rsc_location> 
Mar  1 08:43:38 node1 kernel: block drbd4: helper command: /sbin/drbdadm 
fence-peer minor-4 exit code 4 (0x400)
Mar  1 08:43:38 node1 kernel: block drbd4: fence-peer helper returned 4 
(peer was fenced)
Mar  1 08:43:38 node1 kernel: block drbd4: pdsk( DUnknown -> Outdated ) 
Mar  1 08:43:38 node1 kernel: block drbd4: conn( BrokenPipe -> Unconnected 
) 
Mar  1 08:43:38 node1 kernel: block drbd4: receiver terminated
Mar  1 08:43:38 node1 kernel: block drbd4: Restarting receiver thread
Mar  1 08:43:38 node1 kernel: block drbd4: receiver (re)started
Mar  1 08:43:38 node1 kernel: block drbd4: conn( Unconnected -> 
WFConnection ) 
Mar  1 08:43:38 node1 kernel: block drbd4: Handshake successful: Agreed 
network protocol version 91
Mar  1 08:43:38 node1 kernel: block drbd4: conn( WFConnection -> 
WFReportParams ) 
Mar  1 08:43:38 node1 kernel: block drbd4: Starting asender thread (from 
drbd4_receiver [10020])
Mar  1 08:43:38 node1 kernel: block drbd4: data-integrity-alg: md5
Mar  1 08:43:38 node1 kernel: block drbd4: drbd_sync_handshake:
Mar  1 08:43:38 node1 kernel: block drbd4: self 
0CC3D5A17A19D531:4064274FA17AB19D:00FFCB9ECAD3A6D4:50A168C54CD13071 
bits:122 flags:0
Mar  1 08:43:38 node1 kernel: block drbd4: peer 
4064274FA17AB19C:0000000000000000:00FFCB9ECAD3A6D4:50A168C54CD13071 bits:0 
flags:0
Mar  1 08:43:38 node1 kernel: block drbd4: uuid_compare()=1 by rule 70
Mar  1 08:43:38 node1 kernel: block drbd4: peer( Unknown -> Secondary ) 
conn( WFReportParams -> WFBitMapS ) pdsk( Outdated -> UpToDate ) 
Mar  1 08:43:38 node1 kernel: block drbd4: conn( WFBitMapS -> SyncSource ) 
pdsk( UpToDate -> Inconsistent ) 
Mar  1 08:43:38 node1 kernel: block drbd4: Began resync as SyncSource 
(will sync 488 KB [122 bits set]).
Mar  1 08:43:38 node1 kernel: block drbd4: Resync done (total 1 sec; 
paused 0 sec; 488 K/sec)
Mar  1 08:43:38 node1 kernel: block drbd4: conn( SyncSource -> Connected ) 
pdsk( Inconsistent -> UpToDate ) 


Slave node:
Mar  1 08:43:37 node2 kernel: block drbd4: Digest integrity check FAILED.
Mar  1 08:43:37 node2 kernel: block drbd4: error receiving Data, l: 4136!
Mar  1 08:43:37 node2 kernel: block drbd4: peer( Primary -> Unknown ) 
conn( Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown ) 
Mar  1 08:43:37 node2 kernel: block drbd4: asender terminated
Mar  1 08:43:37 node2 kernel: block drbd4: Terminating asender thread
Mar  1 08:43:37 node2 kernel: block drbd4: Connection closed
Mar  1 08:43:37 node2 kernel: block drbd4: conn( ProtocolError -> 
Unconnected ) 
Mar  1 08:43:37 node2 kernel: block drbd4: receiver terminated
Mar  1 08:43:37 node2 kernel: block drbd4: Restarting receiver thread
Mar  1 08:43:37 node2 kernel: block drbd4: receiver (re)started
Mar  1 08:43:37 node2 kernel: block drbd4: conn( Unconnected -> 
WFConnection ) 
Mar  1 08:43:38 node2 kernel: block drbd4: Handshake successful: Agreed 
network protocol version 91
Mar  1 08:43:38 node2 kernel: block drbd4: conn( WFConnection -> 
WFReportParams ) 
Mar  1 08:43:38 node2 kernel: block drbd4: Starting asender thread (from 
drbd4_receiver [433])
Mar  1 08:43:38 node2 kernel: block drbd4: data-integrity-alg: md5
Mar  1 08:43:38 node2 kernel: block drbd4: drbd_sync_handshake:
Mar  1 08:43:38 node2 kernel: block drbd4: self 
4064274FA17AB19C:0000000000000000:00FFCB9ECAD3A6D4:50A168C54CD13071 bits:0 
flags:0
Mar  1 08:43:38 node2 kernel: block drbd4: peer 
0CC3D5A17A19D531:4064274FA17AB19D:00FFCB9ECAD3A6D4:50A168C54CD13071 
bits:122 flags:0
Mar  1 08:43:38 node2 kernel: block drbd4: uuid_compare()=-1 by rule 50
Mar  1 08:43:38 node2 kernel: block drbd4: peer( Unknown -> Primary ) 
conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate ) 
Mar  1 08:43:38 node2 kernel: block drbd4: conn( WFBitMapT -> WFSyncUUID ) 

Mar  1 08:43:38 node2 kernel: block drbd4: helper command: /sbin/drbdadm 
before-resync-target minor-4
Mar  1 08:43:38 node2 kernel: block drbd4: helper command: /sbin/drbdadm 
before-resync-target minor-4 exit code 0 (0x0)
Mar  1 08:43:38 node2 kernel: block drbd4: conn( WFSyncUUID -> SyncTarget 
) disk( UpToDate -> Inconsistent ) 
Mar  1 08:43:38 node2 kernel: block drbd4: Began resync as SyncTarget 
(will sync 488 KB [122 bits set]).
Mar  1 08:43:38 node2 kernel: block drbd4: Resync done (total 1 sec; 
paused 0 sec; 488 K/sec)
Mar  1 08:43:38 node2 kernel: block drbd4: conn( SyncTarget -> Connected ) 
disk( Inconsistent -> UpToDate ) 
Mar  1 08:43:38 node2 kernel: block drbd4: helper command: /sbin/drbdadm 
after-resync-target minor-4
Mar  1 08:43:39 node2 kernel: block drbd4: helper command: /sbin/drbdadm 
after-resync-target minor-4 exit code 0 (0x0)
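
The pairing described above (BrokenPipe on the master, digest failure on the slave, at the same moment) can be checked mechanically. What follows is only a minimal sketch, not a DRBD tool: the function names are hypothetical, it assumes the syslog layout seen in these excerpts, it assumes both nodes' clocks agree to the second, and it re-joins lines that the mail client wrapped.

```python
import re

# Start of a syslog line as seen in the excerpts, e.g. "Mar  1 08:43:37 ".
TS_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d+\s+\d\d:\d\d:\d\d\s")

# Kernel DRBD message: timestamp, host, device and message text.
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"kernel: block (?P<dev>drbd\d+): (?P<msg>.*)$"
)


def unwrap(lines):
    """Re-join wrapped lines: a line without a timestamp continues the previous one."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if TS_RE.match(line):
            records.append(line)
        elif records and line.strip():
            records[-1] = records[-1].rstrip() + " " + line.strip()
    return records


def events(lines, needle):
    """Return the set of (timestamp, device) pairs whose kernel message contains needle."""
    found = set()
    for record in unwrap(lines):
        m = LINE_RE.match(record)
        if m and needle in m.group("msg"):
            # Collapse the double space syslog uses for single-digit days.
            ts = " ".join(m.group("ts").split())
            found.add((ts, m.group("dev")))
    return found


def correlate(primary_lines, secondary_lines):
    """List (timestamp, device) pairs where the primary logged BrokenPipe in the
    same second in which the secondary logged a failed digest check."""
    broken = events(primary_lines, "-> BrokenPipe")
    digest = events(secondary_lines, "Digest integrity check FAILED")
    return sorted(broken & digest)
```

Fed with the two excerpts above, `correlate()` should report the 08:43:37 event on drbd4. A BrokenPipe on the master with no matching digest failure on the slave would point to a different cause.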


That system is running DRBD 8.3.4.
Versions 8.3.7 and 8.3.8.1 show the same output and behaviour on the
systems we run them on. The OS is SLES 11 / SLES 11 SP1.

Best Regards

Robert Köppl

System Administration

KNAPP Systemintegration GmbH
Waltenbachstraße 9
8700 Leoben, Austria 
Phone: +43 3842 805-910
Fax: +43 3842 82930-500
[email protected] 
www.KNAPP.com 

Commercial register number: FN 138870x
Commercial register court: Leoben



Lars Ellenberg <[email protected]>
Sent by: [email protected]
28.02.2011 18:08
Please reply to: General Linux-HA mailing list <[email protected]>
To: [email protected]
Cc:
Subject: Re: [Linux-HA] Antwort: Re:  DRBD BrokenPipe

On Mon, Feb 28, 2011 at 01:39:58PM +0100, [email protected] wrote:
> Lars Ellenberg gave some interesting information about these messages, at
> least if you have verification of your network traffic enabled:

Well, that's not much to do with "BrokenPipe",
but with "Digest mismatch", "Digest integrity check FAILED" etc.
And there has been no mention of that in the (much too short) log
excerpt shown.

So this may be something completely different.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
