CentOS 5.5, x86_64, drbd 8.3.8.

After running fine for a month, this started a couple of days ago on only one of three drbd volumes (the other two being fine):

Jan  8 09:54:53 tiger kernel: block drbd11: Began resync as SyncSource (will sync 1882624060 KB [470656015 bits set]).
Jan  8 09:54:57 tiger kernel: block drbd11: sock_recvmsg returned -104
Jan  8 09:54:57 tiger kernel: block drbd11: peer( Secondary -> Unknown ) conn( SyncSource -> NetworkFailure )
Jan  8 09:54:57 tiger kernel: block drbd11: asender terminated
Jan  8 09:54:57 tiger kernel: block drbd11: Terminating asender thread
Jan  8 09:54:57 tiger kernel: block drbd11: sock was shut down by peer
Jan  8 09:54:57 tiger kernel: block drbd11: short read expecting header on sock: r=0
Jan  8 09:54:57 tiger kernel: block drbd11: drbd_send_block/ack() failed
Jan  8 09:54:57 tiger kernel: block drbd11: Connection closed
Jan  8 09:54:57 tiger kernel: block drbd11: conn( NetworkFailure -> Unconnected )
Jan  8 09:54:57 tiger kernel: block drbd11: receiver terminated
Jan  8 09:54:57 tiger kernel: block drbd11: Restarting receiver thread
Jan  8 09:54:57 tiger kernel: block drbd11: receiver (re)started
Jan  8 09:54:57 tiger kernel: block drbd11: conn( Unconnected -> WFConnection )
Jan  8 09:54:57 tiger kernel: block drbd11: Handshake successful: Agreed network protocol version 94
Jan  8 09:54:57 tiger kernel: block drbd11: Peer authenticated using 20 bytes of 'sha1' HMAC
Jan  8 09:54:57 tiger kernel: block drbd11: conn( WFConnection -> WFReportParams )
Jan  8 09:54:57 tiger kernel: block drbd11: Starting asender thread (from drbd11_receiver [4148])
Jan  8 09:54:57 tiger kernel: block drbd11: data-integrity-alg: sha1
Jan  8 09:54:57 tiger kernel: block drbd11: drbd_sync_handshake:
Jan  8 09:54:57 tiger kernel: block drbd11: self 59D4BF73D5EC82B7:0CC8A25C011E7E57:1FF443C818621FD3:D319C97BC05715A4 bits:470570058 flags:0
Jan  8 09:54:57 tiger kernel: block drbd11: peer 0CC8A25C011E7E56:0000000000000000:3AEE8F0D28607364:01DF314E083700DD bits:470569999 flags:0
Jan  8 09:54:57 tiger kernel: block drbd11: uuid_compare()=1 by rule 70
Jan  8 09:54:57 tiger kernel: block drbd11: Becoming sync source due to disk states.
Jan  8 09:54:57 tiger kernel: block drbd11: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS )
Jan  8 09:54:58 tiger kernel: block drbd11: conn( WFBitMapS -> SyncSource )

which repeats itself continuously. The resync gets to 0.1% and then starts over. The replication link is a dual bonded GbE point-to-point pair, and is in full operating condition. All other hardware is fine, and has been all along. Nothing has been changed. Can anyone give me a clue as to why this is happening?
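
In case it helps anyone reading the log, here is a minimal Python sketch that decodes the two numeric details in it. The values are copied from the log lines above; the variable names are mine, and the 4 KB-per-bit figure is an assumption about the bitmap granularity that the sketch merely checks against the numbers the kernel reported. On Linux, errno 104 is ECONNRESET, which would mean the connection was reset from the peer's side.

```python
import errno
import os

# Value copied from the "sock_recvmsg returned -104" log line.
rc = -104

# Kernel functions return negated errno values, so flip the sign to look it up.
# The numbering is platform-specific; this assumes Linux.
name = errno.errorcode.get(-rc, "unknown")
print(name, "-", os.strerror(-rc))

# Values copied from the "Began resync as SyncSource" log line.
bits_set = 470656015
reported_kb = 1882624060

# Assumption: each set bit in the bitmap marks one 4 KB block as out of sync.
KB_PER_BIT = 4
consistent = bits_set * KB_PER_BIT == reported_kb
print("bits match reported KB:", consistent)
```

If the assumption holds, the two figures in the "Began resync" line are just the same quantity in different units, so nearly the whole device is marked out of sync each time the cycle restarts.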

Steve