Hi,

Today we had to reboot a machine after an unplanned power outage.
DRBD version running on both sides: 8.4.2rc1 (api:1/proto:86-101)
GIT-hash: 1c425b5af957cead7753f974d7c4dae737fd2b14
The restarted machine had the "secondary" role for 3 DRBD devices.

The boot process stalled for about 5 minutes while DRBD repeatedly
emitted messages like this:

d-con ResourceData2: BAD! BarrierAck #422870 received, expected #422875!
d-con ResourceData2: peer( Secondary -> Unknown ) conn( SyncSource -> ProtocolError )
d-con ResourceData2: asender terminated
d-con ResourceData2: Terminating asender thread
d-con ResourceData2: Connection closed
d-con ResourceData2: conn( ProtocolError -> Unconnected )
d-con ResourceData2: receiver terminated
d-con ResourceData2: Restarting receiver thread
d-con ResourceData2: receiver (re)started
d-con ResourceData2: conn( Unconnected -> WFConnection )
d-con ResourceData2: Handshake successful: Agreed network protocol version 101
d-con ResourceData2: conn( WFConnection -> WFReportParams )
d-con ResourceData2: Starting asender thread (from drbd_r_Resource [2998])
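
In case it helps with triage, here is a minimal sketch (Python, not part of
DRBD) for pulling the received and expected BarrierAck numbers out of such
messages. The regular expression is only modelled on the one line quoted
above, so other DRBD versions may need a different pattern, and the name
barrier_mismatches is made up for illustration.

```python
import re

# Pattern modelled on the single "BAD! BarrierAck" line quoted in this mail.
# Assumption: other DRBD versions may word this message differently.
BARRIER_RE = re.compile(
    r"d-con (?P<resource>\S+): BAD! BarrierAck #(?P<received>\d+) received, "
    r"expected #(?P<expected>\d+)!"
)


def barrier_mismatches(lines):
    """Return (resource, received, expected) for each BarrierAck mismatch line."""
    found = []
    for line in lines:
        match = BARRIER_RE.search(line)
        if match:
            found.append(
                (
                    match.group("resource"),
                    int(match.group("received")),
                    int(match.group("expected")),
                )
            )
    return found


if __name__ == "__main__":
    sample = [
        "d-con ResourceData2: BAD! BarrierAck #422870 received, expected #422875!",
        "d-con ResourceData2: asender terminated",
    ]
    for resource, received, expected in barrier_mismatches(sample):
        # Print the gap between the expected and the received barrier number.
        print(resource, received, expected, expected - received)
```

Feeding it the kernel log line by line would show whether the gap between
the two numbers stays constant across the reconnect attempts.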

After those 5 minutes the synchronization seemed to work fine, and the
system is now up and running.

I've read elsewhere that these messages might be "over-paranoid",
but if they hadn't stopped at some point, the boot procedure would
have stalled indefinitely.

Can this be fixed?

Regards,

Lutz Vieweg


