Hello,

On Mon, 10 Jan 2011, Holger Kiehl wrote:

Hello,

upgrading kernel on secondary from 2.6.36.2 to 2.6.37 gives me the
following error on primary:

   Jan 10 12:41:57 obelix kernel: block drbd0: BAD! BarrierAck #2350363662
   received, expected #2350363661!
   Jan 10 12:41:57 obelix kernel: block drbd0: peer( Secondary -> Unknown )
   conn( Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown )
   Jan 10 12:41:57 obelix kernel: block drbd0: short read expecting header
   on sock: r=-512
   Jan 10 12:41:57 obelix kernel: block drbd0: Creating new current UUID
   Jan 10 12:41:57 obelix kernel: block drbd0: asender terminated
   Jan 10 12:41:57 obelix kernel: block drbd0: Terminating drbd0_asender
   Jan 10 12:41:57 obelix kernel: block drbd0: Connection closed
   Jan 10 12:41:57 obelix kernel: block drbd0: conn( ProtocolError ->
   Unconnected )
   Jan 10 12:41:57 obelix kernel: block drbd0: receiver terminated
   Jan 10 12:41:57 obelix kernel: block drbd0: Restarting drbd0_receiver
   Jan 10 12:41:57 obelix kernel: block drbd0: receiver (re)started
   Jan 10 12:41:57 obelix kernel: block drbd0: conn( Unconnected ->
   WFConnection )
   Jan 10 12:41:57 obelix kernel: block drbd0: Handshake successful: Agreed
   network protocol version 95
   Jan 10 12:41:57 obelix kernel: block drbd0: conn( WFConnection ->
   WFReportParams )
   Jan 10 12:41:57 obelix kernel: block drbd0: Starting asender thread
   (from drbd0_receiver [3233])
   Jan 10 12:41:57 obelix kernel: block drbd0: data-integrity-alg:
   <not-used>
   Jan 10 12:41:57 obelix kernel: block drbd0: max_segment_size ( = BIO
   size ) = 65536
   Jan 10 12:41:57 obelix kernel: block drbd0: drbd_sync_handshake:
   Jan 10 12:41:57 obelix kernel: block drbd0: self
   28DDE63A9DEC9869:19CC15BDDB81CF01:8C9904DC3E8DFFD7:F46F8C2F00547891
   bits:500 flags:0
   Jan 10 12:41:57 obelix kernel: block drbd0: peer
   19CC15BDDB81CF00:0000000000000000:8C9904DC3E8DFFD6:F46F8C2F00547891
   bits:0 flags:0
   Jan 10 12:41:57 obelix kernel: block drbd0: uuid_compare()=1 by rule 70
   Jan 10 12:41:57 obelix kernel: block drbd0: peer( Unknown -> Secondary )
   conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )

Upgrading the primary to 2.6.37 did not help either; it produces
the same errors. I tried this on two different clusters, and the
above error always appears when the secondary runs 2.6.37.

The same problem still exists when using kernel 2.6.38.1:

   Mar 25 08:54:20 obelix kernel: block drbd0: BAD! BarrierAck #1861867747 received, expected #1861867746!
   Mar 25 08:54:20 obelix kernel: block drbd0: peer( Secondary -> Unknown ) conn( Connected -> ProtocolError ) pdsk( UpToDate -> DUnknown )
   Mar 25 08:54:20 obelix kernel: block drbd0: process_done_ee() = NOT_OK
   Mar 25 08:54:20 obelix kernel: block drbd0: asender terminated
   Mar 25 08:54:20 obelix kernel: block drbd0: Terminating drbd0_asender
   Mar 25 08:54:20 obelix kernel: block drbd0: short read expecting header on sock: r=-512
   Mar 25 08:54:20 obelix kernel: block drbd0: Creating new current UUID
   Mar 25 08:54:20 obelix kernel: block drbd0: Connection closed
   Mar 25 08:54:20 obelix kernel: block drbd0: conn( ProtocolError -> Unconnected )
   Mar 25 08:54:20 obelix kernel: block drbd0: receiver terminated
   Mar 25 08:54:20 obelix kernel: block drbd0: Restarting drbd0_receiver
   Mar 25 08:54:20 obelix kernel: block drbd0: receiver (re)started
   Mar 25 08:54:20 obelix kernel: block drbd0: conn( Unconnected -> WFConnection )
   Mar 25 08:54:20 obelix kernel: block drbd0: Handshake successful: Agreed network protocol version 94
   Mar 25 08:54:20 obelix kernel: block drbd0: conn( WFConnection -> WFReportParams )
   Mar 25 08:54:20 obelix kernel: block drbd0: Starting asender thread (from drbd0_receiver [3220])
   Mar 25 08:54:20 obelix kernel: block drbd0: data-integrity-alg: <not-used>
   Mar 25 08:54:20 obelix kernel: block drbd0: drbd_sync_handshake:
   Mar 25 08:54:20 obelix kernel: block drbd0: self 840572B18801AA3B:F99A9CC7F9DDDB47:916E679DA4726603:830351EC828F2F13 bits:191 flags:0
   Mar 25 08:54:20 obelix kernel: block drbd0: peer F99A9CC7F9DDDB46:0000000000000000:916E679DA4726602:830351EC828F2F13 bits:0 flags:0
   Mar 25 08:54:20 obelix kernel: block drbd0: uuid_compare()=1 by rule 70
   Mar 25 08:54:20 obelix kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
   Mar 25 08:54:20 obelix kernel: block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
   Mar 25 08:54:20 obelix kernel: block drbd0: Began resync as SyncSource (will sync 764 KB [191 bits set]).
   Mar 25 08:54:21 obelix kernel: block drbd0: Resync done (total 1 sec; paused 0 sec; 764 K/sec)
   Mar 25 08:54:21 obelix kernel: block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )

And this then continues frequently:

  Mar 25 08:53:00 obelix kernel: block drbd0: BAD! BarrierAck #1224296926 received, expected #1224296925!
  Mar 25 08:54:20 obelix kernel: block drbd0: BAD! BarrierAck #1861867747 received, expected #1861867746!
  Mar 25 08:54:35 obelix kernel: block drbd0: BAD! BarrierAck #4040326970 received, expected #4040326969!
  Mar 25 08:56:21 obelix kernel: block drbd0: BAD! BarrierAck #1235958129 received, expected #1235958128!
  Mar 25 08:57:31 obelix kernel: block drbd0: BAD! BarrierAck #4096191267 received, expected #4096191266!
  Mar 25 08:58:51 obelix kernel: block drbd0: BAD! BarrierAck #1578973016 received, expected #1578973015!
  Mar 25 08:59:26 obelix kernel: block drbd0: BAD! BarrierAck #4131468500 received, expected #4131468499!
  Mar 25 09:00:08 obelix kernel: block drbd0: BAD! BarrierAck #4013314144 received, expected #4013314143!
  Mar 25 09:01:19 obelix kernel: block drbd0: BAD! BarrierAck #2538005992 received, expected #2538005991!
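
In every one of these messages the received barrier number is exactly
one higher than the expected one. A minimal Python sketch for pulling
these pairs out of a syslog file, assuming only the message format shown
above (the function name barrier_mismatches is made up for illustration
and is not part of DRBD or its tools):

```python
import re

# Matches the "BAD! BarrierAck" message as it appears in the logs above.
# \s+ also tolerates a line break inserted by a mail client between the
# received number and the word "received".
BARRIER_RE = re.compile(r"BAD! BarrierAck #(\d+)\s+received, expected #(\d+)!")


def barrier_mismatches(log_text):
    """Return a list of (received, expected, received - expected) tuples.

    Hypothetical helper for illustration only.
    """
    return [
        (int(received), int(expected), int(received) - int(expected))
        for received, expected in BARRIER_RE.findall(log_text)
    ]


if __name__ == "__main__":
    # Two of the messages quoted above, used as sample input.
    sample = (
        "Mar 25 08:53:00 obelix kernel: block drbd0: BAD! BarrierAck "
        "#1224296926 received, expected #1224296925!\n"
        "Mar 25 08:54:20 obelix kernel: block drbd0: BAD! BarrierAck "
        "#1861867747 received, expected #1861867746!\n"
    )
    for received, expected, delta in barrier_mismatches(sample):
        print(f"received={received} expected={expected} delta={delta}")
```

Running this over the full kernel log would show whether the
difference is always 1 or whether other offsets occur as well.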

Kernel 2.6.36.x is working without this problem. Any idea what is causing
this? What other information is required to solve this issue?

Regards,
Holger
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
