On Wed, Jul 04, 2012 at 08:14:05PM +0200, Lutz Vieweg wrote:
> Hi,
>
> I just had to reboot a system that is configured as the "secondary" for 3
> DRBD devices.
> After the reboot, connection to the primary system was established and
> re-synchronisation started.
>
> Some scary messages were emitted during that process - on the primary:
>
> >block drbd0: uuid_compare()=1 by rule 70
> >block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS
> >) pdsk( DUnknown -> Consistent )
> >block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17),
> >total 68591; compression: 99.9%
> >block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE
> >68591(17), total 68591; compression: 99.9%
> >block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
> >block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit
> >code 0 (0x0)
> >block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent ->
> >Inconsistent )
> >block drbd0: updated sync UUID 0AE...
> >block drbd0: Began resync as SyncSource (will sync 3242376 KB [810594 bits
> >set]).
> >block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0
> >count=32 cstate=SyncSource
> >block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0
> >count=32 cstate=SyncSource
> >block drbd0: BAD! sector=3393683440s enr=103566 rs_left=-30 rs_failed=0
> >count=32 cstate=SyncSource
> >block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0
> >count=32 cstate=SyncSource
> >block drbd0: Resync done (total 337 sec; paused 0 sec; 9620 K/sec)
> >block drbd0: updated UUIDs 0AE...
> >block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
> >block drbd0: bitmap WRITE of 0 pages took 1 jiffies
> >block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
>
> And on the (rebooted) secondary:
>
> >block drbd0: disk( Diskless -> Attaching )
> >block drbd0: max BIO size = 131072
> >block drbd0: drbd_bm_resize called with capacity == 3550894184
> >block drbd0: resync bitmap: bits=443861773 words=6935341 pages=13546
> >block drbd0: size = 1693 GB (1775447092 KB)
> >block drbd0: bitmap READ of 13546 pages took 1443 jiffies
> >block drbd0: recounting of set bits took additional 34 jiffies
> >block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> >block drbd0: disk( Attaching -> UpToDate )
> >block drbd0: attached to UUIDs 9DE...
> >block drbd0: drbd_sync_handshake:
> >block drbd0: self 9DE... bits:0 flags:0
> >block drbd0: peer 0AE... bits:810473 flags:0
> >block drbd0: uuid_compare()=-1 by rule 50
> >block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT )
> >disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
> >block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE
> >68591(17), total 68591; compression: 99.9%
> >block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17),
> >total 68591; compression: 99.9%
> >block drbd0: conn( WFBitMapT -> WFSyncUUID )
> >block drbd0: updated sync uuid 9DE...
> >block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
> >block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit
> >code 0 (0x0)
> >block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent
> >)
> >block drbd0: Began resync as SyncTarget (will sync 3242376 KB [810594 bits
> >set]).
> >block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0
> >count=32 cstate=SyncTarget
> >block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0
> >count=32 cstate=SyncTarget
> >block drbd0: BAD! sector=3393683440s enr=103566 rs_left=-30 rs_failed=0
> >count=32 cstate=SyncTarget
> >block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0
> >count=32 cstate=SyncTarget
> >block drbd0: Resync done (total 337 sec; paused 0 sec; 9620 K/sec)
> >block drbd0: updated UUIDs 0AE...
> >block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
> >block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0
> >block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 exit
> >code 0 (0x0)
> >block drbd0: bitmap WRITE of 0 pages took 1 jiffies
> >block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
>
> Now I wonder: What does drbd0 want to tell me with those "BAD! ..." messages?
It's just a reference counter (the rs_left value in those messages) that
should never have gone negative, but did, because we forgot to
update/reinitialize it at some stage.
Depending on your exact DRBD version, I could tell you various things
about this. But if you run 8.3 from git, it is supposed to be fixed, finally...
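
For anyone who wants to check their own logs for the same pattern, here is
a minimal sketch that pulls the fields out of the quoted "BAD!" lines and
flags a negative rs_left. The regular expression and the names
parse_bad_lines, EXTENT_SHIFT, extent_matches and counter_underflow are
made up for this illustration and are not part of DRBD. The 16 MiB extent
size (32768 sectors per enr) is an assumption; it does agree with all four
sector/enr pairs logged above.

```python
import re

# Assumption: each resync extent ("enr") covers 16 MiB, i.e. 32768 sectors
# of 512 bytes, so enr should equal sector >> 15. The logged lines agree.
EXTENT_SHIFT = 15

BAD_RE = re.compile(
    r"BAD! sector=(?P<sector>\d+)s enr=(?P<enr>\d+) "
    r"rs_left=(?P<rs_left>-?\d+) rs_failed=(?P<rs_failed>\d+)\s+"
    r"count=(?P<count>\d+) cstate=(?P<cstate>\w+)"
)


def parse_bad_lines(log_text):
    """Return one dict per BAD! message found in (possibly quoted) log text."""
    # Undo mail wrapping: replace each newline plus any "> >" quote prefix
    # with a single space so a wrapped message becomes one line again.
    flat = re.sub(r"\s*\n[>\s]*", " ", log_text)
    events = []
    for m in BAD_RE.finditer(flat):
        ev = {k: int(v) for k, v in m.groupdict().items() if k != "cstate"}
        ev["cstate"] = m.group("cstate")
        # Does the logged extent number match the logged sector?
        ev["extent_matches"] = (ev["sector"] >> EXTENT_SHIFT) == ev["enr"]
        # A negative rs_left is the counter underflow described above.
        ev["counter_underflow"] = ev["rs_left"] < 0
        events.append(ev)
    return events


SAMPLE = """\
> >block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0
> >count=32 cstate=SyncSource
> >block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0
> >count=32 cstate=SyncSource
"""

for ev in parse_bad_lines(SAMPLE):
    print(ev["enr"], ev["rs_left"], ev["extent_matches"])
```

The sketch only reports what the log already says; it does not talk to
DRBD and cannot tell you which version contains the fix.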
> It seems to have completed the synchronization successfully. Also, no "read
> errors" were reported on either host.
>
> Should I be concerned about the data integrity, now?
Nope. All good.
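
As a rough cross-check, the resync totals in the quoted log are consistent
with each other: 810594 set bits at 4 KB per bit give the logged
3242376 KB, and dividing by the 337 seconds gives the logged 9620 K/sec.
The sketch below redoes that arithmetic. The name resync_summary is
hypothetical, and dividing the bit count by the seconds before scaling to
KB is an assumption that happens to reproduce the logged rate. This only
shows the numbers add up; it is not an integrity check of the data.

```python
# 4 KB per bitmap bit follows from the log itself: 3242376 KB / 810594 bits.
BM_BLOCK_KB = 4


def resync_summary(bits_set, seconds):
    """Return (total KB to sync, average rate in KB/sec) for a resync."""
    total_kb = bits_set * BM_BLOCK_KB
    # Assumption: integer-divide bits by seconds first, then scale to KB.
    rate_kb_per_sec = (bits_set // seconds) * BM_BLOCK_KB
    return total_kb, rate_kb_per_sec


total_kb, rate = resync_summary(810594, 337)
print(total_kb, rate)  # 3242376 9620
```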
Cheers,
Lars
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com